10 Top Open Source Artificial Intelligence Tools for Linux

In this post, we shall cover a few of the top open-source artificial intelligence (AI) tools for the Linux ecosystem. AI is currently one of the fastest-advancing fields in science and technology, with a major focus on building software and hardware to solve everyday challenges in areas such as healthcare, education, security, manufacturing, banking, and many more.

Below is a list of platforms designed and developed to support AI, which you can use on Linux and possibly many other operating systems. Note that this list is not arranged in any particular order.

1. Deep Learning For Java (Deeplearning4j)

Deeplearning4j is a commercial-grade, open-source, plug-and-play, distributed deep-learning library for the Java and Scala programming languages. It is designed specifically for business-related applications and integrates with Hadoop and Spark on top of distributed CPUs and GPUs.

DL4J is released under the Apache 2.0 license, provides GPU support for scaling on AWS, and is adapted for micro-service architectures.

Deeplearning4j – Deep Learning for Java

Visit Homepage: http://deeplearning4j.org/

2. Caffe – Deep Learning Framework

Caffe is a modular and expressive deep learning framework built with speed in mind. It is released under the BSD 2-Clause license and already supports several community projects in areas such as research, startup prototypes, and industrial applications in fields such as vision, speech, and multimedia.

Caffe – Deep Learning Framework

Visit Homepage: http://caffe.berkeleyvision.org/

3. H2O – Distributed Machine Learning Framework

H2O is an open-source, fast, scalable, and distributed machine learning framework, together with the assortment of algorithms that ships with it. It supports techniques such as deep learning, gradient boosting, random forests, generalized linear modeling (e.g. logistic regression, Elastic Net), and many more.

It is a business-oriented artificial intelligence tool for making decisions from data, enabling users to draw insights from their data through faster and better predictive modeling.
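
To give a concrete feel for the generalized linear modeling mentioned above, here is a minimal from-scratch logistic regression in plain Python. This is only a single-machine sketch of the kind of model H2O fits; it does not use H2O's API, and the data and function names below are invented for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.5, epochs=2000):
    """Fit a one-feature logistic regression by plain gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # prediction error in [-1, 1]
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

# Toy data: the label flips from 0 to 1 around x = 2.
xs = [0.5, 1.0, 1.5, 2.5, 3.0, 3.5]
ys = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(xs, ys)
print(sigmoid(w * 0.5 + b))  # well below 0.5
print(sigmoid(w * 3.0 + b))  # well above 0.5
```

H2O's value is that it fits models like this (and far richer ones) in a distributed fashion across a cluster, rather than in a Python loop.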

H2O – Distributed Machine Learning Framework

Visit Homepage: http://www.h2o.ai/

4. MLlib – Machine Learning Library

MLlib is an open-source, easy-to-use, and high-performance machine learning library developed as part of Apache Spark. It is easy to deploy and can run on existing Hadoop clusters and data.

MLlib also ships with a collection of algorithms for classification, regression, recommendation, clustering, survival analysis, and much more. Importantly, it can be used from the Python, Java, Scala, and R programming languages.
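
To show what a clustering algorithm of the kind MLlib distributes actually does, here is a tiny single-machine k-means sketch in plain Python. This is not MLlib's API (MLlib runs its algorithms over Spark's distributed datasets); the function and data below are invented for this sketch.

```python
def kmeans_1d(points, k=2, iters=20):
    """A tiny k-means on 1-D data: alternate assign/update steps."""
    pts = sorted(points)
    # Naive init: spread the k starting centers across the sorted data.
    centers = [pts[i * (len(pts) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest current center.
            nearest = min(range(k), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster (keep it if empty).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers

centers = kmeans_1d([9.0, 1.0, 9.5, 1.2, 8.5, 0.8])
print(centers)  # two centers, near 1.0 and 9.0
```

MLlib applies the same assign-and-update idea, but partitions the points across a cluster so each iteration runs in parallel.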

MLlib – Machine Learning Library

Visit Homepage: https://spark.apache.org/mllib/

5. Apache Mahout

Mahout is an open-source framework designed for building scalable machine learning applications. It has three prominent features, listed below:

  1. Provides a simple and extensible programming environment
  2. Offers a variety of prepackaged algorithms for Scala + Apache Spark, H2O, as well as Apache Flink
  3. Includes Samsara, a vector math experimentation environment with R-like syntax

Apache Mahout

Visit Homepage: http://mahout.apache.org/

6. Open Neural Networks Library (OpenNN)

OpenNN is an open-source class library written in C++ for deep learning, used to implement neural networks. It is characterized by a deep architecture and high performance. However, it is best suited to experienced C++ programmers and people with strong machine learning skills.

OpenNN – Open Neural Networks Library

Visit Homepage: http://www.opennn.net/

7. Oryx 2

Oryx 2 is a continuation of the initial Oryx project. It is built on Apache Spark and Apache Kafka as a re-architecting of the lambda architecture, dedicated to achieving real-time machine learning.

It is a platform for application development and also ships with certain applications for collaborative filtering, classification, regression, and clustering purposes.

Oryx2 – Re-architecting Lambda Architecture

Visit Homepage: http://oryx.io/

8. OpenCyc

OpenCyc is an open-source portal to the largest and most comprehensive general knowledge base and commonsense reasoning engine in the world. It includes a large number of Cyc terms arranged in a precisely designed ontology for application in areas such as:

  1. Rich domain modeling
  2. Domain-specific expert systems
  3. Text understanding
  4. Semantic data integration, as well as AI games and many more.

Visit Homepage: http://www.cyc.com/platform/opencyc/

9. Apache SystemML

SystemML is an open-source machine learning platform ideal for big data. Its main features are that it offers R- and Python-like syntax, focuses on big data, and is designed specifically for high-level math. How it works is well explained on the homepage, including a video demonstration for clear illustration.

There are several ways to use it, including via Apache Spark, Apache Hadoop, Jupyter, and Apache Zeppelin. Some of its notable use cases include the automotive industry, airport traffic, and social banking.

Apache SystemML – Machine Learning Platform

Visit Homepage: http://systemml.apache.org/

10. NuPIC

NuPIC is an open-source framework for machine learning based on Hierarchical Temporal Memory (HTM), a theory of the neocortex. The HTM program integrated in NuPIC is implemented for analyzing real-time streaming data, where it learns the time-based patterns existing in the data, predicts upcoming values, and reveals any irregularities.

Its notable features include:

  1. Continuous online learning
  2. Temporal and spatial patterns
  3. Real-time streaming data
  4. Prediction and modeling
  5. Powerful anomaly detection
  6. Hierarchical temporal memory
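
HTM itself is far too involved for a short snippet, but the core idea of streaming anomaly detection, flagging values that deviate sharply from recently learned behavior, can be sketched in plain Python. The rolling mean/standard-deviation detector below is a crude stand-in, not NuPIC's algorithm, and every name in it is invented for this sketch.

```python
from collections import deque
import statistics

def stream_anomalies(stream, window=20, threshold=3.0):
    """Flag points that sit more than `threshold` standard deviations
    away from the rolling mean of recent history."""
    recent = deque(maxlen=window)
    flagged = []
    for t, value in enumerate(stream):
        if len(recent) >= 5:  # wait for a minimal history first
            mean = statistics.fmean(recent)
            sd = statistics.pstdev(recent) or 1e-9  # avoid divide-by-zero
            if abs(value - mean) / sd > threshold:
                flagged.append(t)
        recent.append(value)
    return flagged

# A steady signal with a single spike at index 10.
signal = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 1.0, 1.1, 0.9, 1.0, 8.0, 1.0]
print(stream_anomalies(signal))  # → [10]
```

Where this toy detector only tracks a mean and spread, HTM learns temporal sequences, so it can also flag values that are individually ordinary but arrive in an unexpected order.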

NuPIC Machine Intelligence

Visit Homepage: http://numenta.org/

With the rise of, and ever-advancing research in, AI, we are bound to see more tools emerge to help make this area of technology a success, especially for solving everyday scientific challenges and for educational purposes.

Are you interested in AI? What is your take? Share your thoughts, suggestions, or any productive feedback about the subject via the comment section below, and we shall be delighted to hear from you.

Aaron Kili
Aaron Kili is a Linux and F.O.S.S enthusiast, an upcoming Linux SysAdmin, web developer, and currently a content creator for TecMint who loves working with computers and strongly believes in sharing knowledge.


6 thoughts on “10 Top Open Source Artificial Intelligence Tools for Linux”

  1. One still needs a master’s degree in coding and data mining in order to use AI. The way everything has to be coded together is useless for advanced users, and software that advanced users could actually use is all commercial and overpriced.

    People need to analyze data, not earn a master’s degree in how to use the tools! There is no open-source AI data-mining software; what exists, commercial or open source, offers only generic neural networks that are hard to use because one cannot guess the proper hyperparameters. There are more possibilities than winning the toughest lotto in the world.

    It is all pre-90s technology, since those base algorithms were developed back then. All you get is pieces that you have to code together yourself, and then you also have to code manually what those pieces should do and the interconnections between them.

    That is why you need a master’s degree in coding, mathematics, statistics, and databases. There are also huge drawbacks in how this kind of software handles data inputs: no continuous workflows, not user-friendly, error-prone, and most tools only take columns into account, not rows (even though data in rows are just as relevant as data in columns, the cross-importance), with some even keying on the meta names of the first columns, and so on. The tutorials are poor, it is not exactly clear how data in the matrix are treated, the software is still badly coded, and the answers to all these questions one has to find out empirically, only to discover how it was never supposed to work.

    So in the end the results are a devastatingly useless waste of nerves and time. Try to get the promised accuracy near or above 70% on non-linear datasets with various network shapes for the results to be of any use.

    You can only dream of getting more than 90%, as posted all over the net. It is all a lie. I have been in this field for a year now, and I can only say: leave it be if you do not need it professionally. AI is not for end users.

  2. While I enjoyed this article, I would love to see some A.I. suggestions for the average desktop/laptop. (I have tried Almond, but would love something more advanced.) The desktop/laptop area is where I think A.I. could really shine, especially in virtual-assistant type applications. Cortana/Siri are fine but still way too primitive, and not open source. (Nor can they simply be added to an existing system.)

  3. As an AI developer and enthusiast, I found that this article gives a good overview of several widely used libraries and tools for artificial intelligence. With the ever-expanding capabilities of AI, I wonder what will come next for the big data and AI industry.

  4. OpenCyc seems to be discontinued:

    “Part of the Cyc technology was released, starting in 2001, as OpenCyc, which provided an API, RDF endpoint, and data dump, under appropriate Apache and Creative Commons open source licenses. Its distribution was discontinued in early 2017 because such “fragmenting” led to divergence, and led to confusion amongst its users and the technical community generally that that OpenCyc fragment was Cyc.”

