Differences between revisions 1 and 2

This page attempts to collect information and links pertaining to the practice of AI and Machine Learning in python.

Machine Learning

Orange - Open source data visualization and analysis for novice and experts. Data mining through visual programming or Python scripting. Components for machine learning. Extensions for bioinformatics and text mining. Packed with features for data analytics.
PyBrain - PyBrain is a modular Machine Learning Library for Python. Its goal is to offer flexible, easy-to-use yet still powerful algorithms for Machine Learning Tasks and a variety of predefined environments to test and compare your algorithms.
PyMl - PyML is an interactive object oriented framework for machine learning written in Python. PyML focuses on SVMs and other kernel methods. It is supported on Linux and Mac OS X.
MlPy - mlpy makes extensive use of NumPy to provide fast N-dimensional array manipulation and easy integration of C code. The GNU Scientific Library ( GSL) is also required. It provides high level procedures that support, with few lines of code, the design of rich Data Analysis Protocols (DAPs) for preprocessing, clustering, predictive classification, regression and feature selection. Methods are available for feature weighting and ranking, data resampling, error evaluation and experiment landscaping.
Milk - Milk is a machine learning toolkit in Python. Its focus is on supervised classification with several classifiers available: SVMs (based on libsvm), k-NN, random forests, decision trees. It also performs feature selection. These classifiers can be combined in many ways to form different classification systems.
scikit-learn - scikit-learn is a Python module integrating classic machine learning algorithms in the tightly-knit world of scientific Python packages (numpy, scipy, matplotlib). It aims to provide simple and efficient solutions to learning problems that are accessible to everybody and reusable in various contexts: machine-learning as a versatile tool for science and engineering.
Shogun - The machine learning toolbox's focus is on large scale kernel methods and especially on Support Vector Machines (SVM) . It provides a generic SVM object interfacing to several different SVM implementations, among them the state of the art OCAS, Liblinear, LibSVM, SVMLight, SVMLin and GPDT. Each of the SVMs can be combined with a variety of kernels. The toolbox not only provides efficient implementations of the most common kernels, like the Linear, Polynomial, Gaussian and Sigmoid Kernel but also comes with a number of recent string kernels. SHOGUN is implemented in C++ and interfaces to Matlab(tm), R, Octave and Python and is proudly released as Machine Learning Open Source Software
MDP-Toolkit - Modular toolkit for Data Processing (MDP) is a Python data processing framework. From the user’s perspective, MDP is a collection of supervised and unsupervised learning algorithms and other data processing units that can be combined into data processing sequences and more complex feed-forward network architectures. From the scientific developer’s perspective, MDP is a modular framework, which can easily be expanded. The implementation of new algorithms is easy and intuitive. The new implemented units are then automatically integrated with the rest of the library. The base of available algorithms is steadily increasing and includes signal processing methods (Principal Component Analysis, Independent Component Analysis, Slow Feature Analysis), manifold learning methods ([Hessian] Locally Linear Embedding), several classifiers, probabilistic methods (Factor Analysis, RBM), data pre-processing methods, and many others.

Natural Language & Text Processing

NLTK - Open source Python modules, linguistic data and documentation for research and development in natural language processing and text analytics, with distributions for Windows, Mac OSX and Linux.
gensim - Gensim is a Python framework designed to automatically extract semantic topics from documents, as naturally and painlessly as possible. Gensim aims at processing raw, unstructured digital texts (“plain text”). The unsupervised algorithms in gensim, such as Latent Semantic Analysis, Latent Dirichlet Allocation or Random Projections, discover hidden (latent) semantic structure, based on word co-occurrence patterns within a corpus of training documents. Once these statistical patterns are found, any plain text documents can be succinctly expressed in the new, semantic representation, and queried for topical similarity against other documents and so on.

PythonForArtificialIntelligence (last edited 2016-01-21 16:58:05 by DennisAtabay)

-  ⇤ ← Revision 1 as of 2011-10-16 10:25:15 → 
  Size: 5061
  Editor: www
  Comment: added page to cover AI and/or machine learning libs and toolkits in python
+   ← Revision 2 as of 2011-10-16 10:26:00 → ⇥
  Size: 5057
  Editor: www
  Comment:
-Deletions are marked like this.
+Additions are marked like this.
 Line 17:
- * '''[[http://scikit-learn.sourceforge.net/|]scikit-learn]''' -  scikit-learn is a Python module integrating classic machine learning algorithms in the tightly-knit world of scientific Python packages (numpy, scipy, matplotlib). It aims to provide simple and efficient solutions to learning problems that are accessible to everybody and reusable in various contexts: machine-learning as a versatile tool for science and engineering.
+ * '''[[http://scikit-learn.sourceforge.net/|scikit-learn]]''' -  scikit-learn is a Python module integrating classic machine learning algorithms in the tightly-knit world of scientific Python packages (numpy, scipy, matplotlib). It aims to provide simple and efficient solutions to learning problems that are accessible to everybody and reusable in various contexts: machine-learning as a versatile tool for science and engineering.
 Line 29:

Page

User

Machine Learning

Natural Language & Text Processing