.. _related_projects:

=====================================
Related Projects
=====================================

Below is a list of sister projects, extensions and domain-specific packages.

Related Packages
----------------
Other packages useful for data analysis and machine learning.

- `Pandas <http://pandas.pydata.org>`_ Tools for working with heterogeneous and
  columnar data, relational queries, time series and basic statistics.

- `sklearn_pandas <https://github.com/paulgb/sklearn-pandas/>`_ Bridge between
  scikit-learn pipelines and pandas data frames, with dedicated transformers.

- `Scikit-Learn Laboratory
  <https://skll.readthedocs.org/en/latest/index.html>`_ A command-line
  wrapper around scikit-learn that makes it easy to run machine learning
  experiments with multiple learners and large feature sets.

- `Theano <http://deeplearning.net/software/theano/>`_ A CPU/GPU array
  processing framework geared towards deep learning research.

- `Statsmodels <http://statsmodels.sourceforge.net/>`_ Estimating and analysing
  statistical models. More focused on statistical tests and less on prediction
  than scikit-learn.

- `PyMC <http://pymc-devs.github.io/pymc/>`_ Bayesian statistical models and
  fitting algorithms.

- `sklearn_theano <http://sklearn-theano.github.io/>`_ scikit-learn compatible
  estimators, transformers, and datasets which use Theano internally.

Extensions and Algorithms
-------------------------
Libraries that provide a scikit-learn-like interface and can be used with
scikit-learn tools.

- `pylearn2 <http://deeplearning.net/software/pylearn2/>`_ A deep learning and
  neural network library built on Theano with a scikit-learn-like interface.

- `lightning <http://www.mblondel.org/lightning/>`_ Fast state-of-the-art
  linear model solvers (SDCA, AdaGrad, SVRG, SAG, etc.).

- `Seqlearn <https://github.com/larsmans/seqlearn>`_ Sequence classification
  using HMMs or structured perceptron.

- `HMMLearn <https://github.com/hmmlearn/hmmlearn>`_ Implementation of hidden
  Markov models that was previously part of scikit-learn.

- `PyStruct <https://pystruct.github.io>`_ General conditional random fields
  and structured prediction.

- `py-earth <https://github.com/jcrudy/py-earth>`_ Multivariate adaptive
  regression splines.

- `sklearn-compiledtrees <https://github.com/ajtulloch/sklearn-compiledtrees/>`_
  Generate a C++ implementation of the predict function for decision trees (and
  ensembles) trained by scikit-learn. Useful for latency-sensitive production
  environments.

- `lda <https://github.com/ariddell/lda/>`_ Fast implementation of Latent
  Dirichlet Allocation in Cython.

- `Sparse Filtering <https://github.com/jmetzen/sparse-filtering>`_
  Unsupervised feature learning based on sparse filtering.

- `Kernel Regression <https://github.com/jmetzen/kernel_regression>`_
  Implementation of Nadaraya-Watson kernel regression with automatic bandwidth
  selection.

Domain Specific Packages
------------------------

- `scikit-image <http://scikit-image.org/>`_ Image processing and computer
  vision in Python.

- `Natural language toolkit (nltk) <http://www.nltk.org/>`_ Natural language
  processing and some machine learning.

- `NiLearn <https://nilearn.github.io/>`_ Machine learning for neuro-imaging.

- `AstroML <http://www.astroml.org/>`_ Machine learning for astronomy.

- `MSMBuilder <http://www.msmbuilder.org/>`_ Machine learning for protein
  conformational dynamics time series.

Snippets and tidbits
--------------------

The `wiki <https://github.com/scikit-learn/scikit-learn/wiki/Third-party-projects-and-code-snippets>`_ has more!