.. _related_projects:

=====================================
Related Projects
=====================================

Below is a list of sister projects, extensions, and domain-specific packages.

Related Packages
----------------
Other packages useful for data analysis and machine learning.

- `Pandas <http://pandas.pydata.org>`_ Tools for working with heterogeneous and
  columnar data, relational queries, time series and basic statistics.

- `sklearn_pandas <https://github.com/paulgb/sklearn-pandas/>`_ Bridge between
  scikit-learn pipelines and pandas data frames, with dedicated transformers.

- `Scikit-Learn Laboratory
  <https://skll.readthedocs.org/en/latest/index.html>`_  A command-line
  wrapper around scikit-learn that makes it easy to run machine learning
  experiments with multiple learners and large feature sets.

- `theano <http://deeplearning.net/software/theano/>`_ A CPU/GPU array
  processing framework geared towards deep learning research.

- `Statsmodels <http://statsmodels.sourceforge.net/>`_ Estimating and analysing
  statistical models. More focused on statistical tests and less on prediction
  than scikit-learn.

- `PyMC <http://pymc-devs.github.io/pymc/>`_ Bayesian statistical models and fitting algorithms.

- `sklearn_theano <http://sklearn-theano.github.io/>`_ scikit-learn-compatible
  estimators, transformers, and datasets which use Theano internally.


Extensions and Algorithms
-------------------------
Libraries that provide a scikit-learn-like interface and can be used with
scikit-learn tools.

- `pylearn2 <http://deeplearning.net/software/pylearn2/>`_ A deep learning and
  neural network library built on Theano with a scikit-learn-like interface.

- `lightning <http://www.mblondel.org/lightning/>`_ Fast state-of-the-art
  linear model solvers (SDCA, AdaGrad, SVRG, SAG, etc.).

- `Seqlearn <https://github.com/larsmans/seqlearn>`_  Sequence classification
  using HMMs or structured perceptron.

- `HMMLearn <https://github.com/hmmlearn/hmmlearn>`_ Implementation of hidden
  Markov models that was previously part of scikit-learn.

- `PyStruct <https://pystruct.github.io>`_ General conditional random fields
  and structured prediction.

- `py-earth <https://github.com/jcrudy/py-earth>`_ Multivariate adaptive regression splines.

- `sklearn-compiledtrees <https://github.com/ajtulloch/sklearn-compiledtrees/>`_
  Generate a C++ implementation of the predict function for decision trees (and
  ensembles) trained by sklearn. Useful for latency-sensitive production
  environments.

- `lda <https://github.com/ariddell/lda/>`_ Fast implementation of Latent
  Dirichlet Allocation in Cython.

- `Sparse Filtering <https://github.com/jmetzen/sparse-filtering>`_
  Unsupervised feature learning based on sparse filtering.

- `Kernel Regression <https://github.com/jmetzen/kernel_regression>`_
  Implementation of Nadaraya-Watson kernel regression with automatic bandwidth
  selection.


Domain Specific Packages
-------------------------
- `scikit-image <http://scikit-image.org/>`_ Image processing and computer vision in Python.
- `Natural language toolkit (nltk) <http://www.nltk.org/>`_ Natural language processing and some machine learning.
- `NiLearn <https://nilearn.github.io/>`_ Machine learning for neuroimaging.
- `AstroML <http://www.astroml.org/>`_  Machine learning for astronomy.
- `MSMBuilder <http://www.msmbuilder.org/>`_  Machine learning for protein conformational dynamics time series.



Snippets and tidbits
---------------------
The `wiki <https://github.com/scikit-learn/scikit-learn/wiki/Third-party-projects-and-code-snippets>`_ has more!