Yellowbrick: Machine Learning Visualization


Yellowbrick is a suite of visual diagnostic tools called “Visualizers” that extend the Scikit-Learn API to allow human steering of the model selection process. In a nutshell, Yellowbrick combines Scikit-Learn with Matplotlib, in the best tradition of the Scikit-Learn documentation, to produce visualizations for your models. For more on Yellowbrick, please see the About page.

If you’re new to Yellowbrick, check out the Quick Start or skip ahead to the Model Selection Tutorial. Yellowbrick is a rich library, and new Visualizers are added on a regular basis. For details on specific Visualizers and extended usage, head over to the Visualizers and API section. Interested in contributing to Yellowbrick? Check out the contributing guide. If you’ve signed up to do user testing, head over to the User Testing Instructions (and thank you!).

Visualizers

Visualizers are estimators (objects that learn from data) whose primary objective is to create visualizations that allow insight into the model selection process. In Scikit-Learn terms, they can act like transformers when visualizing the data space, or they can wrap a model estimator, similar to how the “ModelCV” methods (e.g. RidgeCV, LassoCV) work. The primary goal of Yellowbrick is to provide a sensible API similar to Scikit-Learn’s. Some of our most popular visualizers include:

Feature Visualization

Classification Visualization

Regression Visualization

Clustering Visualization

  • K-Elbow Plot: select k using the elbow method and various metrics
  • Silhouette Plot: select k by visualizing silhouette coefficient values
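As a rough illustration of the quantities these two plots summarize, the sketch below computes an elbow point and per-sample silhouette coefficients in plain Python. The function names `elbow_k` and `silhouette_values` and the sample data are hypothetical. The elbow rule used here (largest gap below the chord of the normalized curve) is a simplification chosen for illustration and is not necessarily how Yellowbrick locates the elbow.

```python
import math


def elbow_k(ks, scores):
    """Pick k at the bend of a decreasing score curve (e.g. distortion).

    Both axes are rescaled to [0, 1]; the elbow is taken to be the point
    lying furthest below the straight line from the first to the last point.
    Assumes at least two distinct k values and two distinct scores.
    """
    k_min, k_max = min(ks), max(ks)
    s_min, s_max = min(scores), max(scores)
    best_k, best_gap = None, float("-inf")
    for k, s in zip(ks, scores):
        x = (k - k_min) / (k_max - k_min)
        y = (s - s_min) / (s_max - s_min)
        gap = 1.0 - x - y  # proportional to the distance below the chord
        if gap > best_gap:
            best_k, best_gap = k, gap
    return best_k


def silhouette_values(points, labels):
    """Silhouette coefficient (b - a) / max(a, b) for each sample.

    a is the mean distance to the other members of the sample's own cluster,
    b is the mean distance to the members of the nearest other cluster.
    Assumes at least two clusters and no duplicate points.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    values = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            values.append(0.0)  # common convention for singleton clusters
            continue
        # The distance from a point to itself is 0, so divide by len(own) - 1.
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        b = min(
            sum(math.dist(point, q) for q in other) / len(other)
            for other_label, other in clusters.items()
            if other_label != label
        )
        values.append((b - a) / max(a, b))
    return values


# Made-up distortion scores for k = 1..6, for illustration only.
ks = [1, 2, 3, 4, 5, 6]
distortions = [100.0, 40.0, 15.0, 12.0, 10.0, 9.0]
chosen_k = elbow_k(ks, distortions)
print("elbow at k =", chosen_k)

# Two well-separated one-dimensional clusters.
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
labels = [0, 0, 1, 1]
sil = silhouette_values(points, labels)
print("mean silhouette:", sum(sil) / len(sil))
```

Silhouette values close to 1 indicate samples that sit well inside their own cluster, while values near 0 or below suggest overlapping clusters or a poor choice of k.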

Text Visualization

… and more! Visualizers are being added all the time; be sure to check the examples (or even the develop branch) and feel free to contribute your ideas for new Visualizers!
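To make the wrapping idea from the start of this section concrete, here is a minimal sketch in plain Python of an object that wraps a model estimator, delegates `fit` to it, and records what it would draw. The classes `MeanRegressor` and `ResidualsSketch` are hypothetical and exist only for this illustration; they are not Yellowbrick classes, and the text output stands in for a real Matplotlib figure.

```python
class MeanRegressor:
    """Toy estimator (hypothetical): predicts the mean of the training targets."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class ResidualsSketch:
    """Hypothetical visualizer-style wrapper around any fit/predict estimator."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X, y):
        # Delegate learning to the wrapped estimator, then keep the values
        # that a residuals plot would display.
        self.estimator.fit(X, y)
        predictions = self.estimator.predict(X)
        self.residuals_ = [yi - pi for yi, pi in zip(y, predictions)]
        return self

    def predict(self, X):
        # Pass-through, so the wrapper can be used in place of the estimator.
        return self.estimator.predict(X)

    def draw(self):
        # Stand-in for plotting: one text bar per residual.
        lines = []
        for r in self.residuals_:
            lines.append(f"{r:+.1f} " + "#" * round(abs(r)))
        return "\n".join(lines)


X = [[1], [2], [3], [4]]
y = [1.0, 3.0, 5.0, 7.0]

viz = ResidualsSketch(MeanRegressor()).fit(X, y)
print(viz.draw())
```

Because the wrapper exposes the same `fit` and `predict` methods as the estimator it holds, it can be dropped into code that expects the estimator itself, which is the same design idea behind Scikit-Learn's “ModelCV” classes.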

Getting Help

Yellowbrick is a welcoming, inclusive project in the tradition of Matplotlib and Scikit-Learn. Similar to those projects, we try to follow the Python Software Foundation Code of Conduct. Please don’t hesitate to reach out to us for help or if you have any contributions or bugs to report!

The primary way to ask for help with Yellowbrick is to post on our Google Groups Listserv. This is an email list and forum where members of the community can join and respond to one another, and it is usually the quickest way to get a response. Please also consider joining the group so you can respond to questions! You can also ask questions on Stack Overflow and tag them with “yellowbrick”, or you can open issues on GitHub. You can also tweet or direct message us on Twitter @DistrictDataLab.

Open Source

Yellowbrick is released under the open source Apache 2.0 license. Yellowbrick enjoys a very active developer community; please consider joining them and contributing!

Yellowbrick is hosted on GitHub. The issues and pull requests are tracked there.
