Concepts

SEE ALSO: Glossary

Anaconda Project

An Anaconda Project is a portable encapsulation of your data science assets that includes all the configuration needed to automate setup and deployment: packages, channels, environment specifications, file downloads, environment variables, services and runnable commands. It consists of a folder that contains a configuration file named anaconda-project.yml together with scripts, notebooks and other related files.

Data scientists use Projects to encapsulate their data science work and make it easily portable. A project is usually compressed into a .tar.bz2, .tar.gz or .zip file for sharing and storage.
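As a minimal sketch of the packaging step described above, the following Python function packs a project folder into one of the three archive formats using only the standard library. The function name archive_project and the folder names in the usage note are hypothetical, and Enterprise may build its archives differently.

```python
import shutil
from pathlib import Path


def archive_project(project_dir, output_dir, fmt="bztar"):
    """Pack a project folder into a single archive and return the archive path.

    fmt may be "bztar" (.tar.bz2), "gztar" (.tar.gz) or "zip" (.zip).
    """
    project_dir = Path(project_dir).resolve()
    base_name = Path(output_dir).resolve() / project_dir.name
    # root_dir and base_dir keep the project folder as the top-level entry
    # inside the archive, so unpacking recreates the folder itself.
    return shutil.make_archive(
        str(base_name),
        fmt,
        root_dir=str(project_dir.parent),
        base_dir=project_dir.name,
    )
```

For example, archive_project("demo_project", "dist", fmt="zip") would write dist/demo_project.zip, provided that both folders exist and that dist is not inside demo_project.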

Anaconda Enterprise uses the Anaconda Project specification format.

SEE ALSO: Projects.

Channels

The locations of the repositories where Anaconda Enterprise looks for packages using the conda package manager. Channels may point to a Cloud repository or to a private location on a remote or local repository that you or your organization created. The conda command has a default set of channels to search, beginning with https://repo.continuum.io/pkgs/. You may override the default channels, for example to maintain a private or internal channel. In conda commands and in the .condarc file, these default channels are referred to by the channel name “defaults”.

SEE ALSO: Packages and Channels.

Deployments

A deployed Anaconda Project is called a Deployment. You can deploy a Project as an interactive visualization, a live notebook or a machine learning model with an API application.

The software deployment process includes all of the activities that make a software application available for use. When you deploy a project, Anaconda Enterprise finds and builds all of the software dependencies (the programs on which the deployment depends in order to run) and encapsulates them so that the deployment is completely self-contained. This allows you to easily share the deployment with others, because everything they need to deploy and run the project is included.

You configure how a project is deployed by adding a run command in the configuration file anaconda-project.yml and selecting the appropriate deployment command.
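To make the idea of a run command concrete, here is a hedged sketch that writes a minimal anaconda-project.yml from Python. The project name, package list and script name app.py are placeholders, and the keys shown (name, packages, channels, commands with a unix entry) are assumed from the Anaconda Project format; check them against the reference for your Enterprise version before relying on them.

```python
from pathlib import Path

# Illustrative configuration only. The name, packages and app.py script are
# placeholders; the key layout is an assumption based on the Anaconda Project
# format and should be verified against its reference documentation.
CONFIG = """\
name: demo_project

packages:
  - python
  - bokeh

channels:
  - defaults

commands:
  default:
    unix: python app.py
"""


def write_config(project_dir):
    """Write the sketch configuration into project_dir and return its path."""
    path = Path(project_dir) / "anaconda-project.yml"
    path.write_text(CONFIG, encoding="utf-8")
    return path
```

In this sketch the command named default is the run command that a deployment would start.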

SEE ALSO: Deployments.

Interactive data applications

Interactive data applications are visualizations with sliders, drop-downs and other widgets that allow users to interact with them. Widgets can drive new computations, update plots and connect to other programmatic functionality.

With Enterprise, you can create and deploy interactive data applications built using popular libraries such as Bokeh and Shiny.

SEE ALSO: The sample projects, including the Shiny example r_shiny_distribution and the Bokeh example weather_statistics.

Live notebooks

Data scientists use live notebooks to present their work and share it with others. With Enterprise, you can immediately deploy your notebooks with just a few clicks, so that other users can run them and get updated results on demand.

JupyterLab and the classic Jupyter Notebook are web applications that allow you to create and share documents that contain live code, equations, visualizations and explanatory text. Uses for Notebooks include:

  • Data cleaning and transformation.
  • Numerical simulation.
  • Statistical modeling.
  • Machine learning.

JupyterLab is the next-generation notebook, and the classic Jupyter Notebook is the prior generation. To access classic Notebooks, open a project, then open JupyterLab and select “Switch to Classic Notebook”.

For more information, see Jupyter.

Enterprise supports both R and Python notebooks. For more information, see the Python notebook example stocks_live_notebook and the R notebook example stocks_live_notebook in the anaconda-enterprise channel.

Packages

Software files and information about the software, such as its name, the specific version and a package description, that are bundled into a file that can be installed and managed by a package manager.

SEE ALSO: Packages.

REST APIs

A common way to operationalize machine learning models is through REST APIs. REST APIs are callable URLs that return results based on a query. This allows developers to make their applications intelligent without having to write models themselves.

RESTful endpoints are a great way to bridge the gap between the data scientists writing machine learning models and the developers writing end-to-end applications. With Enterprise REST API deployment, applications can request predictions as a service.
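The idea of requesting predictions through a callable URL can be sketched with the Python standard library alone. Everything named here is hypothetical: predict is a stand-in for a trained model, /predict and the query parameter x are invented for illustration, and this is not how the quote_api sample project or an Enterprise deployment is implemented.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


def predict(x):
    """Hypothetical stand-in for a trained model: a fixed linear rule."""
    return 2.0 * x + 1.0


class PredictHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        url = urlparse(self.path)
        if url.path != "/predict":
            self.send_error(404, "unknown endpoint")
            return
        try:
            x = float(parse_qs(url.query)["x"][0])
        except (KeyError, ValueError):
            self.send_error(400, "expected a numeric query parameter x")
            return
        body = json.dumps({"x": x, "prediction": predict(x)}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def query_once(x):
    """Serve on a free local port, request one prediction, then shut down."""
    server = HTTPServer(("127.0.0.1", 0), PredictHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
        conn.request("GET", f"/predict?x={x}")
        payload = conn.getresponse().read()
        conn.close()
        return json.loads(payload.decode("utf-8"))
    finally:
        server.shutdown()
        server.server_close()
```

The client side needs only the URL and the query parameter; the model stays behind the endpoint, which is the separation between data scientists and application developers described above.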

For more information on REST APIs, look in the sample projects for the quote_api project, and see the tutorial on creating a REST API.

User scopes

Anaconda Enterprise brings together many types of users. Users who simply need to view output are called viewers.

Those who want to run applications that are already deployed are called business analysts.

Users who write and deploy applications are called data scientists.

Those who install, configure and administer Anaconda Enterprise, usually IT administrators, are called administrators.

These roles may overlap. For example, a user who is assigned an administrative role may also be referred to as a superuser.

Version control

Anaconda Enterprise uses version control to track changes in your project files and to coordinate the work on those files among multiple users. Enterprise lets you know when files have been changed, so you can choose which version to use.

You can edit and save files locally, and then upload or commit them to the server. Before uploading, Enterprise checks your version against the version on the server. If your collaborators have made changes that conflict with yours, you will see a warning. You can review the changes and either cancel your commit or override the warning and overwrite your collaborators’ version.
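The check before an upload can be illustrated with a small sketch. This is not Enterprise's implementation: the function check_upload, its three arguments and its return values are hypothetical, and it assumes that the client remembers the file content from its last sync with the server.

```python
import hashlib


def digest(content):
    """Return a fingerprint of the file content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def check_upload(base, local, server):
    """Classify an upload attempt for one file.

    base   - content when you last synced with the server
    local  - your edited content
    server - content currently on the server
    """
    if digest(local) == digest(server):
        return "up-to-date"  # nothing to upload
    if digest(server) == digest(base):
        return "safe"  # only you changed the file
    return "conflict"  # a collaborator changed it too, so warn first
```

A "conflict" result corresponds to the warning described above: the user can cancel the commit, or proceed and overwrite the server version.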

Web applications

A software program that can be run in a browser. Python is the leading open source data science language, and it is also widely used for web development. Developers and data scientists can write intelligent web applications together using Python, improving collaboration. Popular Python web frameworks include Flask and Django.

For more information, look in the Example Projects for the Flask example image_classifier_flask.