Mirroring the Anaconda repository

You can create a local copy of the Anaconda repository. The mirror can be complete or partial, and it can include or exclude specific packages or types of packages. You can also create a mirror in an air-gapped environment.

This page explains how to use Anaconda Enterprise’s convenient syncing tools to create and configure local mirrors for Anaconda’s Python and R packages.

NOTE: It can take hours to mirror the full repository.

Before you begin

Before you create a mirror, finish installing and configuring Anaconda Enterprise.

  1. Install anaconda-enterprise-cli

    The Anaconda Enterprise installer tarball contains a cas_mirror-VERSION-linux-64.sh script, which is a custom Miniconda installer that includes the anaconda-enterprise-cli package.

    NOTE: bzip2 is required to install packages from Miniconda.

    Navigate to the directory where you downloaded and extracted the Anaconda Enterprise installer, then install the bootstrap Miniconda environment to ~/cas-mirror:

    $ cd anaconda-enterprise
    $ ./cas_mirror-5.x.x.x-linux-64.sh
    Welcome to cas_mirror <anaconda-enterprise-installer_version>
    [...]
    

    NOTE: Replace “5.x.x.x” with your actual version number.

    The installer prompts “In order to continue the installation process, please review the license agreement.” Press Enter to view the license terms. Scroll to the bottom of the license terms and enter “Yes” to agree.

    The installer prompts you to press Enter to accept the default install location, press CTRL-C to cancel the installation, or specify an alternate installation directory. We recommend that you accept the default install location.

    The installer prompts “Do you wish the installer to prepend the cas_mirror install location to PATH in your /home/centos/.bashrc ?”. We recommend “yes”.

    The installer finishes and displays “Thank you for installing cas_mirror!”. Close and reopen your terminal window for the installation to take effect.

    In your new terminal window, activate the custom Miniconda environment with the following command:

    source ~/cas-mirror/bin/activate
    
  2. Configure Anaconda URL

    Configure the Anaconda URL using the following commands:

    anaconda-enterprise-cli config set sites.master.url https://anaconda.example.com:30089/api
    anaconda-enterprise-cli config set default_site master
    

    NOTE: Always replace anaconda.example.com with the domain name you are using.

  3. Verify and add SSL certificates

    If the root CA is contained in the certificate bundle at /etc/pki/tls/certs/ca-bundle.crt, use openssl to verify the certificates and make sure the final Verify return code is 0:

    openssl s_client -connect anaconda.example.com:30089 -CAfile /etc/pki/tls/certs/ca-bundle.crt
    ...
        Verify return code: 0 (ok)
    

    If you are using privately signed certificates, extract the rootca, then use openssl to verify the certificates and make sure the final Verify return code is 0:

    openssl s_client -connect anaconda.example.com:30089 -CAfile rootca.crt
    ...
        Verify return code: 0 (ok)
    

    Configure the SSL certificates for the repository using the following commands:

    $ anaconda-enterprise-cli config set ssl_verify true
    
    # On Ubuntu
    $ anaconda-enterprise-cli config set sites.master.ssl_verify /etc/ssl/certs/ca-certificates.crt
    
    # On CentOS/RHEL
    $ anaconda-enterprise-cli config set sites.master.ssl_verify /etc/pki/tls/certs/ca-bundle.crt
    

    NOTE: If you are using a self-signed certificate or a certificate signed by a private CA, extract the rootca. Then either:

    • Use the anaconda-enterprise-cli config set sites.master.ssl_verify command to add that root certificate, or
    • Add that root certificate to the default Ubuntu or CentOS/RedHat trusted CA bundles.
  4. Log into Anaconda Enterprise as an existing user using the following command:

    $ anaconda-enterprise-cli login
    Username: anaconda-enterprise
    Password:
    Logged anaconda-enterprise in!
    

NOTE: If Anaconda Enterprise 5 is installed in a proxied environment, see Mirroring in an environment with a proxy for information on setting the NO_PROXY variable.

Extracting self-signed SSL certificates

You may need the temporary self-signed Anaconda Enterprise certificates for later use. For example, when installing the anaconda-enterprise-cli tool, you will need to configure it to point to the self-signed certificate authority.

First, enter the Anaconda Enterprise environment:

sudo gravity enter

Then, run the following command for each certificate file you wish to extract, replacing both occurrences of rootca.crt with the name of the specific file:

kubectl get secrets certs -o go-template='{{index .data "rootca.crt"}}' | base64 -d > /ext/share/rootca.crt

Once you have run this command, the file will be available on the master node filesystem at /var/lib/gravity/planet/share/<filename>.

The available certificate files are:
  • rootca.crt: the root certificate authority bundle
  • server.crt: the SSL certificate for individual services
  • server.key: the private key for the above certificate
  • wildcard.crt: the SSL certificate for “wildcard” services, such as deployed apps and spaces
  • wildcard.key: the private key for the above certificate
  • keystore.jks: the Java Key Store containing these certificates used by some services
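
The kubectl command above prints the secret field in base64-encoded form, and base64 -d turns it back into the original file contents. The following minimal Python sketch shows only that decoding step. The function name decode_secret_field and the placeholder payload are assumptions for illustration, not part of Anaconda Enterprise:

```python
import base64


def decode_secret_field(encoded: str) -> bytes:
    """Decode one base64-encoded secret field, like `base64 -d` does."""
    return base64.b64decode(encoded)


# Hypothetical stand-in for the certificate text; not a real certificate.
sample_pem = b"-----BEGIN CERTIFICATE-----\nPLACEHOLDER\n-----END CERTIFICATE-----\n"

# Simulate the encoded value that the go-template would print.
encoded_value = base64.b64encode(sample_pem).decode("ascii")

decoded = decode_secret_field(encoded_value)
print(decoded.decode("ascii").splitlines()[0])
```

If the decoded output does not begin with a PEM header such as the one printed here, the wrong field name was probably requested.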

Mirror Anaconda

An example configuration file for mirroring the default Anaconda packages for the linux-64 platform is shown below. This file is also included in the mirror tool installation:

# This is destination channel of mirrored packages on your local repository.
dest_channel: anaconda

# conda packages from these channels are mirrored to dest_channel on your local repository.
channels:
  - https://repo.continuum.io/pkgs/main/
  - https://repo.continuum.io/pkgs/free/
  - https://repo.continuum.io/pkgs/pro/

# if doing a mirror from an airgap tarball, the channels should point to the tarball:
# channels:
#   - file:///path-to-expanded-tarball/repo-mirrors-<date>/anaconda-suite/pkgs/

# Only conda packages of these platforms are mirrored.
# Omitting this will mirror packages for all platforms available on specified channels.
# If the repository will only be used to install packages on the v5 system, it only needs linux-64 packages.
platforms:
  - linux-64

Mirror the contents of the repository:

cas-sync-api-v5 --file ~/cas-mirror/etc/anaconda-platform/mirrors/anaconda.yaml

This mirrors all of the packages from the Anaconda repository into the anaconda channel. If the channel does not already exist, it will be automatically created and shared with all authenticated users.

You can customize the permissions on the mirrored packages by sharing the channel.

Verify in your browser by logging into your account and navigating to the Packages tab. You should see a list of the mirrored packages.
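
If you maintain several mirror files, generating them from a script can help keep them consistent. The sketch below renders the three keys used in the example file (dest_channel, channels, platforms) as YAML text with only the standard library. The function name render_mirror_config is hypothetical, and the sketch assumes simple values that need no YAML quoting:

```python
from typing import Optional, Sequence


def render_mirror_config(
    dest_channel: str,
    channels: Sequence[str],
    platforms: Optional[Sequence[str]] = None,
) -> str:
    """Return YAML text for a mirror file using the keys shown in this guide."""
    lines = [f"dest_channel: {dest_channel}", "channels:"]
    lines += [f"  - {url}" for url in channels]
    if platforms:
        # Leaving platforms out mirrors every platform on the source channels.
        lines.append("platforms:")
        lines += [f"  - {name}" for name in platforms]
    return "\n".join(lines) + "\n"


config_text = render_mirror_config(
    dest_channel="anaconda",
    channels=[
        "https://repo.continuum.io/pkgs/main/",
        "https://repo.continuum.io/pkgs/free/",
    ],
    platforms=["linux-64"],
)
print(config_text)
```

Write the returned text to a file of your choice and pass that file to cas-sync-api-v5 with the --file argument.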

Mirror R packages

An example configuration file for R packages is also provided:

# This is destination channel of mirrored packages on your local repository.
dest_channel: r

# conda packages from these channels are mirrored to dest_channel on your local repository.
channels:
  - https://repo.continuum.io/pkgs/r/

# if doing a mirror from an airgap tarball, the channels should point to the tarball:
# channels:
#   - file:///path-to-expanded-tarball/repo-mirrors-<date>/r/pkgs/

# Only conda packages of these platforms are mirrored.
# Omitting this will mirror packages for all platforms available on specified channels.
# If the repository will only be used to install packages on the v5 system, it only needs linux-64 packages.
platforms:
  - linux-64

Mirror the contents of the R repository:

cas-sync-api-v5 --file ~/cas-mirror/etc/anaconda-platform/mirrors/r.yaml

Configure conda

After creating the mirror, configure Anaconda Enterprise to add the new mirrored channel to the default channels. This makes the packages available to users of the project editing and deployment features. To do this, edit your Anaconda Enterprise configuration to include the appropriate channels:

conda:
  channels:
  - defaults
  default-channels:
  - anaconda
  - r
  channel-alias: https://<anaconda.example.com>:30089/conda

NOTE: Replace <anaconda.example.com> with the actual URL to your installation of Anaconda Enterprise.

NOTE: The ap-spaces service must be restarted for the configuration change to take effect on new project editor sessions.
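
To see why channel-alias matters, it helps to know that conda expands a bare channel name such as anaconda by prefixing it with the channel alias. The following is a simplified model of that expansion, not conda's actual implementation; conda also appends a platform subdirectory and treats some names, such as defaults, specially. The function name channel_url is an assumption for illustration:

```python
def channel_url(channel_alias: str, channel: str) -> str:
    """Expand a bare channel name against the alias; leave full URLs alone."""
    if "://" in channel:
        return channel
    return f"{channel_alias.rstrip('/')}/{channel}"


alias = "https://anaconda.example.com:30089/conda"
for name in ("anaconda", "r"):
    print(channel_url(alias, name))
```

With the configuration above, the mirrored channels would therefore be looked up under the /conda path of your Anaconda Enterprise installation.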

Share channels

To make your new channels visible to your users in the web interface Packages list, share the channels with them.

EXAMPLE: To share new channels “anaconda” and “r” with group ‘everyone’ for read access:

anaconda-enterprise-cli channels share --group everyone --level r anaconda
anaconda-enterprise-cli channels share --group everyone --level r r

After running the share command, verify the result by logging in to the user interface and viewing the Packages list.

SEE ALSO: Creating and sharing channels

Partial mirror

Alternatively, you may not wish to mirror all packages. You can specify which platforms to include, or use the whitelist, blacklist or license_blacklist functionality to control which packages are mirrored. Edit a copy of one of the provided mirror files, then pass the edited file to the mirroring tool:

cas-sync-api-v5 --file ~/my-custom-anaconda.yaml

In an air-gapped environment

To mirror the repository in a system with no internet access, create a local copy of the repository using a USB drive provided by Anaconda, and point cas-sync-api-v5 to the extracted tarball.

First, mount the USB drive and extract the tarball. In this example we will extract to /tmp:

cd /tmp
tar xvf <path to>/mirror.tar

NOTE: Replace <path to> with the actual path to the mirror file.

Now you have a local file-system repository located at /tmp/mirror/pkgs. You can mirror this repository. Edit ~/cas-mirror/etc/anaconda-platform/mirrors/anaconda.yaml to contain:

channels:
  - /tmp/mirror/pkgs

And then run the command:

cas-sync-api-v5 --file ~/cas-mirror/etc/anaconda-platform/mirrors/anaconda.yaml

This mirrors the contents of the local file-system repository to your Anaconda Enterprise installation under the username ‘anaconda.’
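
The example configuration files earlier on this page write tarball channels as file:/// URLs rather than bare paths. If you want that form, a small helper can derive it from the extraction directory. This is only a convenience sketch; the helper name channel_file_url is hypothetical:

```python
from urllib.parse import quote


def channel_file_url(local_dir: str) -> str:
    """Convert an absolute POSIX directory path to a file:// URL."""
    if not local_dir.startswith("/"):
        raise ValueError("expected an absolute path")
    # quote() percent-encodes spaces and other unsafe characters, keeping "/".
    return "file://" + quote(local_dir)


print(channel_file_url("/tmp/mirror/pkgs"))  # file:///tmp/mirror/pkgs
```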

Mirror configuration options

remote_url

Specifies the remote URL from which the conda packages and the Anaconda and Miniconda installers are downloaded. The default value is: https://repo.continuum.io/.

channels

Specifies the remote channels from which conda packages are downloaded. The default is a list of two channels: <remote_url>/pkgs/free/ and <remote_url>/pkgs/pro/.

All specification information should be included in the same file, and can be passed to the cas-sync-api-v5 command via the --file argument:

cas-sync-api-v5 --file ~/cas-mirror/etc/anaconda-platform/mirrors/anaconda.yaml

destination channel

The configuration option dest_channel specifies where files will be uploaded. The default value is: anaconda

SSL verification

The mirroring tool uses two different settings for configuring SSL verification. When the mirroring tool connects to its destination, it uses the ssl_verify setting from anaconda-enterprise-cli to determine how to validate certificates. For example, to use a custom certificate authority:

anaconda-enterprise-cli config set sites.master.ssl_verify /etc/ssl/certs/ca-certificates.crt

The mirroring tool uses conda’s configuration to determine how to validate certificates when connecting to the source that it is pulling packages from. For example, to disable certificate validation when connecting to the source:

conda config --set ssl_verify false

Mirroring in an environment with a proxy

If Anaconda Enterprise 5 is installed in a proxied environment, set the NO_PROXY variable. This ensures the mirroring tool does not use the proxy when communicating with the repository service, and prevents errors such as “Max retries exceeded”, “Cannot connect to proxy”, and “Tunnel connection failed: 503 Service Unavailable”.

export NO_PROXY=<master-node-domain-name>

Platform-specific mirroring

By default, the cas-sync-api-v5 tool mirrors all platforms. If you do not need all platforms, edit the YAML file to specify the platform(s) you want mirrored:

platforms:
  - linux-64
  - win-32

Package-specific mirroring

In some cases you may want to mirror only a small subset of the repository. Rather than blacklisting a long list of packages you do not want mirrored, you can instead simply enumerate the list of packages you DO want mirrored.

NOTE: The pkg_list argument cannot be used with the blacklist, whitelist or license_blacklist arguments. It can be used with the platforms argument.

EXAMPLE:

pkg_list:
  - accelerate
  - pyqt
  - zope

This example mirrors only three packages: Accelerate, PyQt and Zope. All other packages are ignored.
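
Because pkg_list cannot be combined with the other package filters, it can be worth checking a configuration before starting a long mirror run. The sketch below assumes the mirror file has already been loaded into a Python dict. The function name check_pkg_list_conflicts is hypothetical and is not part of the mirroring tool:

```python
EXCLUSIVE_WITH_PKG_LIST = ("blacklist", "whitelist", "license_blacklist")


def check_pkg_list_conflicts(config: dict) -> None:
    """Raise ValueError if pkg_list is combined with an incompatible filter."""
    if "pkg_list" not in config:
        return
    conflicts = [key for key in EXCLUSIVE_WITH_PKG_LIST if key in config]
    if conflicts:
        raise ValueError("pkg_list cannot be combined with: " + ", ".join(conflicts))


# platforms may accompany pkg_list, so this configuration passes the check.
ok_config = {"platforms": ["linux-64"], "pkg_list": ["accelerate", "pyqt", "zope"]}
check_pkg_list_conflicts(ok_config)

# pkg_list together with blacklist is rejected.
bad_config = {"pkg_list": ["pyqt"], "blacklist": ["tk"]}
try:
    check_pkg_list_conflicts(bad_config)
except ValueError as err:
    print(err)
```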

Python version-specific mirroring

Mirror the repository with a Python version or versions specified.

EXAMPLE:

python_versions:
  - 3.3

This example mirrors only Anaconda packages built for Python 3.3.

License blacklist mirroring

The mirroring script supports license blacklisting for the following license families:

AGPL
GPL2
GPL3
LGPL
BSD
MIT
Apache
PSF
Public-Domain
Proprietary
Other

EXAMPLE:

license_blacklist:
  - GPL2
  - GPL3
  - BSD

This example mirrors all the packages in the repository EXCEPT those that are GPL2-, GPL3-, or BSD-licensed, because those three licenses have been blacklisted.
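
A misspelled license family is easy to miss in a long list. The sketch below checks license_blacklist entries against the families listed above. It assumes an exact, case-sensitive match, which is an assumption rather than documented behavior of the mirroring tool, and the function name unknown_license_families is hypothetical:

```python
LICENSE_FAMILIES = {
    "AGPL", "GPL2", "GPL3", "LGPL", "BSD", "MIT",
    "Apache", "PSF", "Public-Domain", "Proprietary", "Other",
}


def unknown_license_families(license_blacklist):
    """Return the entries that are not in the supported license family list."""
    return [name for name in license_blacklist if name not in LICENSE_FAMILIES]


print(unknown_license_families(["GPL2", "GPL3", "BSD"]))  # []
print(unknown_license_families(["GPL2", "GPLv3"]))        # ['GPLv3']
```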

Blacklist mirroring

The blacklist allows access to all packages EXCEPT those explicitly listed.

EXAMPLE:

blacklist:
  - bzip2
  - tk
  - openssl

This example mirrors the entire repository except the bzip2, Tk, and OpenSSL packages.

Whitelist mirroring

The whitelist functions in combination with either the license_blacklist or blacklist arguments, and re-adds packages that were excluded by a previous argument.

EXAMPLE:

license_blacklist:
  - GPL2
  - GPL3
whitelist:
  - readline

This example mirrors the entire repository EXCEPT any GPL2- or GPL3-licensed packages, but it does include readline, even though readline is GPL3-licensed.

Combining multiple mirror configurations

You may find that combining two or more of the arguments above is the easiest way to get the exact combination of packages that you want.

The platforms argument is evaluated before any other argument.

EXAMPLE: This example mirrors only Linux-64 distributions of the dnspython, Shapely and GDAL packages:

platforms:
  - linux-64
pkg_list:
  - dnspython
  - shapely
  - gdal

If the license_blacklist and blacklist arguments are combined, the license_blacklist is evaluated first, and the blacklist is a supplemental modifier.

EXAMPLE: In this example, the mirror configuration does not mirror GPL2-licensed packages. It also does not mirror the GPL3-licensed package pyqt, because that package has been blacklisted. It does mirror all other packages in the repository:

license_blacklist:
  - GPL2
blacklist:
  - pyqt

If the blacklist and whitelist arguments are both employed, the blacklist is evaluated first, with the whitelist functioning as a modifier.

EXAMPLE: This example mirrors all packages in the repository except astropy and pygments. Despite being listed on the blacklist, accelerate is mirrored because it is listed on the whitelist.

blacklist:
  - accelerate
  - astropy
  - pygments
whitelist:
  - accelerate
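
The evaluation order described in this section can be sketched in Python as follows. This is an illustrative model of the rules as stated on this page, not the mirroring tool's actual code. The function name select_packages, the package dict layout, and the sample metadata are assumptions made for the example:

```python
def select_packages(packages, config):
    """Return the names of packages that would be mirrored.

    packages: iterable of dicts with "name", "platform" and "license_family".
    config:   dict using the mirror file keys described on this page.
    """
    platforms = config.get("platforms")
    pkg_list = config.get("pkg_list")
    license_blacklist = set(config.get("license_blacklist", []))
    blacklist = set(config.get("blacklist", []))
    whitelist = set(config.get("whitelist", []))

    selected = []
    for pkg in packages:
        # 1. platforms is evaluated before any other argument.
        if platforms is not None and pkg["platform"] not in platforms:
            continue
        # 2. pkg_list keeps only the enumerated packages.
        if pkg_list is not None:
            if pkg["name"] in pkg_list:
                selected.append(pkg["name"])
            continue
        # 3. license_blacklist first, then blacklist, mark exclusions.
        excluded = (
            pkg["license_family"] in license_blacklist
            or pkg["name"] in blacklist
        )
        # 4. whitelist re-adds packages excluded by the earlier arguments.
        if excluded and pkg["name"] not in whitelist:
            continue
        selected.append(pkg["name"])
    return selected


# Illustrative metadata only; real values come from the source channels.
sample = [
    {"name": "readline", "platform": "linux-64", "license_family": "GPL3"},
    {"name": "pyqt", "platform": "linux-64", "license_family": "GPL3"},
    {"name": "astropy", "platform": "linux-64", "license_family": "BSD"},
    {"name": "astropy", "platform": "win-32", "license_family": "BSD"},
]

license_example = {
    "platforms": ["linux-64"],
    "license_blacklist": ["GPL2", "GPL3"],
    "whitelist": ["readline"],
}
print(select_packages(sample, license_example))  # ['readline', 'astropy']

blacklist_example = {
    "platforms": ["linux-64"],
    "blacklist": ["pyqt", "astropy"],
    "whitelist": ["pyqt"],
}
print(select_packages(sample, blacklist_example))  # ['readline', 'pyqt']
```

In both examples the whitelist only affects packages that an earlier argument excluded, which matches its description as a modifier.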