Storage Overview

Anaconda Enterprise includes an internal database, git server, and object storage server. All persistent storage is written to disk on the master node.

When projects or deployments run in Enterprise, storage inside their containers is ephemeral. This means that when an editor session or deployment is terminated, whether intentionally or unintentionally, any data written inside the container is lost. To persist data, disk storage from the underlying host is mounted into specific containers.

The default location for all storage-related assets in Enterprise, including database, project, and package storage, is /opt/anaconda/. This location must be backed up frequently or located on a redundant disk array. Refer to the system requirements page for recommended disk space on the master and worker nodes.
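As one possible approach to routine backups, the entire storage directory can be archived into a dated tarball. This is a hedged sketch only; the /backup destination is an assumption, and you should adapt it to your backup target and retention policy:

```shell
# Archive /opt/anaconda (database, git, and object storage) into a
# timestamped tarball. /backup is an assumed destination directory.
tar -czpf /backup/anaconda-storage-$(date +%Y%m%d).tar.gz /opt/anaconda
```

For consistent database backups, consider pausing writes or using a database-level dump in addition to a file-level archive.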

The following sections outline the storage configuration for Enterprise.

Database Storage

Enterprise stores state related to authentication and authorization, deployments, editor sessions, packages, projects, and users.

We include an internal database server that writes to /opt/anaconda/storage/pgdata on the master node.

Git Storage

File storage for projects is backed by git.

We include an internal git server that writes to /opt/anaconda/storage/git on the master node.

Object Storage

The object storage is used for storing conda packages, Anaconda installers, custom Anaconda parcels for Cloudera CDH, and custom Anaconda management packs for Hortonworks HDP.

We include an internal object storage server that writes to /opt/anaconda/storage/object on the master node via an S3-compatible interface.

NFS Storage

Anaconda Enterprise supports an NFS storage option for the Object Storage Service.

To start, configure a PersistentVolume and a PersistentVolumeClaim for NFS on Kubernetes. The settings below define a 300Gi NFS-backed PersistentVolume and claim 100Gi of it for usage.

NFS Kube Setup

apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-storage-remote
  labels:
    volume: nfs-storage-remote
spec:
  capacity:
    storage: 300Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: NFS_HOST_IP
    path: "/var/nfs"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-volume-remote
spec:
  selector:
    matchLabels:
      volume: nfs-storage-remote
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100Gi
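Assuming the two manifests above are saved together in a file named nfs-storage.yaml (the filename is illustrative), they can be applied and verified with kubectl:

```shell
# Create the PersistentVolume and PersistentVolumeClaim defined above.
# nfs-storage.yaml is an assumed filename for the combined manifest.
kubectl apply -f nfs-storage.yaml

# Verify that the claim has bound to the volume (STATUS should be Bound).
kubectl get pv nfs-storage-remote
kubectl get pvc nfs-volume-remote
```

If the claim stays in Pending, check that the NFS server IP and export path in the PersistentVolume match your NFS configuration.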

Object Storage Config

Reconfigure the Kubernetes deployment spec for object storage in ap-object-storage.yaml to use the newly created PersistentVolumeClaim:

volumeMounts:
  - mountPath: /export
    name: nfs-volume
volumes:
  - name: nfs-volume
    persistentVolumeClaim:
      claimName: nfs-volume-remote

After adding the NFS PersistentVolumeClaim to the deployment spec, restart the object storage pod. The NFS path will then be mounted inside the object storage service at /export.
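The restart and verification steps above might look like the following sketch; the pod label and pod name are assumptions and may differ in your installation:

```shell
# Delete the object storage pod; the deployment controller recreates it
# with the new NFS volume mount. The label selector is an assumption.
kubectl delete pod -l app=ap-object-storage

# Confirm the NFS mount inside the recreated pod
# (replace <object-storage-pod> with the actual pod name).
kubectl exec <object-storage-pod> -- df -h /export
```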