Kerberos Configuration

Many Hadoop installations are secured using Kerberos. To authenticate with Kerberos, your system administrator must provide at least one configuration file, normally located at /etc/krb5.conf. You need this file to connect to a Kerberized cluster.

To use the krb5.conf file, add it to your universal project settings. Use the anaconda-enterprise-cli tool for this:

anaconda-enterprise-cli spark-config --config /etc/krb5.conf krb5.conf

NOTE: To use Sparkmagic, you must also provide a Sparkmagic configuration file. In this case, pass the --config flag twice in the previous command, once for each file:

anaconda-enterprise-cli spark-config --config /etc/krb5.conf krb5.conf \
                  --config /opt/continuum/.sparkmagic/config.json config.json

This creates a YAML file, anaconda-config-files-secret.yaml, with the data converted for AE5.

Next, upload the file:

sudo kubectl replace -f anaconda-config-files-secret.yaml

With this in place, /etc/krb5.conf is populated with the appropriate data whenever a new project is created.
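As a quick sanity check from a terminal in a newly created project, a sketch like the following can confirm that the file arrived. The helper name check_krb5_conf is hypothetical (it is not part of AE5 or Kerberos), and the sketch assumes the file uses the usual "default_realm = REALM" line:

```shell
# Hypothetical helper: report whether a krb5.conf file exists and is non-empty,
# and print its default_realm setting if one is present.
check_krb5_conf() {
    conf="${1:-/etc/krb5.conf}"
    if [ ! -s "$conf" ]; then
        echo "missing or empty: $conf"
        return 1
    fi
    # Assumption: default_realm appears on its own "key = value" line.
    realm=$(sed -n 's/^[[:space:]]*default_realm[[:space:]]*=[[:space:]]*//p' "$conf" | head -n 1)
    echo "found $conf (default_realm: ${realm:-not set})"
}
```

Calling check_krb5_conf with no argument inspects /etc/krb5.conf; pass a different path to inspect another file.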

Authenticating

Before you start

Contact your administrator to get your Kerberos principal, which is the combination of your username and security domain.

To authenticate, open an environment-based terminal in the interface. The terminal launcher is normally the right-most icon in the bottom row of the Launchers panel.

When the interface appears, execute this command:

kinit myname@DOMAIN.COM

Replace myname@DOMAIN.COM with the Kerberos principal provided by your administrator.

The command prompts you for a password. If no error message appears, authentication has succeeded. To verify, run the klist command; if it lists one or more entries, you hold a ticket.
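The klist check can also be scripted. The sketch below assumes the -s option as documented for MIT Kerberos, where klist prints nothing and reports the result through its exit status; the helper name have_ticket is hypothetical:

```shell
# Hypothetical helper: report whether the credential cache holds a valid ticket.
# Assumes `klist -s` (MIT Kerberos): silent, exit status 0 if a readable,
# unexpired credential cache exists, non-zero otherwise.
have_ticket() {
    if ! command -v klist >/dev/null 2>&1; then
        echo "klist not found on PATH"
        return 2
    fi
    if klist -s; then
        echo "valid ticket found"
    else
        echo "no valid ticket; run kinit"
        return 1
    fi
}
```

Run have_ticket after kinit; a non-zero return status means you need to authenticate again.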

You can also authenticate with a keytab instead of a password. Upload the keytab to your project and execute a command like this:

kinit -kt mykeytab.keytab myname@DOMAIN.COM

NOTE: Kerberos authentication lapses after some time, after which you must repeat the above process. The ticket lifetime is determined by your cluster security administration, and on many clusters is set to 24 hours.
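If you authenticate with a keytab, re-authentication after a lapse can be scripted. This is a minimal sketch, not an AE5 feature: the helper name ensure_ticket is hypothetical, and it assumes the MIT Kerberos behavior of klist -s (exit status only) and kinit -kt (read the key from the named keytab):

```shell
# Hypothetical helper: obtain a new ticket from a keytab only when the
# current one is missing or expired.
# Usage: ensure_ticket PRINCIPAL KEYTAB
ensure_ticket() {
    principal="$1"
    keytab="$2"
    # `klist -s` is silent and returns 0 while a valid ticket exists.
    if klist -s; then
        echo "ticket still valid"
        return 0
    fi
    # No valid ticket: authenticate non-interactively with the keytab.
    if kinit -kt "$keytab" "$principal"; then
        echo "obtained a new ticket for $principal"
    else
        echo "kinit failed for $principal"
        return 1
    fi
}
```

For example, ensure_ticket myname@DOMAIN.COM mykeytab.keytab could be run at the start of a session before connecting to the cluster.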