Quickstart guide

This guide will help you set up a vanilla Hadoop cluster using the Sahara REST API v1.0.

1. Install Sahara

If you want to hack on the code, follow Setting Up a Development Environment.
If you just want to install and use Sahara, follow the Sahara Installation Guide.

2. Keystone endpoints setup

To use CLI tools, such as OpenStack’s Python clients, we need to set environment variables with addresses and credentials. Let’s assume that Keystone is at 127.0.0.1:5000 with tenant admin and credentials admin:nova, and that the Sahara API is at 127.0.0.1:8386. The following commands set the environment:

$ export OS_AUTH_URL=http://127.0.0.1:5000/v2.0/
$ export OS_TENANT_NAME=admin
$ export OS_USERNAME=admin
$ export OS_PASSWORD=nova

You can append these lines to your .bashrc and run source .bashrc. Now you can get an authentication token from the OpenStack Keystone service:

$ keystone token-get

If authentication succeeds, the output will look as follows:

+-----------+----------------------------------+
|  Property |              Value               |
+-----------+----------------------------------+
|  expires  |       2013-07-08T15:21:18Z       |
|     id    | dd92e3cdb4e1462690cd444d6b01b746 |
| tenant_id | 62bd2046841e4e94a87b4a22aa886c13 |
|  user_id  | 720fb87141a14fd0b204f977f5f02512 |
+-----------+----------------------------------+

Save tenant_id, which is your Tenant ID, and id, which is your authentication token (X-Auth-Token):

$ export AUTH_TOKEN="dd92e3cdb4e1462690cd444d6b01b746"
$ export TENANT_ID="62bd2046841e4e94a87b4a22aa886c13"
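
Instead of copying the two values by hand, you can pull them out of the table with a small shell helper. The following is a minimal sketch, not part of the keystone client: the function name table_value is hypothetical, and it assumes the two-column Property/Value layout shown above.

```shell
# table_value FIELD
# Reads an ASCII table of "| Property | Value |" rows on stdin and prints
# the value whose property name equals FIELD. Prints nothing if the field
# is absent. (Hypothetical helper, assuming the table layout shown above.)
table_value() {
    awk -F'|' -v key="$1" '
        NF >= 4 {
            name = $2
            value = $3
            # Strip the padding spaces around both cells.
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            if (name == key) { print value; exit }
        }'
}
```

With the function defined, the exports could be written as export AUTH_TOKEN=$(keystone token-get | table_value id). Since every keystone token-get call issues a new token, you may prefer to save the table to a file once and read both id and tenant_id from it.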

3. Upload image to Glance

You can download pre-built images with vanilla Apache Hadoop or build the images yourself:

Download and install the pre-built image with Ubuntu 13.10:

$ ssh user@hostname
$ wget http://sahara-files.mirantis.com/sahara-icehouse-vanilla-1.2.1-ubuntu-13.10.qcow2
$ glance image-create --name=sahara-icehouse-vanilla-1.2.1-ubuntu-13.10 \
  --disk-format=qcow2 --container-format=bare < ./sahara-icehouse-vanilla-1.2.1-ubuntu-13.10.qcow2

OR with Fedora 20

$ ssh user@hostname
$ wget http://sahara-files.mirantis.com/sahara-icehouse-vanilla-1.2.1-fedora-20.qcow2
$ glance image-create --name=sahara-icehouse-vanilla-1.2.1-fedora-20 \
  --disk-format=qcow2 --container-format=bare < ./sahara-icehouse-vanilla-1.2.1-fedora-20.qcow2

OR build the image using Building Images for Vanilla Plugin.

Save the image ID. You can get it from the output of glance image-list:

$ glance image-list --name sahara-icehouse-vanilla-1.2.1-ubuntu-13.10
+--------------------------------------+---------------------------------------------+
| ID                                   | Name                                        |
+--------------------------------------+---------------------------------------------+
| 3f9fc974-b484-4756-82a4-bff9e116919b | sahara-icehouse-vanilla-1.2.1-ubuntu-13.10  |
+--------------------------------------+---------------------------------------------+

$ export IMAGE_ID="3f9fc974-b484-4756-82a4-bff9e116919b"

4. Register image in Image Registry

Now we will actually start to interact with Sahara.

$ export SAHARA_URL="http://localhost:8386/v1.0/$TENANT_ID"
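
Every request below relies on the variables exported so far, and an unset variable silently expands to an empty string in the URL or header. As a precaution, a script can check them before sending anything. This is a minimal sketch; require_vars is a hypothetical helper name, not part of Sahara or httpie.

```shell
# require_vars NAME...
# Returns 0 if every named variable is set and non-empty; otherwise prints
# the missing names to stderr and returns 1. Written for POSIX sh, so the
# working variables (missing, name, value) are global.
require_vars() {
    missing=""
    for name in "$@"; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing variables:$missing" >&2
        return 1
    fi
    return 0
}
```

In a script, a line such as require_vars AUTH_TOKEN TENANT_ID IMAGE_ID SAHARA_URL || exit 1 placed before the first request stops early with the list of variables that still need to be exported.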

Install the httpie REST client:

$ sudo pip install httpie

Send a POST request to the Sahara API to register the image with the username ubuntu:

$ http POST $SAHARA_URL/images/$IMAGE_ID X-Auth-Token:$AUTH_TOKEN \
 username=ubuntu

Tag the image:

$ http $SAHARA_URL/images/$IMAGE_ID/tag X-Auth-Token:$AUTH_TOKEN \
 tags:='["vanilla", "1.2.1", "ubuntu"]'

Make sure that the image is registered correctly:

$ http $SAHARA_URL/images X-Auth-Token:$AUTH_TOKEN

The output should look like this:

{
    "images": [
        {
            "OS-EXT-IMG-SIZE:size": 550744576,
            "created": "2013-07-07T15:18:50Z",
            "description": "None",
            "id": "3f9fc974-b484-4756-82a4-bff9e116919b",
            "metadata": {
                "_sahara_description": "None",
                "_sahara_tag_1.2.1": "True",
                "_sahara_tag_ubuntu": "True",
                "_sahara_tag_vanilla": "True",
                "_sahara_username": "ubuntu"
            },
            "minDisk": 0,
            "minRam": 0,
            "name": "sahara-icehouse-vanilla-1.2.1-ubuntu-13.10",
            "progress": 100,
            "status": "ACTIVE",
            "tags": [
                "vanilla",
                "ubuntu",
                "1.2.1"
            ],
            "updated": "2013-07-07T16:25:19Z",
            "username": "ubuntu"
        }
    ]
}

5. Setup NodeGroup templates

Create a file named ng_master_template_create.json with the following content:

{
    "name": "test-master-tmpl",
    "flavor_id": "2",
    "plugin_name": "vanilla",
    "hadoop_version": "1.2.1",
    "node_processes": ["jobtracker", "namenode"]
}

Create a file named ng_worker_template_create.json with the following content:

{
    "name": "test-worker-tmpl",
    "flavor_id": "2",
    "plugin_name": "vanilla",
    "hadoop_version": "1.2.1",
    "node_processes": ["tasktracker", "datanode"]
}
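
The two request bodies differ only in the template name and the process list, so they can also be generated from the shell. This is a minimal sketch, assuming the flavor, plugin and Hadoop version used in this guide; write_ng_template is a hypothetical helper name.

```shell
# write_ng_template FILE NAME PROCESS_1 PROCESS_2
# Writes a NodeGroup template request body to FILE. The flavor, plugin and
# Hadoop version are hard-coded to the values used in this guide.
write_ng_template() {
    cat > "$1" <<EOF
{
    "name": "$2",
    "flavor_id": "2",
    "plugin_name": "vanilla",
    "hadoop_version": "1.2.1",
    "node_processes": ["$3", "$4"]
}
EOF
}

# Produce the same two files as described above.
write_ng_template ng_master_template_create.json test-master-tmpl jobtracker namenode
write_ng_template ng_worker_template_create.json test-worker-tmpl tasktracker datanode
```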

Send POST requests to the Sahara API to upload the NodeGroup templates:

$ http $SAHARA_URL/node-group-templates X-Auth-Token:$AUTH_TOKEN \
 < ng_master_template_create.json

$ http $SAHARA_URL/node-group-templates X-Auth-Token:$AUTH_TOKEN \
 < ng_worker_template_create.json

You can list the available NodeGroup templates by sending the following request to the Sahara API:

$ http $SAHARA_URL/node-group-templates X-Auth-Token:$AUTH_TOKEN

The output should look like this:

{
    "node_group_templates": [
        {
            "created": "2013-07-07T18:53:55",
            "flavor_id": "2",
            "hadoop_version": "1.2.1",
            "id": "b38227dc-64fe-42bf-8792-d1456b453ef3",
            "name": "test-master-tmpl",
            "node_configs": {},
            "node_processes": [
                "jobtracker",
                "namenode"
            ],
            "plugin_name": "vanilla",
            "updated": "2013-07-07T18:53:55",
            "volume_mount_prefix": "/volumes/disk",
            "volumes_per_node": 0,
            "volumes_size": 10
        },
        {
            "created": "2013-07-07T18:54:00",
            "flavor_id": "2",
            "hadoop_version": "1.2.1",
            "id": "634827b9-6a18-4837-ae15-5371d6ecf02c",
            "name": "test-worker-tmpl",
            "node_configs": {},
            "node_processes": [
                "tasktracker",
                "datanode"
            ],
            "plugin_name": "vanilla",
            "updated": "2013-07-07T18:54:00",
            "volume_mount_prefix": "/volumes/disk",
            "volumes_per_node": 0,
            "volumes_size": 10
        }
    ]
}

Save the IDs of the master and worker NodeGroup templates. For example:

Master NodeGroup template id: b38227dc-64fe-42bf-8792-d1456b453ef3
Worker NodeGroup template id: 634827b9-6a18-4837-ae15-5371d6ecf02c

6. Setup Cluster Template

Create a file named cluster_template_create.json with the following content:

{
    "name": "demo-cluster-template",
    "plugin_name": "vanilla",
    "hadoop_version": "1.2.1",
    "node_groups": [
        {
            "name": "master",
            "node_group_template_id": "b38227dc-64fe-42bf-8792-d1456b453ef3",
            "count": 1
        },
        {
            "name": "workers",
            "node_group_template_id": "634827b9-6a18-4837-ae15-5371d6ecf02c",
            "count": 2
        }
    ]
}
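
The node_group_template_id values must be the IDs returned for your own templates in step 5. To avoid pasting them by hand, the file can be generated from shell variables. This is a minimal sketch; the variable names MASTER_TMPL_ID and WORKER_TMPL_ID are assumptions introduced here, and the values are the example IDs from step 5.

```shell
# IDs saved in step 5; replace them with the values from your own deployment.
MASTER_TMPL_ID="b38227dc-64fe-42bf-8792-d1456b453ef3"
WORKER_TMPL_ID="634827b9-6a18-4837-ae15-5371d6ecf02c"

# Write the request body; the shell expands both variables inside the heredoc.
cat > cluster_template_create.json <<EOF
{
    "name": "demo-cluster-template",
    "plugin_name": "vanilla",
    "hadoop_version": "1.2.1",
    "node_groups": [
        {
            "name": "master",
            "node_group_template_id": "$MASTER_TMPL_ID",
            "count": 1
        },
        {
            "name": "workers",
            "node_group_template_id": "$WORKER_TMPL_ID",
            "count": 2
        }
    ]
}
EOF
```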

Send a POST request to the Sahara API to upload the Cluster template:

$ http $SAHARA_URL/cluster-templates X-Auth-Token:$AUTH_TOKEN \
 < cluster_template_create.json

Save the template ID. For example, ce897df2-1610-4caa-bdb8-408ef90561cf.

7. Create cluster

Create a file named cluster_create.json with the following content:

{
    "name": "cluster-1",
    "plugin_name": "vanilla",
    "hadoop_version": "1.2.1",
    "cluster_template_id" : "ce897df2-1610-4caa-bdb8-408ef90561cf",
    "user_keypair_id": "stack",
    "default_image_id": "3f9fc974-b484-4756-82a4-bff9e116919b"
}

The user_keypair_id parameter has the value stack. You can create your own keypair in the Horizon UI or with the command line client:

$ nova keypair-add stack --pub-key $PATH_TO_PUBLIC_KEY

Send a POST request to the Sahara API to create and start the cluster:

$ http $SAHARA_URL/clusters X-Auth-Token:$AUTH_TOKEN \
 < cluster_create.json

Once the cluster has started, you’ll get output similar to this:

{
    "clusters": [
        {
            "anti_affinity": [],
            "cluster_configs": {},
            "cluster_template_id": "ce897df2-1610-4caa-bdb8-408ef90561cf",
            "created": "2013-07-07T19:01:51",
            "default_image_id": "3f9fc974-b484-4756-82a4-bff9e116919b",
            "hadoop_version": "1.2.1",
            "id": "c5e755a2-b3f9-417b-948b-e99ed7fbf1e3",
            "info": {
                "HDFS": {
                    "Web UI": "http://172.24.4.225:50070"
                },
                "MapReduce": {
                    "Web UI": "http://172.24.4.225:50030"
                }
            },
            "name": "cluster-1",
            "node_groups": [
                {
                    "count": 1,
                    "created": "2013-07-07T19:01:51",
                    "flavor_id": "999",
                    "instances": [
                        {
                            "created": "2013-07-07T19:01:51",
                            "instance_id": "4f6dc715-9c65-4d74-bddd-5f1820e6ce02",
                            "instance_name": "cluster-1-master-001",
                            "internal_ip": "10.0.0.5",
                            "management_ip": "172.24.4.225",
                            "updated": "2013-07-07T19:06:07",
                            "volumes": []
                        }
                    ],
                    "name": "master",
                    "node_configs": {},
                    "node_group_template_id": "b38227dc-64fe-42bf-8792-d1456b453ef3",
                    "node_processes": [
                        "jobtracker",
                        "namenode"
                    ],
                    "updated": "2013-07-07T19:01:51",
                    "volume_mount_prefix": "/volumes/disk",
                    "volumes_per_node": 0,
                    "volumes_size": 10
                },
                {
                    "count": 2,
                    "created": "2013-07-07T19:01:51",
                    "flavor_id": "999",
                    "instances": [
                        {
                            "created": "2013-07-07T19:01:52",
                            "instance_id": "11089dd0-8832-4473-a835-d3dd36bc3d00",
                            "instance_name": "cluster-1-workers-001",
                            "internal_ip": "10.0.0.6",
                            "management_ip": "172.24.4.227",
                            "updated": "2013-07-07T19:06:07",
                            "volumes": []
                        },
                        {
                            "created": "2013-07-07T19:01:52",
                            "instance_id": "d59ee54f-19e6-401b-8662-04a156ba811f",
                            "instance_name": "cluster-1-workers-002",
                            "internal_ip": "10.0.0.7",
                            "management_ip": "172.24.4.226",
                            "updated": "2013-07-07T19:06:07",
                            "volumes": []
                        }
                    ],
                    "name": "workers",
                    "node_configs": {},
                    "node_group_template_id": "634827b9-6a18-4837-ae15-5371d6ecf02c",
                    "node_processes": [
                        "tasktracker",
                        "datanode"
                    ],
                    "updated": "2013-07-07T19:01:51",
                    "volume_mount_prefix": "/volumes/disk",
                    "volumes_per_node": 0,
                    "volumes_size": 10
                }
            ],
            "plugin_name": "vanilla",
            "status": "Active",
            "updated": "2013-07-07T19:06:24",
            "user_keypair_id": "stack"
        }
    ]
}
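
Cluster provisioning takes a while, so you may want to check the status field from the shell. The function below is a rough sketch, assuming the response layout shown above; cluster_status is a hypothetical name, and it uses plain text matching rather than a JSON parser, so it simply prints the first status value it sees.

```shell
# cluster_status
# Reads a Sahara clusters response on stdin and prints the first "status"
# value found in it. Plain text matching only: with several clusters in
# the response, only the first one is reported.
cluster_status() {
    grep -o '"status": *"[^"]*"' | head -n 1 | sed 's/.*: *"\(.*\)"/\1/'
}
```

For example, http $SAHARA_URL/clusters X-Auth-Token:$AUTH_TOKEN | cluster_status would print Active for a response like the one above. Repeating the command every few seconds shows when the cluster is ready.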

8. Run MapReduce job

To check that your Hadoop installation works correctly:

Go to the NameNode via ssh:

$ ssh ubuntu@<namenode_ip>

Switch to the hadoop user:

$ sudo su hadoop

Go to the Hadoop home directory and run the simplest MapReduce example:

$ cd /usr/share/hadoop
$ hadoop jar hadoop-examples-1.2.1.jar pi 10 100

Congratulations! You now have a Hadoop cluster running on the OpenStack cloud!
