13. Deploying Streams and Tasks on Apache Mesos and Marathon/Chronos

In this getting started the Data Flow Server is running as a Docker container in Marathon and so are all the dependent services like a relational database for stream and task repositories, a message bus for stream apps and the key value store for analytics.

  1. Deploy a Mesos and Marathon cluster.

    The Mesosphere getting started guide provides a number of options for you to deploy a cluster. There is also a number of options listed on Mesosphere’s Install DC/OS page. In Appendix B, Test Cluster we describe how we configured a local test cluster using the DC/OS Vagrant project.

    The rest of this getting started guide assumes that you have a working Mesos and Marathon cluster and know the Marathon endpoint URL.

    We are using the Marathon endpoint URL of http://m1.dcos/service/marathon for this document.

  2. Create a MySQL service on the Mesos cluster.

    The mysql service will be used for storing stream and task definitions in the stream and task repositories. There is a sample application JSON file for MySQL in the spring-cloud-dataflow-server-mesos repository that you can use as a starting point. The service discovery mechanism is currently disabled so you need to look up the host and port to use for the connection. Depending on how large your cluster is, you may want to tweak the CPU and/or memory values.

    Using the above JSON file and an Mesos and Marathon cluster installed you can deploy a Rabbit MQ application instance by issuing the following command

    curl -X POST http://m1.dcos/service/marathon/v2/apps -d @mysql.json -H "Content-type: application/json"
    [Note]Note

    Note the @ symbol to reference a file input for the curl command.

  3. Create a Rabbit MQ service on the Mesos cluster.

    The rabbitmq service will be used for messaging between applications in the stream. There is a sample application JSON file for Rabbit MQ in the spring-cloud-dataflow-server-mesos repository that you can use as a starting point. The service discovery mechanism is currently disabled so you need to look up the host and port to use for the connection. Depending on how large your cluster is, you may want to tweak the CPU and/or memory values.

    Using the above JSON file and an Mesos and Marathon cluster installed you can deploy a Rabbit MQ service instance by issuing the following command

    curl -X POST http://m1.dcos/service/marathon/v2/apps -d @rabbitmq.json -H "Content-type: application/json"
  4. Create a Redis service on the Mesos cluster.

    The redis service will be used for counters as part of the analytics. There is a sample application JSON file for Redis in the spring-cloud-dataflow-server-mesos repository that you can use as a starting point. The service discovery mechanism is currently disabled so you need to look up the host and port to use for the connection. Depending on how large your cluster is, you may want to tweak the CPU and/or memory values.

    Using the above JSON file and an Mesos and Marathon cluster installed you can deploy a Redis service instance by issuing the following command

    curl -X POST http://m1.dcos/service/marathon/v2/apps -d @redis.json -H "Content-type: application/json"

    Using the Marathon and Mesos UIs you can verify that mysql, rabbitmq and redis services are running on the cluster.

  5. Install Chronos on the Mesos cluster.

    The Chronos service will be used for running task. If you haven’t already installed Chronos, this is the time to do it. You can install Chronos using the DC/OS UI (under the Universe section) or from the dcos command line:

    dcos package install chronos
  6. Download the Marathon application JSON for the Spring Cloud Data Flow Server.

    Use the following command to download the Marathon application JSON file used to deploy Spring CLoud Data Flow Server for Mesos.

    $ wget https://raw.githubusercontent.com/spring-cloud/spring-cloud-dataflow-server-mesos/v1.0.0.RELEASE/src/etc/marathon/scdf-server.json

    We will need to modify the Docker image tag, API endpoints and host and port settings based on the current deployment environment. We hope to eliminiate most of this in future releases and instead rely on service discovery mechanisms. For now we do have to make the modifications manually. The downloaded file should look like this:

    {
      "id": "/spring-cloud-data-flow",
      "cpus": 0.5,
      "mem": 512.0,
      "instances": 1,
      "container": {
        "type": "DOCKER",
        "docker": {
          "image": "springcloud/spring-cloud-dataflow-server-mesos",
          "network": "BRIDGE",
          "portMappings": [
            { "containerPort": 9393 }
          ]
        }
      },
      "env": {
        "MESOS_MARATHON_URI": "http://m1.dcos/service/marathon",
        "MESOS_CHRONOS_URI": "http://m1.dcos/service/chronos",
        "JDBC_URL": "jdbc:mysql://192.168.65.111:5769/test",
        "JDBC_DRIVER": "org.mariadb.jdbc.Driver",
        "JDBC_USERNAME": "spring",
        "JDBC_PASSWORD": "secret",
        "RABBITMQ_HOST": "192.168.65.121",
        "RABBITMQ_PORT": "5261",
        "REDIS_HOST": "192.168.65.111",
        "REDIS_PORT": "19902",
        "SPRING_APPLICATION_JSON": "{\"spring.cloud.deployer.mesos.marathon.apiEndpoint\":\"${MESOS_MARATHON_URI}\",\"spring.cloud.deployer.mesos.chronos.apiEndpoint\":\"${MESOS_CHRONOS_URI}\",\"spring.datasource.url\":\"${JDBC_URL}\",\"spring.datasource.driverClassName\":\"${JDBC_DRIVER}\",\"spring.datasource.username\":\"${JDBC_USERNAME}\",\"spring.datasource.password\":\"${JDBC_PASSWORD}\",\"spring.datasource.testOnBorrow\":true,\"spring.datasource.validationQuery\":\"SELECT 1\",\"spring.redis.host\":\"${REDIS_HOST}\",\"spring.redis.port\":\"${REDIS_PORT}\",\"spring.cloud.deployer.mesos.marathon.environmentVariables\":\"SPRING_RABBITMQ_HOST=${RABBITMQ_HOST},SPRING_RABBITMQ_PORT=${RABBITMQ_PORT}\"}"
      },
      "healthChecks": [
        {
          "path": "/management/health",
          "portIndex": 0,
          "protocol": "HTTP",
          "ignoreHttp1xx": false,
          "gracePeriodSeconds": 120,
          "intervalSeconds": 60,
          "timeoutSeconds": 20,
          "maxConsecutiveFailures": 0
        }
      ]
    }

    First we need to modify the Docker image to use the tag 1.0.0.RELEASE. It should be:

        "image": "springcloud/spring-cloud-dataflow-server-mesos:1.0.0.RELEASE",

    In the env section there are several environment variables that we need to adjust. We need to provide the following properties for accessing Marathon and Chronos APIs:

        "MESOS_MARATHON_URI": "http://m1.dcos/service/marathon",
        "MESOS_CHRONOS_URI": "http://m1.dcos/service/chronos",

    Here we did set them to the defaults for a local Vagrant DC/OS installation.

     

    We also need to provide the database configuration properties. Look up the database host and port from the Marathon UI. For the mysql service that we just installed, they were 192.168.65.111:5769.

        "JDBC_URL": "jdbc:mysql://192.168.65.111:5769/test",
        "JDBC_DRIVER": "org.mariadb.jdbc.Driver",
        "JDBC_USERNAME": "spring",
        "JDBC_PASSWORD": "secret",

    Next, we need to provide the message bus configuration properties. Look up the host and port for the rabbitmq service. In our case they were 192.168.65.121:5261.

        "RABBITMQ_HOST": "192.168.65.121",
        "RABBITMQ_PORT": "5261",

    Finally we need to do the same for the key value store properties. Look up the host and port for the redis service. In our case they were 192.168.65.111:19902.

        "REDIS_HOST": "192.168.65.111",
        "REDIS_PORT": "19902",

    You can add properties to the SPRING_APPLICATION_JSON property as well. You might want to set default values for memory and cpu resource request. For example \"spring.cloud.deployer.mesos.marathon.memory\"=\"768\" will by default allocate additional memory for the application vs. the default value of 512. You can see all the available options in the MarathonAppDeployerProperties.java file.

    [Note]Note

    DC/OS in secured mode requires an Authorization header with a token when accessing the Marathon and Chronos REST end-points. To accommodate this you need to provide this token when deploying the Spring Cloud Data Flow server to a DC/OS secured cluster. See below for instructions.

    If you are using a secured DC/OS cluster then you will need to add the authorization token to the above configuration. First, log in using dcos auth login command and enter the authentication token you get after authenticating with DC/OS web interface with the provided link. Next, run the dcos config show core.dcos_acs_token command to display the authorization token we need for our configuration. Copy and paste this token in the following environment variable that you add to the above application JSON:

        "DCOS_TOKEN": "<paste the token here>",

    We also need to add the spring.cloud.deployer.mesos.dcos.authorizationToken property to the SPRING_APPLICATION_JSON entry. Insert the following as another property entry:

    ,\"spring.cloud.deployer.mesos.dcos.authorizationToken\":\"${DCOS_TOKEN}\"
  7. Now, deploy the Spring Cloud Data Flow Server for Mesos and Marathon/Chronos using the above modified application JSON.

    curl -X POST http://m1.dcos/service/marathon/v2/apps -d @scdf-server.json -H "Content-type: application/json"

    Verify that the spring-cloud-data-flow application is running before proceeding.

  8. Download and run the Spring Cloud Data Flow shell.

    $ wget http://repo.spring.io/release/org/springframework/cloud/spring-cloud-dataflow-shell/1.0.1.RELEASE/spring-cloud-dataflow-shell-1.0.1.RELEASE.jar
    
    $ java -jar spring-cloud-dataflow-shell-1.0.1.RELEASE.jar

    Lookup the host and port for the spring-cloud-data-flow application in the Marathon UI. Use those values to configure the server URI for the shell:

    dataflow:>dataflow config server --uri http://192.168.65.111:20043
  9. By default, the application registry will be empty. If you would like to register all out-of-the-box stream applications built with the RabbitMQ binder in bulk, you can with the following command. For more details, review how to register applications.

    dataflow:>app import --uri http://bit.ly/stream-applications-rabbit-docker
  10. Deploy a simple stream in the shell

    [Note]Note

    If you need to specify any of the app specific configuration properties then you must use "long-form" of them including the app specific prefix like --jdbc.tableName=TEST_DATA. This is due to the server not being able to access the metadata for the Docker based starter apps. You will also not see the configuration properties listed when using the app info command or in the Dashboard GUI.

    dataflow:>stream create --name ticktock --definition "time | log" --deploy

    In the Mesos UI you can then look at the logs for the log sink. Look for a Mesos task with the name log-0.log.ticktock.

    2016-04-26 18:13:03.001  INFO 1 --- [           main] s.b.c.e.t.TomcatEmbeddedServletContainer : Tomcat started on port(s): 8080 (http)
    2016-04-26 18:13:03.004  INFO 1 --- [           main] o.s.c.s.a.l.s.r.LogSinkRabbitApplication : Started LogSinkRabbitApplication in 7.766 seconds (JVM running for 8.24)
    2016-04-26 18:13:54.443  INFO 1 --- [nio-8080-exec-1] o.a.c.c.C.[Tomcat].[localhost].[/]       : Initializing Spring FrameworkServlet 'dispatcherServlet'
    2016-04-26 18:13:54.445  INFO 1 --- [nio-8080-exec-1] o.s.web.servlet.DispatcherServlet        : FrameworkServlet 'dispatcherServlet': initialization started
    2016-04-26 18:13:54.459  INFO 1 --- [nio-8080-exec-1] o.s.web.servlet.DispatcherServlet        : FrameworkServlet 'dispatcherServlet': initialization completed in 14 ms
    2016-04-26 18:14:09.088  INFO 1 --- [time.ticktock-1] log.sink                                 : 04/26/16 18:14:09
    2016-04-26 18:14:10.077  INFO 1 --- [time.ticktock-1] log.sink                                 : 04/26/16 18:14:10
    2016-04-26 18:14:11.080  INFO 1 --- [time.ticktock-1] log.sink                                 : 04/26/16 18:14:11
    2016-04-26 18:14:12.083  INFO 1 --- [time.ticktock-1] log.sink                                 : 04/26/16 18:14:12
    2016-04-26 18:14:13.090  INFO 1 --- [time.ticktock-1] log.sink                                 : 04/26/16 18:14:13
    2016-04-26 18:14:14.091  INFO 1 --- [time.ticktock-1] log.sink                                 : 04/26/16 18:14:14
    2016-04-26 18:14:15.093  INFO 1 --- [time.ticktock-1] log.sink                                 : 04/26/16 18:14:15
    2016-04-26 18:14:16.095  INFO 1 --- [time.ticktock-1] log.sink                                 : 04/26/16 18:14:16
  11. Destroy the stream

    dataflow:>stream destroy --name ticktock
  12. Register a task application using the shell

    dataflow:>app register --name timestamp --type task --uri docker:springcloudtask/timestamp-task:latest
  13. Create and launch the task using the shell

    dataflow:>task create testtask --definition "timestamp"
    dataflow:>task launch testtask

    In the Mesos UI you can then look at the logs for the testtask task. Look for a Mesos task with the name ChronosTask:testtask.

    Starting task ct:1472062219364:0:testtask:
    
      .   ____          _            __ _ _
     /\\ / ___'_ __ _ _(_)_ __  __ _ \ \ \ \
    ( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
     \\/  ___)| |_)| | | | | || (_| |  ) ) ) )
      '  |____| .__|_| |_|_| |_\__, | / / / /
     =========|_|==============|___/=/_/_/_/
     :: Spring Boot ::        (v1.3.5.RELEASE)
    
    2016-08-24 18:10:45.957  INFO 1 --- [           main] o.s.c.t.a.t.TimestampTaskApplication     : Starting TimestampTaskApplication v1.0.2.BUILD-SNAPSHOT on a2.dcos with PID 1 (/maven/timestamp-task.jar started by root in /)
    2016-08-24 18:10:45.960  INFO 1 --- [           main] o.s.c.t.a.t.TimestampTaskApplication     : No active profile set, falling back to default profiles: default
    2016-08-24 18:10:46.003  INFO 1 --- [           main] s.c.a.AnnotationConfigApplicationContext : Refreshing org.spring[email protected]788c6159: startup date [Wed Aug 24 18:10:46 GMT 2016]; root of context hierarchy
    2016-08-24 18:10:47.051  INFO 1 --- [           main] o.s.jdbc.datasource.init.ScriptUtils     : Executing SQL script from class path resource [org/springframework/cloud/task/schema-mysql.sql]
    2016-08-24 18:10:47.062  INFO 1 --- [           main] o.s.jdbc.datasource.init.ScriptUtils     : Executed SQL script from class path resource [org/springframework/cloud/task/schema-mysql.sql] in 11 ms.
    2016-08-24 18:10:47.207  INFO 1 --- [           main] o.s.j.e.a.AnnotationMBeanExporter        : Registering beans for JMX exposure on startup
    2016-08-24 18:10:47.211  INFO 1 --- [           main] o.s.c.support.DefaultLifecycleProcessor  : Starting beans in phase 0
    2016-08-24 18:10:47.238  INFO 1 --- [           main] TimestampTaskConfiguration$TimestampTask : 2016-08-24 18:10:47.238
    2016-08-24 18:10:47.249  INFO 1 --- [           main] s.c.a.AnnotationConfigApplicationContext : Closing org.spring[email protected]788c6159: startup date [Wed Aug 24 18:10:46 GMT 2016]; root of context hierarchy
    2016-08-24 18:10:47.250  INFO 1 --- [           main] o.s.c.support.DefaultLifecycleProcessor  : Stopping beans in phase 0
    2016-08-24 18:10:47.252  INFO 1 --- [           main] o.s.j.e.a.AnnotationMBeanExporter        : Unregistering JMX-exposed beans on shutdown
    2016-08-24 18:10:47.261  INFO 1 --- [           main] o.s.c.t.a.t.TimestampTaskApplication     : Started TimestampTaskApplication in 1.62 seconds (JVM running for 2.018)```
  14. Destroy the task

    dataflow:>task destroy --name testtask