Chapter 2. Prerequisites

Before installing Spark, make sure your cluster meets the following prerequisites.


Table 2.1. Prerequisites for Running Spark 1.5.2

Prerequisite                Description
--------------------------  --------------------------------------------------------
HDP Cluster Stack Version   2.3.4 or later
(Optional) Ambari Version   2.2 or later
Software dependencies       • Spark requires HDFS and YARN.
                            • PySpark requires Python to be installed on all nodes
                              (see the first sketch after this table).
                            • SparkR (technical preview) requires R to be installed
                              on all nodes.
                            • (Optional) For optimal performance with MLlib, consider
                              installing the netlib-java library (see the second
                              sketch after this table).
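
Because PySpark requires Python on every node, a quick way to spot-check the
cluster is a small PySpark job that asks each executor to report its hostname
and Python version. The sketch below is illustrative rather than part of the
product documentation: the file name, application name, and partition count are
arbitrary, and on YARN executors are not guaranteed to land on every node, so
treat the result as a spot check.

    # check_python_versions.py -- hypothetical file name.
    # Spot check: ask each executor for its hostname and Python version.
    import platform
    import socket

    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setAppName("python-version-check")  # name is arbitrary
    sc = SparkContext(conf=conf)

    def report(_):
        # Runs on an executor; yields one (hostname, version) pair.
        return [(socket.gethostname(), platform.python_version())]

    # Using more partitions than nodes makes it likely every node runs a task.
    pairs = sc.parallelize(range(100), 100) \
              .mapPartitions(report).distinct().collect()

    for host, version in sorted(pairs):
        print("%s: Python %s" % (host, version))

    sc.stop()

Submit it with spark-submit --master yarn-client check_python_versions.py; a
node that is missing Python typically shows up as task failures in the YARN
application logs.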


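For the netlib-java item, you can check which BLAS implementation MLlib
actually resolved. Spark 1.5 obtains BLAS through
com.github.fommil.netlib.BLAS, and PySpark exposes the driver JVM through the
py4j gateway sc._jvm. The following is a minimal sketch; sc._jvm is an internal
handle rather than a public API, so treat it purely as a diagnostic:

    # Ask the driver JVM which netlib-java BLAS implementation was loaded.
    from pyspark import SparkContext

    sc = SparkContext(appName="blas-check")  # name is arbitrary
    blas = sc._jvm.com.github.fommil.netlib.BLAS.getInstance()
    print(blas.getClass().getName())
    # ...F2jBLAS          -> pure-Java fallback (no native libraries found)
    # ...NativeSystemBLAS -> system BLAS (e.g. OpenBLAS or ATLAS) in use
    # ...NativeRefBLAS    -> netlib-java's bundled reference natives in use
    sc.stop()

A name ending in F2jBLAS means the native bindings were not found and MLlib is
using the pure-Java fallback; switching to a native implementation requires the
netlib-java native libraries on each node and a Spark build that includes
native BLAS support (the netlib-lgpl profile).
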
Note

When you upgrade to HDP 2.3.4, Spark is automatically upgraded to 1.5.2. If you wish to use a previous version of Spark, follow the Spark Manual Downgrade procedure in the Release Notes.