Chapter 7. Accessing Hive Tables from Spark

This chapter describes how to access Hive data from Spark.

Spark SQL is a Spark module for structured data processing. It supports Hive data formats, user-defined functions (UDFs), and the Hive metastore, and can act as a distributed SQL query engine. You can also use Spark SQL to incorporate Hive table data into DataFrames (see "Using the Spark DataFrame API").
"Hive on Spark" is the reverse arrangement: it enables Hive to run on Spark, with Spark operating as the execution backend for Hive queries.
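
As an illustration of the Spark SQL path (Spark reading Hive tables, as opposed to Hive on Spark), the following minimal PySpark sketch loads a Hive table into a DataFrame. It assumes Spark 2.x or later, where SparkSession.builder.enableHiveSupport() is available, and a Spark installation configured to reach the Hive metastore. The helper names build_session and top_rows and the table name default.sample_07 are hypothetical and serve only as placeholders.

```python
def build_session(app_name="hive-access-example"):
    """Create (or reuse) a SparkSession with Hive support enabled.

    Assumes Spark 2.x or later and a configured Hive metastore.
    """
    # Imported inside the function so that top_rows() below can be
    # read and exercised without a Spark installation.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName(app_name)
        .enableHiveSupport()  # lets Spark SQL use the Hive metastore
        .getOrCreate()
    )


def top_rows(spark, table_name, limit=10):
    """Return a DataFrame over the named table, capped at `limit` rows."""
    # spark.table() resolves the name through the session catalog,
    # which is the Hive metastore when Hive support is enabled.
    return spark.table(table_name).limit(limit)


def main():
    spark = build_session()
    # "default.sample_07" is a placeholder; substitute a table that
    # exists in your metastore.
    top_rows(spark, "default.sample_07").show()
    spark.stop()
```

Calling main() from a script submitted with spark-submit, or calling top_rows from a pyspark shell, should display the first rows of the table, provided the table exists. The same data could also be queried with spark.sql("SELECT ..."); spark.table is used here only because it keeps the sketch short.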