Apache Hive supports three execution engines: MapReduce, Tez, and Spark. Though Hivemall does not support Spark natively, the Hivemall for Spark project (https://github.com/maropu/hivemall-spark) implements a wrapper that enables you to use Hivemall UDFs with SparkContext, DataFrames, or Spark Streaming. Getting started with Hivemall for Spark is straightforward. Follow this procedure to start a Scala shell, load the UDFs, and execute SQL queries:
1. Download the define-udfs script:
[cloudera@quickstart ~]$ wget https://raw.githubusercontent.com/maropu/hivemall-spark/master/scripts/ddl/define-udfs.sh --no-check-certificate
2. Start the Spark shell with the hivemall-spark package pulled in via the --packages option:
[cloudera@quickstart ~]$ spark-1.6.0-bin-hadoop2.6/bin/spark-shell --master local[*] --packages maropu:hivemall-spark:0.0.6
3. Load the UDFs into the shell:
scala> :load define-udfs.sh
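Once the UDFs are loaded, Hivemall functions can be called from Spark SQL like any built-in function. The following is a minimal sketch, assuming define-udfs.sh has registered hivemall_version() and add_bias() (it registers most standard Hivemall functions; check the script for the exact list):
scala> sqlContext.sql("SELECT hivemall_version()").show()
scala> // add_bias() appends the constant bias feature to a feature vector
scala> sqlContext.sql("SELECT add_bias(array('1:0.5', '2:1.0')) AS features").show(false)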
More tutorials are available at https://github.com/maropu/hivemall-spark/tree/master/tutorials.