Hivemall is a scalable machine learning library built on top of Apache Hive and Hadoop. It is a collection of machine learning algorithms implemented as Hive User-Defined Functions (UDFs) and User-Defined Table-Generating Functions (UDTFs). Hivemall offers the following benefits:
Follow this procedure to get started:
[cloudera@quickstart ~]$ wget https://github.com/myui/hivemall/releases/download/v0.4.2-rc.2/hivemall-core-0.4.2-rc.2-with-dependencies.jar
hive> add jar hivemall-core-0.4.2-rc.2-with-dependencies.jar;
hive> source define-all.hive;
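To avoid retyping these two statements in every session, they can be kept in an initialization file and passed to the Hive CLI with its `-i` option. The sketch below is one possible arrangement and not part of the original procedure: the file name `hivemall-init.hive` and the shell variable names are assumptions made for illustration, and it presumes that both the jar and `define-all.hive` are already in the directory from which Hive is started.

```shell
#!/bin/sh
# Assumed names: HIVEMALL_JAR matches the jar downloaded in the step above;
# INIT_FILE is a hypothetical file name chosen for this sketch.
HIVEMALL_JAR="hivemall-core-0.4.2-rc.2-with-dependencies.jar"
INIT_FILE="hivemall-init.hive"

# Write the two statements from the procedure into the init file.
cat > "$INIT_FILE" <<EOF
add jar $HIVEMALL_JAR;
source define-all.hive;
EOF

# To use it, start the Hive CLI with the init file on a machine where Hive
# is installed (not executed by this script):
#   hive -i hivemall-init.hive
```

Because the paths in the init file are relative, Hive would need to be launched from the same directory; absolute paths are a reasonable alternative if the files live elsewhere.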