Preface

Scala is a popular language for data science. By emphasizing immutability and functional constructs, Scala lends itself well to the construction of robust libraries for concurrency and big data analysis. A rich ecosystem of tools for data science has therefore developed around Scala, including libraries for accessing SQL and NoSQL databases, frameworks for building distributed applications like Apache Spark and libraries for linear algebra and numerical algorithms. We will explore this rich and growing ecosystem in this learning path.

What this learning path covers

Module 1, Scala for Data Science, will introduce you to the libraries for ingesting, storing, manipulating, processing, and visualizing data in Scala. Packed with real-world examples and interesting data sets, this module will teach you to ingest data from flat files and web APIs and store it in a SQL or NoSQL database. It will show you how to design scalable architectures to process and modeling your data, starting from simple concurrency constructs such as parallel collections and futures, through to actor systems and Apache Spark. As well as Scala's emphasis on functional structures and immutability, you will learn how to use the right parallel construct for the job at hand, minimizing development time without compromising scalability. Finally, you will learn how to build beautiful interactive visualizations using web frameworks. This module gives tutorials on some of the most common Scala libraries for data science, allowing you to quickly get up to speed with building data science and data engineering solutions.

Module 2, Scala Data Analysis Cookbook, will introduce you to the most popular Scala tools, libraries, and frameworks through practical recipes around loading, manipulating, and preparing your data. It will also help you explore and make sense of your data using stunning and insightful visualizations, and machine learning toolkits.Starting with introductory recipes on utilizing the Breeze and Spark libraries, get to grips with how to import data from a host of possible sources and how to pre-process numerical, string, and date data. Next, you'll get an understanding of concepts that will help you visualize data using the Apache Zeppelin and Bokeh bindings in Scala, enabling exploratory data analysis. Discover how to program quintessential machine learning algorithms using Spark ML library. Work through steps to scale your machine learning models and deploy them into a standalone cluster, EC2, YARN, and Mesos. Finally dip into the powerful options presented by Spark Streaming, and machine learning for streaming data, as well as utilizing Spark GraphX.

Module 3, Scala for Machine Learning, will introduce you to the functional capabilities of the Scala programming language that are critical to the creation of machine learning algorithms such as dependency injection and implicits.Your learning journey starts with data pre-processing and filtering techniques, then move on to clustering and dimension reduction, Naïve Bayes, regression models, sequential data, regularization and kernelization, support vector machines, Neural networks, generic algorithms and re-enforcement learning. The review of the Akka framework and Apache Spark clusters concludes the tutorial. Techniques throughout the module is applied to the analysis, recommendation, classification, and prediction of financial markets.

This module will guide you through the process of building AI applications with diagrams, formal mathematical notation, source code snippets and useful tips.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset