Karthik Ramasubramanian and Abhishek Singh

Machine Learning Using R

Karthik Ramasubramanian

New Delhi, Delhi, India

Abhishek Singh

New Delhi, Delhi, India

Any source code or other supplementary materials referenced by the author in this text are available to readers at www.apress.com. For detailed information about how to locate your book’s source code, go to www.apress.com/source-code/.

ISBN 978-1-4842-2333-8

e-ISBN 978-1-4842-2334-5

DOI 10.1007/978-1-4842-2334-5

Library of Congress Control Number: 2016961515

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

Trademarked names, logos, and images may appear in this book. Rather than use a trademark symbol with every occurrence of a trademarked name, logo, or image we use the names, logos, and images only in an editorial fashion and to the benefit of the trademark owner, with no intention of infringement of the trademark. The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

Printed on acid-free paper

Distributed to the book trade worldwide by Springer Science+Business Media New York, 233 Spring Street, 6th Floor, New York, NY 10013. Phone 1-800-SPRINGER, fax (201) 348-4505, e-mail [email protected], or visit www.springer.com. Apress Media, LLC is a California LLC and the sole member (owner) is Springer Science + Business Media Finance Inc (SSBM Finance Inc). SSBM Finance Inc is a Delaware corporation.

To our parents for being the guiding light and a strong pillar of support.

And to our decade-long friendship.

Acknowledgments

We are grateful to our teachers, open source communities, and colleagues for giving us the knowledge and confidence to bring out the first edition of this book. The knowledge in this book is an accumulation of years of research work and professional experience gained at our alma mater and in industry. We are grateful to Prof R. Nadarajan and Prof R. Anitha, Department of Applied Mathematics and Computational Sciences, PSG College of Technology, Coimbatore, for their continuous support and encouragement for our efforts in the machine learning field.

In the rapidly changing world, the field of machine learning is evolving very fast and most of the latest developments are driven by the open source platform. We thank all the developers and contributors across the globe who are freely sharing their knowledge about these platforms. We also want to thank our colleagues from Snapdeal, Deloitte, and our current organizations—Hike and Prudential—for providing opportunities to experiment and create cutting-edge data science solutions.

Karthik especially would like to thank his father, Mr. S Ramasubramanian, for always being a source of inspiration in his life. He is immensely thankful to his supervisor, Mr. Nikhil Dwarakanath, director of the data science team at Snapdeal, for creating the opportunities to bring about the best analytics professional in him and providing the motivation to take up challenging projects.

Abhishek would like to thank his father, Mr. Charan Singh, a senior scientist in the India Meteorological Department, for introducing him to the power of data in weather forecasting in his formative years. On the personal front, Abhishek would like to thank his mother Jaya, sister Asweta, and brother Avilash for their continuous moral support.

We want to thank our publisher Apress, specifically Celestine for providing us with this opportunity, Sanchita and Prachi for managing this project, Poonam and Piyush for their reviews, and everybody involved in the production team.

—Karthik Ramasubramanian
—Abhishek Singh

Contents at a Glance

About the Authors
About the Technical Reviewer
Acknowledgments
Chapter 1: Introduction to Machine Learning and R
Chapter 2: Data Preparation and Exploration
Chapter 3: Sampling and Resampling Techniques
Chapter 4: Data Visualization in R
Chapter 5: Feature Engineering
Chapter 6: Machine Learning Theory and Practices
Chapter 7: Machine Learning Model Evaluation
Chapter 8: Model Performance Improvement
Chapter 9: Scalable Machine Learning and Related Technologies
Index

About the Authors and About the Technical Reviewer

About the Authors

Karthik Ramasubramanian works for one of the largest and fastest growing technology unicorns in India, Hike Messenger. He brings the best of business analytics and data science experience to his role at Hike Messenger. In his seven years of research and industry experience, he has worked on cross-industry data science problems in retail, e-commerce, and technology, developing and prototyping data-driven solutions. In his previous role at Snapdeal, one of the largest e-commerce retailers in India, he led core statistical modeling initiatives for customer growth and pricing analytics. Prior to Snapdeal, he was part of the central database team, managing the data warehouses for global business applications of Reckitt Benckiser (RB). He has vast experience working with scalable machine learning solutions for industry, including sophisticated graph networks and self-learning neural networks. He has a Master’s in Theoretical Computer Science from PSG College of Technology, Anna University, and is a certified big data professional. He is passionate about teaching and mentoring future data scientists through different online and public forums. He enjoys writing poems in his leisure time and is an avid traveler.

Abhishek Singh is a data scientist in the advanced data science team of Prudential Financial Inc., the second largest life insurance provider in the United States, and is based out of Ireland. He has five years of professional and academic experience in data science, spanning consulting, teaching, and financial services. At Deloitte Advisory, he led risk analytics initiatives for top U.S. banks in their regulatory risk, credit risk, and balance sheet modeling requirements. In his current role, he is working on scalable machine learning algorithms for the individual life insurance business of Prudential. He has working experience in time series models and has worked with cross-functional teams to implement data science solutions in enterprise infrastructure. He has been an active trainer at Deloitte Professional University and led training and development initiatives for professionals in the areas of statistics, economics, financial risk, and data science tools (SAS and R). He has a B.Tech. in mathematics and computing from the Indian Institute of Technology, Guwahati, and an MBA from the Indian Institute of Management, Bangalore. He speaks at public events on data science and works with leading universities toward bringing data science skills to graduates. He has a keen interest in law and holds a Post Graduate Diploma in Cyber Law from NALSAR University. He enjoys cooking and photography in his free time.

About the Technical Reviewer

Jojo Moolayil is a data scientist and the author of the book Smarter Decisions – The Intersection of Internet of Things and Decision Science. With over four years of industrial experience in data science, decision science, and IoT, he has worked with industry leaders on high-impact and critical projects across multiple verticals. He is currently associated with General Electric, the pioneer and leader in data science for industrial IoT, and lives in Bengaluru, the Silicon Valley of India.

He was born and raised in Pune, India, and graduated from the University of Pune with a major in information technology engineering. He started his career with Mu Sigma Inc., the world’s largest pure-play analytics provider, and worked with the leaders of many Fortune 50 clients. One of the early enthusiasts to venture into IoT analytics, he brought problem-solving frameworks and his knowledge of data and decision science to the field.

To cement his foundation in data science for industrial IoT and scale the impact of the problem-solving experiments, he joined a fast-growing IoT analytics startup called Flutura, based in Bangalore and headquartered in the valley. After a short stint with Flutura, Jojo moved on to work with the leaders of industrial IoT—General Electric, in Bangalore, where he focused on solving decision science problems for industrial IoT use cases. As a part of his role at GE, Jojo also focuses on developing data science and decision science products and platforms for industrial IoT.

Apart from authoring books on decision science and IoT, Jojo has also been a technical reviewer for books on machine learning and business analytics with Apress. He is an active data science tutor and also maintains a blog at http://www.jojomoolayil.com/web/blog/.

Profile: http://www.jojomoolayil.com/

https://www.linkedin.com/in/jojo62000

“I would like to thank my family, friends, and mentors for their kind support and constant motivation throughout my life.”

—Jojo Moolayil
