Title Page
Copyright and Credits
    Advanced Machine Learning with R
About Packt
    Why subscribe?
    Packt.com
Contributors
    About the authors
    Packt is searching for authors like you
Preface
    Who this book is for
    What this book covers
    To get the most out of this book
    Download the example code files
    Conventions used
    Get in touch
    Reviews
Preparing and Understanding Data
    Overview
    Reading the data
    Handling duplicate observations
    Descriptive statistics
    Exploring categorical variables
    Handling missing values
    Zero and near-zero variance features
    Treating the data
    Correlation and linearity
    Summary
Linear Regression
    Univariate linear regression
    Building a univariate model
    Reviewing model assumptions
    Multivariate linear regression
    Loading and preparing the data
    Modeling and evaluation – stepwise regression
    Modeling and evaluation – MARS
    Reverse transformation of natural log predictions
    Summary
Logistic Regression
    Classification methods and linear regression
    Logistic regression
    Model training and evaluation
    Training a logistic regression algorithm
    Weight of evidence and information value
    Feature selection
    Cross-validation and logistic regression
    Multivariate adaptive regression splines
    Model comparison
    Summary
Advanced Feature Selection in Linear Models
    Regularization overview
    Ridge regression
    LASSO
    Elastic net
    Data creation
    Modeling and evaluation
    Ridge regression
    LASSO
    Elastic net
    Summary
K-Nearest Neighbors and Support Vector Machines
    K-nearest neighbors
    Support vector machines
    Manipulating data
    Dataset creation
    Data preparation
    Modeling and evaluation
    KNN modeling
    Support vector machine
    Summary
Tree-Based Classification
    An overview of the techniques
    Understanding a regression tree
    Classification trees
    Random forest
    Gradient boosting
    Datasets and modeling
    Classification tree
    Random forest
    Extreme gradient boosting – classification
    Feature selection with random forests
    Summary
Neural Networks and Deep Learning
    Introduction to neural networks
    Deep learning – a not-so-deep overview
    Deep learning resources and advanced methods
    Creating a simple neural network
    Data understanding and preparation
    Modeling and evaluation
    An example of deep learning
    Keras and TensorFlow background
    Loading the data
    Creating the model function
    Model training
    Summary
Creating Ensembles and Multiclass Methods
    Ensembles
    Data understanding
    Modeling and evaluation
    Random forest model
    Creating an ensemble
    Summary
Cluster Analysis
    Hierarchical clustering
    Distance calculations
    K-means clustering
    Gower and PAM
    Gower
    PAM
    Random forest
    Dataset background
    Data understanding and preparation
    Modeling
    Hierarchical clustering
    K-means clustering
    Gower and PAM
    Random forest and PAM
    Summary
Principal Component Analysis
    An overview of the principal components
    Rotation
    Data
    Data loading and review
    Training and testing datasets
    PCA modeling
    Component extraction
    Orthogonal rotation and interpretation
    Creating scores from the components
    Regression with MARS
    Test data evaluation
    Summary
Association Analysis
    An overview of association analysis
    Creating transactional data
    Data understanding
    Data preparation
    Modeling and evaluation
    Summary
Time Series and Causality
    Univariate time series analysis
    Understanding Granger causality
    Time series data
    Data exploration
    Modeling and evaluation
    Univariate time series forecasting
    Examining the causality
    Linear regression
    Vector autoregression
    Summary
Text Mining
    Text mining framework and methods
    Topic models
    Other quantitative analysis
    Data overview
    Data frame creation
    Word frequency
    Word frequency in all addresses
    Lincoln's word frequency
    Sentiment analysis
    N-grams
    Topic models
    Classifying text
    Data preparation
    LASSO model
    Additional quantitative analysis
    Summary
Exploring the Machine Learning Landscape
    ML versus software engineering
    Types of ML methods
    Supervised learning
    Unsupervised learning
    Semi-supervised learning
    Reinforcement learning
    Transfer learning
    ML terminology – a quick review
    Deep learning
    Big data
    Natural language processing
    Computer vision
    Cost function
    Model accuracy
    Confusion matrix
    Predictor variables
    Response variable
    Dimensionality reduction
    Class imbalance problem
    Model bias and variance
    Underfitting and overfitting
    Data preprocessing
    Holdout sample
    Hyperparameter tuning
    Performance metrics
    Feature engineering
    Model interpretability
    ML project pipeline
    Business understanding
    Understanding and sourcing the data
    Preparing the data
    Model building and evaluation
    Model deployment
    Learning paradigm
    Datasets
    Summary
Predicting Employee Attrition Using Ensemble Models
    Philosophy behind ensembling
    Getting started
    Understanding the attrition problem and the dataset
    K-nearest neighbors model for benchmarking the performance
    Bagging
    Bagged classification and regression trees (treeBag) implementation
    Support vector machine bagging (SVMBag) implementation
    Naive Bayes (nbBag) bagging implementation
    Randomization with random forests
    Implementing an attrition prediction model with random forests
    Boosting
    The GBM implementation
    Building attrition prediction model with XGBoost
    Stacking
    Building attrition prediction model with stacking
    Summary
Implementing a Jokes Recommendation Engine
    Fundamental aspects of recommendation engines
    Recommendation engine categories
    Content-based filtering
    Collaborative filtering
    Hybrid filtering
    Getting started
    Understanding the Jokes recommendation problem and the dataset
    Converting the DataFrame
    Dividing the DataFrame
    Building a recommendation system with an item-based collaborative filtering technique
    Building a recommendation system with a user-based collaborative filtering technique
    Building a recommendation system based on an association-rule mining technique
    The Apriori algorithm
    Content-based recommendation engine
    Differentiating between ITCF and content-based recommendations
    Building a hybrid recommendation system for Jokes recommendations
    Summary
    References
Sentiment Analysis of Amazon Reviews with NLP
    The sentiment analysis problem
    Getting started
    Understanding the Amazon reviews dataset
    Building a text sentiment classifier with the BoW approach
    Pros and cons of the BoW approach
    Understanding word embedding
    Building a text sentiment classifier with pretrained word2vec word embedding based on Reuters news corpus
    Building a text sentiment classifier with GloVe word embedding
    Building a text sentiment classifier with fastText
    Summary
Customer Segmentation Using Wholesale Data
    Understanding customer segmentation
    Understanding the wholesale customer dataset and the segmentation problem
    Categories of clustering algorithms
    Identifying the customer segments in wholesale customer data using k-means clustering
    Working mechanics of the k-means algorithm
    Identifying the customer segments in the wholesale customer data using DIANA
    Identifying the customer segments in the wholesale customers data using AGNES
    Summary
Image Recognition Using Deep Neural Networks
    Technical requirements
    Understanding computer vision
    Achieving computer vision with deep learning
    Convolutional Neural Networks
    Layers of CNNs
    Introduction to the MXNet framework
    Understanding the MNIST dataset
    Implementing a deep learning network for handwritten digit recognition
    Implementing dropout to avoid overfitting
    Implementing the LeNet architecture with the MXNet library
    Implementing computer vision with pretrained models
    Summary
Credit Card Fraud Detection Using Autoencoders
    Machine learning in credit card fraud detection
    Autoencoders explained
    Types of AEs based on hidden layers
    Types of AEs based on restrictions
    Applications of AEs
    The credit card fraud dataset
    Building AEs with the H2O library in R
    Autoencoder code implementation for credit card fraud detection
    Summary
Automatic Prose Generation with Recurrent Neural Networks
    Understanding language models
    Exploring recurrent neural networks
    Comparison of feedforward neural networks and RNNs
    Backpropagation through time
    Problems and solutions to gradients in RNN
    Exploding gradients
    Vanishing gradients
    Building an automated prose generator with an RNN
    Implementing the project
    Summary
Winning the Casino Slot Machines with Reinforcement Learning
    Understanding RL
    Comparison of RL with other ML algorithms
    Terminology of RL
    The multi-arm bandit problem
    Strategies for solving MABP
    The epsilon-greedy algorithm
    Boltzmann or softmax exploration
    Decayed epsilon greedy
    The upper confidence bound algorithm
    Thompson sampling
    Multi-arm bandit – real-world use cases
    Solving the MABP with UCB and Thompson sampling algorithms
    Summary
Creating a Package
    Creating a new package
    Summary
Other Books You May Enjoy
    Leave a review - let other readers know what you think