Summary

In this chapter, we saw how to develop a machine learning (ML) project using H2O on a bank marketing dataset for predictive analytics. We were able to predict whether a client would subscribe to a term deposit with an accuracy of 80%. Furthermore, we saw how to tune typical neural network hyperparameters. Because this is a small-scale dataset, a final suggestion for improvement is to try Spark-based random forests, decision trees, or gradient-boosted trees, which may give better accuracy.

In the next chapter, we will use a dataset of 284,807 instances of credit card use, where only 0.172% of transactions are fraudulent; that is, the data is highly unbalanced. It therefore makes sense to use autoencoders to pretrain a classification model and to apply anomaly detection to predict possibly fraudulent transactions, since we expect our fraud cases to be anomalies within the whole dataset.
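To see why such an imbalance calls for a different approach, consider the following minimal sketch. It uses only the two figures quoted above; the "always predict not fraud" baseline and all variable names are illustrative assumptions, not part of the project code.

```python
# Figures quoted in the text; everything else here is illustrative.
total_transactions = 284_807
fraud_rate = 0.172 / 100  # 0.172% expressed as a fraction

# Approximate number of fraudulent transactions implied by that rate.
approx_fraud_count = round(total_transactions * fraud_rate)

# Hypothetical baseline: a model that labels every transaction "not fraud".
# It is right on every legitimate transaction and wrong on every fraud.
baseline_accuracy = 1 - fraud_rate
baseline_fraud_recall = 0.0  # it never flags a single fraud case

print(f"approximate fraud cases: {approx_fraud_count}")
print(f"baseline accuracy: {baseline_accuracy:.3%}")
print(f"baseline fraud recall: {baseline_fraud_recall:.0%}")
```

A model that never flags fraud would still score very high on accuracy, so accuracy alone tells us little on this kind of data. This is one reason to treat the fraud cases as anomalies rather than as just another class.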
