This book focused on the practical side of machine learning. We did not present the thinking behind the algorithms or the theory that justifies them. If you are interested in that aspect of machine learning, then we recommend Pattern Recognition and Machine Learning, C. Bishop , Springer Apply Italics to this. This is a classical introductory text in the field. It will teach you the nitty-gritties of most of the algorithms we used in this book.
If you want to move beyond an introduction and learn all the gory mathematical details, then Machine Learning: A Probabilistic Perspective, K. Murphy, The MIT Press, is an excellent option. It is very recent (published in 2012), and contains the cutting edge of ML research. This 1,100 page book can also serve as a reference, as very little of machine learning has been left out.
The following are the two Q&A websites of machine learning:
As mentioned in the beginning of the book, if you have questions specific to particular parts of the book, feel free to ask them at TwoToReal (http://www.twotoreal.com). We try to be as quick as possible to jump in and help as best as we can.
The following is an obviously non-exhaustive list of blogs that are interesting to someone working on machine learning:
If you want to play around with algorithms, you can obtain many datasets from the Machine Learning Repository at University of California at Irvine (UCI). You can find it at http://archive.ics.uci.edu/ml.
An excellent way to learn more about machine learning is by trying out a competition! Kaggle (http://www.kaggle.com) is a marketplace of ML competitions and has already been mentioned in the introduction. On the website, you will find several different competitions with different structures and often cash prizes.
The supervised learning competitions almost always follow the following format:
Of course, winning something is nice, but you can gain a lot of useful experience just by participating. So, you have to stay tuned, especially after the competition is over and participants start sharing their approaches in the forum. Most of the time, winning is not about developing a new algorithm; it is about cleverly preprocessing, normalizing, and combining the existing methods.