In the last two chapters, you saw some interesting problems revolving around the retail and e-commerce domains. You now know how to detect and predict shopping trends from shopping patterns, as well as how to build recommendation systems. As you may remember from Chapter 1, Getting started with R and Machine Learning, the applications of machine learning are diverse, and we can apply the same concepts and techniques to solve a wide variety of problems in the real world. We will be tackling a completely new problem here, but hold on to what you have learnt, because several concepts from the previous chapters will come in handy soon!
In the next couple of chapters, we will be tackling a new problem related to the financial domain. Using some previously collected data, we will look at the customers of a particular German bank who could be credit risks for the bank. We will perform descriptive and exploratory analysis on this data to highlight different potential features in the dataset and to examine their relationship with credit risk. In the next step, we will build predictive models, using machine learning algorithms and these data features, to detect and predict customers who could be potential credit risks. You may remember that the two main things we need for this kind of analysis remain unchanged: data and algorithms.
You might be surprised to know that risk analysis is one of the foremost focus areas of financial organizations, including banks, investment firms, insurance firms, and brokerage firms. These organizations often have dedicated teams for solving problems revolving around risk analysis. Some examples of risks that are frequently analyzed include credit risk, sales risk, and fraud-related risks.
In this chapter, we will be focusing on the following topics:
Always remember that domain knowledge is essential before solving any machine learning problem. Without it, we will end up applying algorithms and techniques blindly, which may not give the right results.
Before we start tackling our next challenge, it will be useful to get an idea of the different types of analytics which broadly encompass the data science domain. We use a variety of data mining and machine learning techniques to solve different data problems. However, depending on the mechanism of the technique and its end result, we can broadly classify analytics into four different types which are explained next:
Most organizations do a lot of descriptive analytics and some amount of predictive analytics. Prescriptive analytics, however, is really difficult to implement because of ever-changing business conditions and data streams and the problems associated with them, the most common being data sanitization issues. We will be touching upon descriptive analytics in this chapter before moving on to predictive analytics in the next chapter to solve our problem related to credit risk analytics.