Chapter 5

Automated Decisions and Business Innovation

Contents

Automated Decisions

The purpose of analytics is to find patterns in an organization's data. The reason for finding such patterns is to use them within normal business activity to help maximize the value of business operations. This chapter covers how to embed this approach into the DNA of an organization's management culture. It breaks down the foggy marketing mantra "use analytics to maximize business value" and builds a step-by-step approach that demystifies this goal and makes it achievable.

One school of thought among analytics professionals is to empower superanalysts dubbed data scientists (Davenport, 2012), give them the full power of the analytics models, and let them figure out how to use the model output for maximum business value. The approach emphasized throughout this book is the democratization of analytics, not limited to the handful of problems where data scientists may be available. Data scientists are required to have skills in data and analytics as well as deep insight into the business, which is a very rare combination to find. Therefore, business decisions made with the help of a data scientist are limited to the few functional areas where such skill is available and affordable. The automated decision approach democratizes the use of analytics in business decisions by converting that subjectivity into an objective, well-defined, and transparent set of rules under a decision strategy. How to devise these rules and ensure they continue to deliver value from the analytics models is covered later in this chapter in the section on strategy validation and tuning. Whether this decision strategy rules–based approach is superior to the data scientist approach is addressed in the last section of this chapter on business process innovation.

Decision Strategy

The takeaway from this entire book is actually buried in this chapter: dealing with and building effective decision strategies. Large and sophisticated data storage and processing infrastructure, state-of-the-art algorithms and software packages for analytics, Big Data toolsets: none of that matters if the decision strategy component is not properly designed, managed, and executed. By definition, analytics has a degree of gray area, unlike a report run on historical data that exactly represents what has happened. This gray area requires a certain degree of flexibility to accommodate varying situations, scenarios, and options. Not all of these scenarios can easily be accommodated in a model, since each scenario would then have to have its own model, and not all scenarios will have enough training data available to build a meaningful model. Besides, if the scenarios are not mutually exclusive, the overlapping models with conflicting outputs become a very complex equation to solve. The scenarios are also dictated by changing market conditions, competitors' actions, and even environmental disasters. It is not possible to build models to cater to all of that. Therefore, a model is built on historical data assuming "business as usual," and the dynamic scenarios are then handled with decision strategies specifying what should be done in certain situations given that the business knows what the model says.

Nassim Nicholas Taleb, in his book Fooled by Randomness (2008) and later in The Black Swan (2010), talks about the weakness of models when a surprising event (i.e., a black swan) occurs and renders them useless. Surprise events will occur that are not represented in the historical data used to build a model, but the key is neither to abandon models nor to be overly reliant on them. The trick is to find the balance in decision strategies, to be able to adjust in near–real time during unexpected times and limit the damage from automated decisions. It is not possible to retrain the model with the surprise event factored in, because the recency factor and the limited historical data for the surprise event will probably skew the training, and by the time the model is rebuilt, tested, and validated, the damage is probably done. A decision strategy, on the other hand, can easily be made more conservative or more aggressive depending on the circumstances or business objectives, and it allows for manual intervention instead of resorting to default decisions.

Earlier in the book we talked about the mortgage crisis of 2008 and the fact that risk models had correctly dubbed these risky customers subprime, highlighting their high probability of defaulting on their mortgage loans. The analytics model was correct, but the decision strategy was aggressive and led to a massive crisis. The risk models for the mortgage-backed securities that rated them AAA, however, were themselves the problem: they had not factored in various performance variables that in hindsight we know should have been included. In that scenario, the strategies probably worked fine assuming the risk indicators from the model were accurate. There is no magic wand to build a system and allow it to continue to create value while we sit back, relax, and enjoy. Both the models and the decision strategies have to be carefully designed, executed, monitored, and tuned, almost as a managerial lifestyle of the Big Data era. However, understanding models, their inner workings, and the mathematics of algorithms is far more difficult for a midlevel manager or an executive than managing the decision strategy, which is based on business transactions and operational flow. The input of the analytics model into a business decision has to be managed through the decision strategy, as that is where the real value of analytics comes to fruition. Understanding and managing a decision strategy comes naturally to a business operations manager, as we shall see in the rest of this chapter.

Business Rules in Business Operations

There are business rules in every business operation responsible for carrying out the day-to-day transactions across sales, service, procurement, hiring, etc. These rules make up the decision criteria the business uses to identify and resolve situations in a way that maximizes value for the business. All of these business rules have numerous decision variables (DVs) that need to be defined to complete a transaction. The DVs in the following examples are in double quotes:

■ If you want to offer a discount to a good customer, how to define “Good.”

■ If you don’t want to lend to a customer with poor credit, how to define “Poor.”

■ If you want to reroute a package through a premium service because it may otherwise get delayed, how to define "May" get delayed.

■ If you want to stop a money transfer as a suspicious money laundering transaction, how to define “Suspicious.”

■ If you want to hire the most suitable candidate for a position, how to define “Suitable.”

All business operations deal with these types of questions. The answers are extremely important because they become policy enforced on the employees in the trenches carrying out dozens, if not hundreds, of such transactions in a day. These are called business rules. What to do in a certain situation? How to make a decision? How to maximize the business value of each business decision? These and thousands more are the questions that business rules answer. Business rules are well defined, specifying both the scenario and the response. There are two primary categories of business rules: expert and quantitative.

Expert Business Rules

Expert business rules, as the name suggests, are devised by people who have decades of experience in a certain line of business. Their specialization in a particular area gives them experience in all kinds of situations and scenarios, and they understand very well the impact of business decisions on the bottom line. They become the policy designers, mentors, and go-to people for other employees seeking advice on how to handle a certain business transaction. Most straightforward, business-as-usual rules (80% of normal business) are either baked into the operational systems or handled by employees well trained to identify and address them. It is usually the remaining 20% that really needs expert input. The experts also have some degree of leeway in their decision-making authority and are able to waive fees, offer discounts, approve cases, etc., as long as, in their opinion, it is in the best interest of the business overall.

They usually have their own standards for defining the DVs (Good, Poor, May, Suspicious, and Suitable). These definitions will vary from one expert to another, driven by their education, training, experience, and general approach to their profession. When these experts leave an organization, they leave with a lot of institutional knowledge on decision making in unique scenarios.

Quantitative Business Rules

Quantitative business rules are based on hard numbers and absolute values for decision making. For example, a retailer may define a Good customer as someone who has been a customer for over one year and has purchased at least $300 worth of products or services. A bank may define a money transfer of over $7,500 as suspicious for money laundering if the customer has maintained an average monthly balance of less than $200 in the account. These rules are quantitative because their DVs were arrived at after extensive analysis of historical data, in which these values were found to be the best cutoffs for these business decisions.
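To make the structure of such rules concrete, here is a minimal sketch in Python of the two quantitative rules above. The function and field names are illustrative assumptions; the cutoffs come directly from the examples in the text.

def is_good_customer(tenure_days: int, total_purchases: float) -> bool:
    # Retailer rule: "Good" = customer for over one year with at least
    # $300 in purchases (cutoffs from the example above).
    return tenure_days > 365 and total_purchases >= 300.0

def is_suspicious_transfer(amount: float, avg_monthly_balance: float) -> bool:
    # Bank rule: a transfer over $7,500 is "Suspicious" when the average
    # monthly balance is under $200 (cutoffs from the example above).
    return amount > 7500.0 and avg_monthly_balance < 200.0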

Decision Automation and Business Rules

When business operations get insights from historical data through analytics models, the state of these business rules changes, because they now need to be reviewed in light of the additional knowledge. At the same time, new rules will be needed to put those insights to work. These new rules can be subjective or expert-driven in unique situations where enough historical data is not available, such as using Facebook activity to assess the suitability of a candidate for a job or Twitter activity as a basis for identifying premium customers. There are other areas of machine and sensor data that have been brought into the analytical space fairly recently, and therefore not enough quantitative evidence is available. Examples include additional sensors in cars that record and report on driving habits, and smart meters for electricity and gas replacing the decades-old analog meters.

The business rules that make up the decision strategies have to follow a structured and documented process, regardless of whether they are expert or quantitative. Quantitative rules should be preferred, but the rule creation or design process should carefully look at the analytics problem statement, the model's output, and the context of the decision being made to identify the DVs and the appropriate cutoffs.

Joint Business and Analytics Sessions for Decision Strategies

The design of decision strategies requires input from business for two main reasons:

1. They understand their business processes and rules very well.

2. They need to alter their business activities based on the analytics, so their buy-in is needed.

It is very unlikely that the team that built the analytics model would actually understand how to put it to work for business improvement. The fact that you can reliably predict which customers are likely to defect, which loans are going to default, or which products always sell together doesn't mean you also know what to do with this information. Therefore, joint sessions are needed where the results of model validation are shared with the business. Explain to the business units what patterns in historical data imply about future trends, and then pose the question of how this insight should be utilized.

Drawing a treelike structure with some arbitrary choices of DVs and their cutoffs will get analytically savvy business people excited and they will soon be building complex decision rules and strategies. They should be able to build these strategy trees in any flowchart-type drawing tool like Visio or PowerPoint.

The following examples will explain decision strategy design in more detail.

Examples of Decision Strategy

We will use two well-known and well-documented examples of decision strategy to help describe the rules that make up decision strategies, the DVs and the values used for decisions, and how to validate, manage, and tune a decision strategy used to make automated business decisions on analytics models.

Retail Bank

A midsize retail bank has a car loan lending product for consumers. Consumers apply for a car loan with the bank, and the bank, after due diligence, decides to either accept or reject the loan application. If it accepts the application, the bank funds the car loan and then manages the loan servicing over the term period to receive payments and earn interest income. Analytics is heavily used in this business to determine the risk of default on the loan, so the bank needs to know the probability that a certain customer requesting a loan will not be able to pay. Here is what a bank needs to do:

1. The bank takes the loan data from its historical data set and identifies the loans that were properly repaid and the ones that defaulted.

2. The bank also gets predictive analytics software to build a predictive loan default model.

3. The algorithm takes the data set as input (using 90% of the data to build the model), identifies the patterns in the data that differentiate a good loan from a loan gone bad, and trains the model.

4. The model is tested using the 10% of the data that was set aside for this purpose.

5. The model is tuned as necessary (addition, removal, or redesign of the performance variables used in the model) until it is ready for production (a minimal code sketch of these steps follows this list).
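Here is a minimal sketch of steps 1 through 5, assuming Python and scikit-learn. The file name, column names, and choice of algorithm are illustrative assumptions; any classifier that outputs a probability of default would fit the same workflow.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Step 1: labeled history of repaid (0) and defaulted (1) loans.
loans = pd.read_csv("historical_loans.csv")
X = loans.drop(columns=["defaulted"])   # performance variables
y = loans["defaulted"]

# Steps 3 and 4: train on 90% of the data, hold out 10% for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.10, random_state=42)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Step 4: evaluate on the held-out 10%; step 5 iterates on the
# performance variables until this result is production-ready.
print("AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))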

These steps are common for every analytics modeling problem and every type of analytics technique, its algorithm, and its model. Now that the model is ready, decision strategy comes into play. Let’s say the model is implemented in standalone software and loan applications come in through a web-based system. As a loan application comes in, it is passed to the software housing the analytics model and the model returns a probability of default back to the application processing system. This is done instantaneously so the customer may actually be sitting in front of a web page awaiting a response. What should be the response that can be sent back to the customer in real time? Do we show the probability coming back from the analytics software? That would not mean anything to the customer, so somewhere this probability has to be translated into an appropriate “Congratulations! Your loan has been accepted” or “Sorry! Your loan application has been denied” message. This is the job of the decision strategy. The risk and credit officers at the bank have built a strategy around the probability that is received from the predictive model.

The strategy includes several decision variables and their thresholds for decision making. In this example, typical decision variables include:

1. Probability of default, between 0 and 1

2. Loan amount requested

3. Down payment

There is an additional set of policy variables where certain values mean outright rejection, such as convicted criminals, falsified records, identity theft or fraud indicators, age below the policy requirement, etc. The decision strategy for the loan approval will be as follows:

IF any of the Policy Variables are TRUE
THEN Reject
ELSE
  IF Probability of Default > 0.62 (62%)
  THEN Reject
  ELSE
    IF Probability of Default is between 0.28 and 0.62
    THEN
      IF Loan Amount Requested <= 10,000
      THEN Approve with 25% down payment
      ELSE Request a Co-Signer (i.e., loan amount requested is > 10,000)
    ELSE Approve the loan (i.e., probability of default is < 28%)
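The same strategy translates directly into code. Here is a minimal sketch in Python; the cutoffs (0.28, 0.62, 10,000, and the 25% down payment) come from the strategy above, while the function signature is an assumption for illustration.

def loan_decision(policy_flags: list, p_default: float, loan_amount: float) -> str:
    if any(policy_flags):       # any policy variable TRUE: outright rejection
        return "Reject"
    if p_default > 0.62:        # high risk
        return "Reject"
    if p_default >= 0.28:       # gray area: between 0.28 and 0.62
        if loan_amount <= 10_000:
            return "Approve with 25% down payment"
        return "Request a co-signer"
    return "Approve"            # low risk: probability of default below 0.28

# Example: mid-risk applicant requesting a small loan.
print(loan_decision([False, False], 0.35, 8_000))   # Approve with 25% down payment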

It should be obvious from this example that analytics models and their predictions alone are not enough to get value out of analytics. The actual business decisions require additional consideration of various unique situations and scenarios. In a user-centric decision environment, this would be left to an employee to weigh the gray areas and use good judgment. However, that introduces subjectivity, and differences in approach lead to inconsistent decisions and results when analyzed over time.

It is important to note here that the performance of the analytics model is measured on the results obtained within the decision strategy. If the decision strategy produces inconsistent and subjective decisions, their outcomes can incorrectly evaluate the performance of the model. Some employees may make good judgment calls that mask the model's weaker predictions, and vice versa. Therefore, it is important to use automated decision strategies and convert the subjective rules into absolute values, so that variation in decision strategies is eliminated when evaluating models.

The performance of a predictive model is measured using the results of decision strategies; it is therefore important to isolate their influence on the model's output.

Decision Variables and Cutoffs

This example uses the following DVs:

1. Probability of default (with values 0.28 and 0.62)

2. Loan amount requested (with a value of 10,000)

The probability of default DV comes from the risk predictive model, while the loan amount requested DV is part of the business transaction where analytics is being applied to a decision. The values for the decision variables have been analytically derived. High default risk means a "Bad" customer, so the definition of Bad is essentially a probability of default greater than 62%. This definition was arrived at by looking at historical data and reviewing past defaults. The same goes for the other values within this decision strategy. Figure 5.1 depicts a visual representation of the decision strategy.

image

Figure 5.1 Representation of the decision strategy.

Insurance Claims

Unlike the loan strategy, which had only two possible decisions (approve or reject the loan), the insurance claim example we use here has multiple possible decisions. The insurance industry is perhaps the oldest user of analytics, with actuaries trained and specializing in the discipline of assessing the future outcomes of various events. Different types of insurance, such as property/casualty, life, and healthcare, all face the similar problem of calculating the probability and size of a claim on an insurance policy. This calculation dictates how much premium to charge a customer. A predictive model can be built to predict the probability of a claim on a policy, but the size of the claim will be assessed using the decision strategy, which leads to a decision on how much premium to charge. Figure 5.2 shows a decision strategy that uses a predictive model to provide the probability of a claim on a policy, and then a set of rules to determine the premium.

image

Figure 5.2 Decision strategy using a predictive model.
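As a hedged sketch of what such a premium strategy might look like in code: the probability bands and premium multipliers below are hypothetical placeholders, not the actual values from Figure 5.2.

def premium_quote(base_premium: float, p_claim: float) -> float:
    # Hypothetical bands; the real cutoffs would be analytically derived.
    if p_claim > 0.50:
        return base_premium * 2.0    # high claim risk: double the premium
    if p_claim > 0.20:
        return base_premium * 1.4    # elevated risk: apply a surcharge
    return base_premium              # standard risk: standard premium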

Decision Strategy in Descriptive Models

In descriptive models, decision strategies are still needed to address the gray area these models introduce. In the case of outlier detection using clustering, the idea is to supply a large data set to a clustering algorithm, which plots the data points and looks for points close enough together to form a cluster. Once the clusters are formed and their definitions have been identified, the descriptive model is ready. When a new observation comes in, it is determined which cluster it belongs to or whether it is close to a particular cluster. A decision strategy or the clustering software can compute this by looking at the ranges of variables within each cluster and the values of the new observation. If the new observation falls within a cluster, its expected future behavior will be the same as that of the rest of the cluster's population, and a decision can be made about it accordingly. On the other hand, if it lies outside a particular cluster, then it must be determined how to apply the insight of the cluster it is closest to. The Euclidean distances defining "in the cluster," "outside the cluster," and "how far outside the cluster" are all gray areas that need a decision strategy to determine how to treat them.
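A minimal sketch of this in Python with scikit-learn follows. The synthetic data stands in for a real training set, and the "in"/"near"/"far" distance thresholds are assumptions that the decision strategy would own and tune.

import numpy as np
from sklearn.cluster import KMeans

training_data = np.random.rand(1000, 4)   # stand-in for the real data set
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0).fit(training_data)

def distance_bucket(observation: np.ndarray) -> str:
    # Euclidean distance from the new observation to every cluster center.
    dists = np.linalg.norm(kmeans.cluster_centers_ - observation, axis=1)
    d = dists.min()
    if d <= 0.3:                  # "in the cluster" (threshold is an assumption)
        return "apply the cluster's insight as-is"
    if d <= 0.6:                  # "just outside the cluster"
        return "apply the nearest cluster's insight with caution"
    return "treat as an outlier"  # "far outside" any cluster

print(distance_bucket(np.random.rand(4)))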

Similarly, in a social network analysis, if the relationship between two customers is very strong, a certain type of decision is warranted; however, if the relationship is only somewhat strong, then a different strategy is required. This again introduces a degree of gray area that requires business rules–based decision strategies. The decision strategies may be a little more volatile initially, and they may go through several iterations and adjustments before being finalized. Apart from that gray area in a model's output, everything that has been covered for predictive decision strategies applies to descriptive analytics–based decision strategies, as the actions on the analytics models are still a set of DVs, their thresholds, and business decisions.

Decision Automation and Intelligent Systems

Now that we have seen how operations run and generate data, and how that data is stored and analyzed, analytics models come into play. The use of analytics models is through decision strategies. Let's now look at how these strategies move out of the analytics team and into the real world so actual business operations benefit from analytics: the culmination of value from data through the democratization of analytics.

Learning versus Applying

The purpose of data warehouse and analytics systems is to analyze data, learn from that analysis, and use that knowledge and insight to optimize business operations. Optimizing business operations really means changing the business processes so they yield greater value for the organization. The business processes are automated through operational systems, so these changes require operational system changes. However, since we established in Chapter 1 that analytics has to do with a future event, the business process changes are limited to well-defined events and the responses to those events. To avoid confusion: modifying or improving a business process can also mean improving the information flow in a process, integrating various disconnected processes, or eliminating redundant or duplicate steps from the flow. That broader field of business process management has nothing to do with analytics models and decision strategies. The business process optimization, redesign, or reengineering within the purview of analytics is limited to the automated decisions driven from business rules that take input from analytics models. These business rules are embedded in the operational system, and therefore the application of analytics input has to be implemented within the operational system or as an add-on component or extension of it.

Decision automation has two dimensions: the learning, and the application of that learning to business operations. The learning is where the data warehouse, data analysis, and analytics models come into play, while applying that learning to actual business activity is where decision strategies come into play, and they have to be embedded in or tightly integrated into the operational system. There is a school of thought known as active data warehousing that suggests this decision making should be done in the data warehouse, since all the relevant data is available there to make a determination comprehensively, from all perspectives. This requires the data warehouse to be connected in real time to the operational system for receiving the event and responding with a decision. That just increases the complexity of integration: the rules or knowledge used to make the decision must have been derived beforehand, so why integrate with the data warehouse in real time? The active data warehousing approach works well in campaign management systems, where the line between operational data and analytical data is blurred. It is not a recommended approach when additional layers of analytics are involved. If the decisions are based entirely on triggers or thresholds that the data warehouse is tracking in real time, then active data warehousing may work. Strategy integration is a superior and simpler approach that decouples the data warehouse from live events and decisions. The results still go into the data warehouse within the analytics datamart, but there is tighter monitoring control and a simpler interface for strategy modification and testing.

Figure 5.3 represents this learning-versus-applying split. The left side of the diagram (going top to bottom) is the learning dimension, and the right side is the applying dimension. From a system architecture perspective, the nature of the two is quite different, and therefore a modular approach allows for greater flexibility in the choice of toolset, the choice of monitoring and control, and the operational SLAs to support the business.

image

Figure 5.3 Decision strategy technical architecture.

Strategy Integration Methods

Earlier in this chapter we used a simple decision strategy example for a consumer car loan. That example was presented in its algorithmic form (a combination of nested IF-THEN-ELSE statements), as well as in visual form (see Figure 5.1). Looking at that simple business rule, it should be obvious that adding such extended logic into the operational system workflow is not a technical challenge at all. The analytics tool will always stay outside as a black box or a standalone component. However, if several scenarios need their own strategies, or there is a complex situation where multiple models are involved, this type of extended coding within the operational system may become too complicated to maintain.

On the other hand, a pure-play strategy management tool may be too expensive. Therefore, one method is to embed the strategy rules in the operational system software; another is to buy a strategy tool. The downside of embedding the code is that monitoring and auditing, as well as modifications, testing, and what-if scenarios, will be extremely difficult to carry out. The strategy tools do solve this problem, as they have visual interfaces for strategy design and internal version control, but the integration with the analytics tool, data warehouse, and operational system has its own costs and implementation overheads. The next section presents an innovative alternative to these two methods.

ETL to the Rescue

ETL (extract, transform, and load) is a small industry comprising specialized software, comprehensive data management methodology, and human expertise existing within the data warehouse industry. It came into being with the data warehouse, since data needed to be pulled out of the operational systems and loaded into the data warehouse systems. We refer to ETL as a noun encompassing all aspects of "data in motion," including single records or large data sets, messaging or files, real-time or batch data. ETL then becomes the glue that holds the entire analytics solution together.

All ETL tools now have GUI-based development environments and provide all the capabilities of a modern software development tool. If we look closely at the two treelike strategies in Figures 5.1 and 5.2, they look very similar to how a data processing or dataflow program looks in an ETL tool. Therefore, an ETL tool can be used to design and implement strategies. ETL has its own processing server and integration interfaces to all source systems, and there is plenty of ETL expertise available within every data warehouse team. The recommended integration, then, is through an ETL layer that receives a real-time event from the operational system, prepares the input record around the event, and invokes the analytics model. The model returns an output that the ETL layer takes into the strategy; it runs the data through the strategy, reaches a decision, and deposits that decision back into the operational system. This is a very simple integration for ETL developers who have been building the information value chain through the Information Continuum. Remember, if the prerequisite layers of the hierarchy are not in place, jumping into the analytics levels of the Information Continuum will only yield short-term and sporadic success. For sustained benefit from analytics, the Information Continuum has to be followed through each level, and that will automatically build the ETL maturity needed to implement decision strategies and integrate with the operational systems.
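A minimal sketch of this integration flow in Python, with every function name a placeholder for whatever the ETL tool, model service, and operational system actually expose:

def handle_event(event: dict) -> None:
    # 1. Prepare the model's input record around the real-time event.
    record = prepare_input_record(event)            # placeholder: enrichment step
    # 2. Invoke the standalone analytics model for its output.
    p_default = invoke_model(record)                # placeholder: model service call
    # 3. Run the model output through the decision strategy from earlier.
    decision = loan_decision(record["policy_flags"],
                             p_default, record["loan_amount"])
    # 4. Deposit the decision back into the operational system, and log
    #    everything to the analytics datamart for later evaluation.
    post_decision(event["id"], decision)            # placeholder: write-back
    log_to_datamart(event, record, p_default, decision)   # placeholder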

Strategy Evaluation

The evaluation of decision strategies serves two purposes: one is to validate automated decision making when it is being put in place for the first time, and the second is to test an alternate, competing approach. There are two primary methods for decision strategy evaluation: retrospective processing and reprocessing.

Retrospective Processing

The retrospective processing method is identical to the approach used in model validation covered in Chapter 4, and therefore uses the same name. Historical records, such as customers or transactions where the strategy would have been applied, already have a known outcome. If the business value of the prestrategy decisions on those transactions is known, and provided the same model was used, then we can make a fair comparison by running those records through the strategy, reaching a decision based on the model's output, and comparing the business value of the two sets of decisions.
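A minimal sketch of retrospective processing in Python, reusing the loan_decision strategy from earlier; the file name, column names, and business_value function are illustrative assumptions:

import pandas as pd

history = pd.read_csv("historical_decisions.csv")   # includes known outcomes

# Replay every historical record through the new strategy.
history["new_decision"] = [
    loan_decision([flag], p, amount)
    for flag, p, amount in zip(history["policy_flag"],
                               history["p_default"],
                               history["loan_amount"])]

# Compare the business value of the old decisions against the new ones
# on the same known outcomes (business_value is a placeholder function).
old_value = business_value(history["actual_decision"], history["outcome"])
new_value = business_value(history["new_decision"], history["outcome"])
print(f"current value: {old_value:,.0f}  new strategy value: {new_value:,.0f}")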

Reprocessing

A more prudent approach is to run the same event or transaction through the decision process currently in force: it could be a champion strategy in place, or some subjective decision rules driving the decisions on those events. At the same time, randomly select 10% of the transactions, run them through the new decision strategy, and save the results. After some time, when the true benefit of the decisions is evident and measurable, compare the results of the two approaches. More robust decision strategy systems, such as those available from SAS or SPSS for general-purpose decision strategy management, or specialized ones from Fair Isaac (FICO) or Experian designed for the consumer lending business, can create a simulation environment to test what-if scenarios by manipulating strategies. For example, if the total cost of the decision strategy used to keep customers from defecting through discount offers is $3 million and the value of their retained business is $4 million, then a simulated environment can show what those numbers would be if the criteria or the value of the discounts were adjusted. If the cost of the simulated decision strategy comes out to $2.7 million while the retained value is $3.9 million, then it is possible the current discount strategy should be retired: the simulation changed the criteria for offering discounts so that not all the customers got the discount as before, and yet the benefit was almost the same.
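A minimal sketch of the reprocessing split in Python: the 10% figure comes from the text, while the strategy functions and logging call are placeholders.

import random

def route_transaction(event: dict) -> str:
    # Send a random 10% of live transactions to the new strategy.
    if random.random() < 0.10:
        arm, decision = "challenger", challenger_strategy(event)   # placeholder
    else:
        arm, decision = "champion", champion_strategy(event)       # placeholder
    # Tag each record so the two arms can be compared once the true
    # benefit of the decisions becomes measurable.
    log_to_datamart(event, decision, arm)                          # placeholder
    return decision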

Champion–Challenger Strategies

The business should be monitoring the execution and effectiveness of strategies on a regular basis. The analytics datamart stores the events or transactions that came in and were acted upon as directed by the decisions coming from the strategy, recorded at an intimate level of detail, including the input, the output, and the path each took down the strategy decision tree. Reports running against this data should allow users at all levels in and around the business process to look for new ways to use the analytical insight and constantly improve the processes. It may not be a bad idea to mandate that all concerned come up with challenger strategies (perhaps incentivized as an internal competition) and that at least one challenger strategy be selected to compare against the champion strategy currently in force. The strategy evaluation methods can be used to evaluate the performance of the challenger strategy, and if it yields superior results, it should be adopted. The employee responsible for that strategy should be recognized and compensated accordingly.

Business Process Innovation

The constant and sustainable cycle of innovative strategies will improve business processes all across the organization with simultaneous benefits of operational excellence, product innovation, and customer intimacy. The defining impact of democratization of analytics on a business is that innovation is no longer top-down driven from a handful of brilliant executives. Rather, all employees across all aspects of the business are able to improve the business operation they are responsible for. Procurement, supply chain, HR, sales, marketing, and customer service all get their own models, their own strategies, their own process of champion–challenger models and strategies, and an innovation and reward culture. That is how we enter, survive, and excel in this era of Big Data.

Earlier in this chapter we briefly reviewed the widely touted analytics approach of using a data scientist and explained some of its challenges. The last few sections of this chapter have provided a detailed explanation of an alternate approach: the democratization of analytics. It is up to individual organizations to understand the intellectual capital within their technical and business teams and determine whether a data scientist approach is feasible for them. For a handful of specialized areas it will always be the superior approach, but the strength and depth of the democratization of analytics cannot be ignored. An organization that can adopt a culture of data and analytics across all of its business processes can certainly stay a step ahead of the competition and may operate more efficiently, produce improved products, and serve its customers better by relying on its data and the collective genius of its employees.

Exercises

Following are a couple of problems that anyone should be able to attempt even without formal domain knowledge.

5.1 Student drop-out prevention. Build a decision strategy that uses a predictive model that computes the likelihood of a student dropping out. The strategy should be designed to prevent the student from dropping out.

5.2 Churn prevention. Build a decision strategy that uses a predictive model that computes the likelihood of a customer defecting (leaving a monthly subscription service like cable TV, cell phone, etc.). The strategy should be designed to keep the customer from defecting.
