CHAPTER 9

Business Analytic Modeling Tools

This book began with a discussion of general business decision making, knowledge management, and views of business analytics. Business analytics is not only about quantitative modeling. It also includes consideration of data, management information systems, and acquisition of all forms of knowledge needed by businesses, including tacit (undocumented; unspoken; implicit but not well-defined) knowledge. This book has focused on modeling tools as a means of implementing business analytics in various important business decision contexts. We will conclude with an overview of the era of big data, data mining, and a review of the business analytic modeling tools we have covered in prior chapters.

The Era of Big Data

Our society has linked all of us in many ways, generating massive quantities of data from social media, the Internet of things, and many other sources. Data mining refers to the analysis of large quantities of data that are stored in computers. Bar coding has made checkout very convenient for us and provides retail establishments with masses of data. Grocery stores and other retail stores are able to quickly process our purchases and use computers to accurately determine the product prices. These same computers can help the stores with their inventory management, by instantaneously determining the quantity of items of each product on hand. Computers allow the store’s accounting system to more accurately measure costs and determine the profit that store stockholders are concerned about. All of this information is available based on the bar coding information attached to each product. Along with many other sources of information, information gathered through bar coding can be used for data mining analysis.

The era of big data is here, with many sources pointing out that more data have been created over the past year or two than were generated throughout all prior human history. Big data involves datasets so large that traditional data analytic methods no longer work due to data volume. Big data is often described as:

Data too big to fit on a single server

Data too unstructured to fit in a row-and-column database

Data flowing too continuously to fit into a static data warehouse

Lack of structure is the most important aspect (even more than the size)

The point is to analyze, converting data into insights, innovation, and business value

The era of big data is expected to emphasize knowing what (based on correlation) rather than the traditional obsession with causality. The emphasis will be on discovering patterns offering novel and useful insights. Data will become a raw material for business, a vital economic input and source of value. Some new characteristics of this data include:

1. There is so much data available that sampling is usually not needed (n=all).

2. Precise accuracy of data is, thus, less important as inevitable errors are compensated for by the mass of data (any one observation is flooded by others).

3. Correlation is more important than causality—most data mining applications involving big data are interested in what is going to happen, and you don’t need to know why.

Automatic trading programs need to detect the trend changes, not figure out that the Greek economy collapsed or the Chinese government will devalue the Renminbi (RMB). The programs in vehicles need to detect that an axle bearing is getting hot and the vehicle is vibrating and the wheel should be replaced, not whether this is due to a bearing failure or a housing rusting out.
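The second characteristic above, that individual errors are swamped by the mass of data, can be illustrated with a small simulation. The following is a minimal sketch and not part of the original text: the true value, the noise level, and the function name mean_abs_error_of_average are all assumptions chosen for illustration. It averages noisy observations of a known quantity and compares how far the average strays from the truth for a small and a large number of observations.

```python
import random
import statistics


def mean_abs_error_of_average(n_obs, true_value=100.0, noise_sd=10.0,
                              trials=100, seed=42):
    """Average absolute error of the sample mean over repeated trials.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        # Each observation is the true value plus random, unbiased noise.
        sample = [true_value + rng.gauss(0.0, noise_sd) for _ in range(n_obs)]
        errors.append(abs(statistics.fmean(sample) - true_value))
    return statistics.fmean(errors)


small = mean_abs_error_of_average(10)      # few observations
large = mean_abs_error_of_average(2_500)   # many observations
print(f"typical error with 10 observations:    {small:.3f}")
print(f"typical error with 2,500 observations: {large:.3f}")
```

The sketch assumes the errors are random and unbiased. A systematic error, such as a scanner that always reads high, is not averaged away no matter how much data is collected.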

There are many sources of big data. Internal to the corporation, e-mails, blogs, enterprise systems, and automation lead to structured, unstructured, and semistructured information within the organization. External data is also widely available, much of it free over the Internet, but much also available from commercial vendors. Data is also obtainable from social media.

Data Mining

Data mining is not limited to business. Both major parties in U.S. elections utilize data mining of potential voters. Data mining has been heavily used in the medical field, in applications ranging from diagnosis based on patient records to identification of best practices. Business use of data mining is also impressive. Data warehouses are very large-scale database systems capable of systematically storing all transactional data generated by a business organization, such as Walmart. Toyota used data mining of its data warehouse to determine more efficient transportation routes, reducing the time to deliver cars to its customers by an average of 19 days. Toyota was also able to identify sales trends faster and to identify the best locations for new dealerships.

Data mining is widely used by banking firms in soliciting credit card customers, by insurance and telecommunication companies in detecting fraud, by manufacturing firms in quality control, and in many other applications. Data mining is being applied to improve food product safety, criminal detection, and tourism. Micromarketing targets small groups of highly responsive customers. Consumer and lifestyle data are widely available, enabling customized individual marketing campaigns. This is enabled by customer profiling, identifying those subsets of customers most likely to be profitable to the business, as well as targeting, determining the characteristics of the most profitable customers.

Data mining involves statistical and artificial intelligence (AI) analysis, usually applied to large-scale datasets. There are two general types of data mining studies. Hypothesis testing involves expressing a theory about the relationship between actions and outcomes. This approach is referred to as supervised. In a simple form, it can be hypothesized that advertising will yield greater profit. This relationship has long been studied by retailing firms in the context of their specific operations. Data mining is applied to identifying relationships based on large quantities of data, which could include testing the response rates to various types of advertising on the sales and profitability of specific product lines.

However, there is more to data mining than the technical tools used. The second form of data mining study is knowledge discovery. Data mining involves a spirit of knowledge discovery (learning new and useful things). Knowledge discovery is referred to as unsupervised. In this form of analysis, a preconceived notion may not be present, but rather relationships can be identified by looking at the data. This may be supported by visualization tools, which display data, or through fundamental statistical analysis, such as correlation analysis. Much of this can be accomplished through automatic means, as we will see in decision tree analysis, for example. But data mining is not limited to automated analysis. Knowledge discovery by humans can be enhanced by graphical tools and identification of unexpected patterns through a combination of human and computer interaction.
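The use of correlation analysis for knowledge discovery can be sketched in a few lines of Python. This is an illustrative example only: the three data columns are invented, and the helper names pearson and rank_pairs are assumptions rather than anything used elsewhere in this book. No hypothesis is stated in advance. The code simply computes the Pearson correlation for every pair of variables and ranks the pairs by strength, leaving the analyst to judge which relationships deserve a closer look.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def rank_pairs(columns):
    """Return (name_a, name_b, r) for every pair, strongest |r| first."""
    pairs = [(a, b, pearson(columns[a], columns[b]))
             for a, b in combinations(columns, 2)]
    return sorted(pairs, key=lambda pair: abs(pair[2]), reverse=True)


# Hypothetical weekly observations, invented purely for illustration.
data = {
    "advertising": [10, 12, 15, 18, 20, 25],
    "sales": [100, 118, 149, 182, 195, 251],
    "temperature": [30, 22, 27, 19, 25, 21],
}

ranked = rank_pairs(data)
for name_a, name_b, r in ranked:
    print(f"{name_a} vs {name_b}: r = {r:+.2f}")
```

A strong correlation found this way is a pattern worth investigating, not proof of cause and effect.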

Business Analytics Modeling

Management science is the application of scientific approaches to managerial decision problems. Its models apply mathematics to enable decision makers or analysts to experiment with decision components, ideally seeking optimal decisions. Management science models play a role in the broader field of knowledge management.

Knowledge management seeks to put what humans observe into context, using the information obtained to try to identify patterns or cause-and-effect relationships between actions business decision makers can take and their intended results. By trying to be systematic and objective, humans seek to gain understanding, and identify better results from their decision making.

The rational management decision process seeks to identify the system of relationships in business operations, trying to be as scientific as possible in order to objectively improve operations. A rational decision process might include:

Define the decision problem, or need to take action (identify the mission)

Search for data and/or solutions (management information systems are one source)

Generate alternative means to accomplish the mission

Analyze (it is here that the models presented in this book might be applied)

Select action

Implement the action selected

Note that modeling is by no means the entire story. The models we have presented are only tools that can support efforts to make better decisions.

Model Classification

We can conclude with a classification of the tools that we have discussed. Table 9.1 lists a broad classification, with general function for each modeling type.

After an initial discussion of knowledge management in Chapter 1, this book discussed some visualization tools in Chapter 2. These tools are important in gaining personal understanding, as well as in communicating to others in the organization, seeking a shared basis for coordinated efforts. Chapters 3 and 4 covered some fundamental statistical tools that have been used to monitor operations quality, as well as to support formal testing of hypotheses in general. By no means were all modeling tools covered in the book. There is an entire field of useful probabilistic analytic models such as queuing theory, Markov chains, and other tools that utilize distributions of data. Regression was covered in Chapters 5 through 7. Chapter 5 covered simple regression, focusing on time series forecasting. Chapter 6 covered multiple regression, useful in analyzing cause-and-effect relationships. Logistic regression in Chapter 7 is a fundamental tool used in data mining, where many problems involve dealing with classification of dependent variables into discrete groups. Related to probabilistic analysis is simulation, where instead of analytic models, numerical analysis based on random number streams is widely used for problems involving probability distributions. This book does not have the space to cover either probabilistic analytic models or simulation, although both are interesting and useful tools. But the ideal goal of optimization was covered in Chapter 8, at least with the simpler version of linear programming models. There are more complex forms of optimization models not covered. Another type of model not covered is game theory, which tries to analyze competitive environments where the actions of others affect the optimality of solutions.
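Although simulation is not covered in this book, the idea of numerical analysis based on random number streams can be shown briefly. The following is a minimal sketch under stated assumptions: the price, unit cost, fixed cost, and normally distributed demand are hypothetical figures invented for the example, and simulate_profit is an illustrative name. Rather than solving for the profit distribution analytically, the code draws many random demand values and examines the resulting profits.

```python
import random
import statistics


def simulate_profit(trials=10_000, seed=1):
    """Monte Carlo sketch of profit under uncertain demand.

    Every figure below is a hypothetical assumption for illustration.
    """
    rng = random.Random(seed)
    price, unit_cost, fixed_cost = 12.0, 7.0, 4000.0
    profits = []
    for _ in range(trials):
        # Demand is assumed normal (mean 1000, sd 200), floored at zero.
        demand = max(0.0, rng.gauss(1000.0, 200.0))
        profits.append((price - unit_cost) * demand - fixed_cost)
    return profits


profits = simulate_profit()
mean_profit = statistics.fmean(profits)
prob_loss = sum(p < 0 for p in profits) / len(profits)
print(f"average simulated profit: {mean_profit:.0f}")
print(f"estimated probability of a loss: {prob_loss:.3f}")
```

The estimated probability of a loss is the kind of risk measure that simulation provides and that a single average-demand calculation would hide.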

Table 9.1 Business analytic modeling tool classification

| Model type             | Function                    | Book coverage                                          |
|------------------------|-----------------------------|--------------------------------------------------------|
| Visualization          | Describe                    | Chapter 2                                              |
| Statistics             | Measure                     | Chapters 3 and 4                                       |
| Probabilistic analysis | Problems with distributions | Other books (analytic models such as queuing analysis) |
| Forecasting            | Prediction                  | Chapter 5                                              |
| Multiple regression    | Cause-and-effect            | Chapter 6                                              |
| Logistic regression    | Classification (data mining)| Chapter 7                                              |
| Simulation             | Risk                        | Other books (risk management)                          |
| Optimization           | Identify best solution      | Chapter 8                                              |
| Game theory            | Competitive environments    | Other books                                            |

Thus we admit that we have not covered all models, nor have we covered all that is important in knowledge management. But we have covered some useful analytic tools that can be applied to aid the overall decision process in business.
