CHAPTER 18
Multivariate Data Analysis


This chapter continues our review of methods for data analysis, ways and tools that we can use to get data to give up the insights that we are seeking and that will enable us to make better decisions. The tools we will discuss in this chapter are parsimonious in output but huge in insight. Just a few statistics from each tool give us the direction we need. Used properly, these tools are very efficient and powerful.

Multivariate Analysis Procedures

The term multivariate analysis refers to the simultaneous analysis of multiple measurements on each individual or object being studied.1 Some experts consider any simultaneous statistical analysis of more than two variables to be multivariate analysis. Multivariate analysis procedures are extensions of the univariate and bivariate statistical procedures discussed in Chapters 16 and 17.

A number of techniques fall under the heading of multivariate analysis procedures. In this chapter, we will consider five of these techniques:

  • Multiple regression analysis
  • Multiple discriminant analysis
  • Cluster analysis
  • Factor analysis
  • Conjoint analysis

You may have been exposed to multiple regression analysis in introductory statistics courses. The remaining procedures are less widely studied. Summary descriptions of the techniques are provided in Exhibit 18.1.

EXHIBIT 18.1 Brief Descriptions of Multivariate Analysis Procedures

Multiple regression analysis: Enables the researcher to predict the level or magnitude of a dependent variable based on the levels of more than one independent variable.
Multiple discriminant analysis: Enables the researcher to predict group membership on the basis of two or more independent variables.
Cluster analysis: Identifies subgroups of individuals or items that are homogeneous within subgroups and different from other subgroups.
Factor analysis: Permits the analyst to reduce a set of variables to a smaller set of factors or composite variables by identifying underlying dimensions in the data.
Conjoint analysis: Provides a basis for estimating the utility that consumers associate with different product features or attributes.

Although awareness of multivariate techniques is far from universal, they have been around for decades and have been widely used for a variety of commercial purposes. Fair Isaac & Co. has built a $740 million business around the commercial use of multivariate techniques.4 The firm and its clients have found that they can predict with surprising accuracy who will pay their bills on time, who will pay late, and who will not pay at all. The federal government uses secret formulas, based on the firm’s analyses, to identify tax evaders. Fair Isaac has also shown that results from its multivariate analyses help in identifying the best sales prospects.

Multivariate Software

The computational requirements for the various multivariate procedures discussed in this chapter are substantial. As a practical matter, running the various types of analyses presented requires a computer and appropriate software. Until the late 1980s, most types of multivariate analyses discussed in this chapter were done on mainframes or minicomputers because personal computers were limited in power, memory, storage capacity, and range of software available. Those limitations are in the past. Personal computers have the power to handle just about any problem that a marketing researcher might encounter. Most problems can be solved in a matter of seconds, and a wide variety of outstanding software is available for multivariate analysis. SPSS is the package most widely used by professional marketing researchers.

SPSS includes a full range of software modules for integrated database creation and management, data transformation and manipulation, graphing, descriptive statistics, and multivariate procedures. It has an easy-to-use, graphical interface. Additional information on the SPSS product line can be found at http://www.spss.com/software/statistics and http://www.spss.com/software/modeler. A number of other useful resources are available at the SPSS site:

  • Technical support, product information, FAQs (frequently asked questions), various downloads, and product reviews
  • Examples of successful applications of multivariate analysis to solve real business problems
  • Discussions of data mining and data warehousing applications

As we move into discussing analytical techniques, don’t forget that we first have to capture the data that feed our models. This is often the bigger challenge, as discussed in the Practicing Marketing Research feature from the banking industry on page 446, and it is particularly true in the realm of Big Data. We employ many of the same analytical techniques discussed in this chapter, but only after the data have been captured and put into a form we can use. More on this issue later in the chapter.


Multiple Regression Analysis

Researchers use multiple regression analysis when their goal is to examine the relationship between two or more metric predictor (independent) variables and one metric dependent (criterion) variable.6 Under certain circumstances, described later in this section, nominal predictor variables can be used if they are recoded as binary variables.

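Multiple regression analysis is an extension of bivariate regression, discussed in Chapter 17. Instead of fitting a straight line to observations in a two-dimensional space, multiple regression analysis fits a plane to observations in a multidimensional space. The output obtained and the interpretation are essentially the same as for bivariate regression. The general equation for multiple regression is as follows:

Y = a + b1X1 + b2X2 + b3X3 + ... + bnXn

where Y = dependent or criterion variable
a = estimated constant
b1 to bn = coefficients associated with the predictor variables
X1 to Xn = predictor (independent) variables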

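For example, consider the following regression equation (in which values for a, b1, and b2 have been estimated by means of regression analysis):

Ŷ = a + 17X1 + 22X2

where Ŷ = estimated sales in units
a = estimated constant (its numerical value is not reproduced here)
X1 = advertising expenditures in dollars
X2 = number of salespersons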

This equation indicates that sales increase by 17 units for every $1 increase in advertising and 22 units for every one-unit increase in number of salespersons.
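The estimation step can be sketched in a few lines of Python. This is a minimal illustration, not the procedure any particular statistics package uses: the six observations are invented, and the constant of 200 is a hypothetical value chosen only so that the data follow the slopes of 17 and 22 discussed above. The sketch solves the least-squares normal equations directly.

```python
# Minimal sketch: ordinary least squares with two predictors, pure Python.
# The data and the constant of 200 are HYPOTHETICAL, chosen only so that the
# fit reproduces the slopes 17 and 22 discussed in the text.


def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Augment the matrix so row operations apply to both sides at once.
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution.
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution


def fit_multiple_regression(predictor_rows, y_values):
    """Return [a, b1, ..., bn] that minimize the sum of squared errors."""
    design = [[1.0] + list(row) for row in predictor_rows]  # leading 1 = constant
    size = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(size)]
           for i in range(size)]
    xty = [sum(r[i] * y for r, y in zip(design, y_values)) for i in range(size)]
    return solve_linear_system(xtx, xty)


# Hypothetical observations: (advertising dollars, number of salespersons).
predictors = [(100, 5), (150, 4), (200, 8), (120, 6), (180, 3), (90, 7)]
# Sales generated exactly from Y = 200 + 17*X1 + 22*X2 (no noise), for illustration.
sales = [200 + 17 * x1 + 22 * x2 for x1, x2 in predictors]

a, b1, b2 = fit_multiple_regression(predictors, sales)
print(round(a, 2), round(b1, 2), round(b2, 2))
```

Because the illustrative sales figures contain no random error, the fitted coefficients match the generating values; with real survey or sales data the estimates would carry sampling error.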

Applications of Multiple Regression Analysis

There are many possible applications of multiple regression analysis in marketing research:

  • Estimating the effects of various marketing mix variables on sales or market share.
  • Estimating the relationship between various demographic or psychographic factors and the frequency with which certain service businesses are visited.
  • Determining the relative influence of individual satisfaction elements on overall satisfaction.
  • Quantifying the relationship between various classification variables, such as age and income, and overall attitude toward a product or service.
  • Determining which variables are predictive of sales of a particular product or service.

Multiple regression analysis can serve one or a combination of two basic purposes: (1) predicting the level of the dependent variable, based on given levels of the independent variables, and (2) understanding the relationship between the independent variables and the dependent variable.

Multiple Regression Analysis Measures

In the discussion of bivariate regression in Chapter 17, a statistic referred to as the coefficient of determination, or R², was identified as one of the outputs of regression analysis. This statistic can assume values from 0 to 1 and provides a measure of the percentage of the variation in the dependent variable that is explained by variation in the independent variables. For example, if R² in a given regression analysis is calculated to be .75, this means that 75 percent of the variation in the dependent variable is explained by variation in the independent variables. The analyst would always like to have a calculated R² close to 1. Frequently, variables are added to a regression model to see what effect they have on the R² value.

As models get larger, with more independent or predictor variables, it is wise to look at a variation of the R² statistic called adjusted R² as the measure of fit for a regression model. The standard R² value tends to increase with every predictor variable that is added to the model, regardless of whether that variable truly adds to the explanatory power of the model. The adjusted R² corrects the coefficient of determination based on the relationship between the number of predictor variables and the overall sample size, producing a more realistic estimate of model fit when several independent variables are included. The adjusted R² will always be less than or equal to R². It is similar to the standard measure when the number of observations per independent variable is large, and it can even be negative when the sample size is very small and many predictors are included in the model.
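The two fit measures can be sketched as follows. The chapter does not print a formula for the adjusted statistic, so the sketch assumes the commonly used correction 1 − (1 − R²)(n − 1)/(n − k − 1), where n is the number of observations and k the number of predictors; the function names are illustrative.

```python
def r_squared(actual, predicted):
    """Share of the variation in `actual` explained by `predicted`."""
    mean_actual = sum(actual) / len(actual)
    ss_total = sum((y - mean_actual) ** 2 for y in actual)
    ss_residual = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    return 1 - ss_residual / ss_total


def adjusted_r_squared(r2, n_observations, n_predictors):
    """Assumed standard correction for the number of predictors."""
    return 1 - (1 - r2) * (n_observations - 1) / (n_observations - n_predictors - 1)


# Plenty of observations per predictor: the adjustment is small.
large_sample = adjusted_r_squared(0.75, n_observations=30, n_predictors=2)
# Very few observations and many predictors: the result can drop below zero.
small_sample = adjusted_r_squared(0.10, n_observations=6, n_predictors=4)
print(round(large_sample, 3), round(small_sample, 3))
```

The two calls mirror the two situations described above: a generous sample per predictor, and a very small sample with many predictors.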

The b values, or regression coefficients, are estimates of the effect of individual independent variables on the dependent variable. It is appropriate to determine the likelihood that each individual b value is the result of chance. This calculation is part of the output provided by virtually all statistical software packages. Typically, these packages compute the probability of incorrectly rejecting the null hypothesis of bn = 0.

Dummy Variables

In some situations, the analyst needs to include nominally scaled independent variables such as gender, marital status, occupation, and race in a multiple regression analysis.

Multiple regression analysis can be used to estimate the relationship between various demographic or psychographic factors and the frequency with which a service business is hired.

Dummy variables can be created for this purpose. Dichotomous, nominally scaled independent variables can be transformed into dummy variables by coding one value (e.g., female) as 0 and the other (e.g., male) as 1. For nominally scaled independent variables that can assume more than two values, a slightly different approach is required. If there are K categories, K − 1 dummy variables are needed to uniquely identify every category. (Including K dummy variables would overidentify the model because the last category is already represented by 0s on the previous K − 1 variables.) Consider a question regarding racial group with three possible answers: African American, Hispanic, or Caucasian. Binary or dummy variable coding of responses requires the use of two dummy variables, X1 and X2, which might be coded as follows:

                                 X1   X2
If person is African American     1    0
If person is Hispanic             0    1
If person is Caucasian            0    0
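The K − 1 coding scheme in the table can be sketched in Python. The function name and the convention that the last listed category serves as the all-zeros baseline are assumptions made for illustration.

```python
def dummy_code(value, categories):
    """Code `value` as K - 1 dummy variables; the last category is the baseline."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    # One 0/1 indicator for every category except the last (baseline) one.
    return [1 if value == category else 0 for category in categories[:-1]]


groups = ["African American", "Hispanic", "Caucasian"]  # K = 3 categories
for group in groups:
    x1, x2 = dummy_code(group, groups)
    print(group, x1, x2)
```

Each respondent ends up with two columns, X1 and X2, and the baseline group is identified by zeros on both.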

Potential Use and Interpretation Problems

The analyst must be sensitive to certain problems that may be encountered in the use and interpretation of multiple regression analysis results. These problems are summarized in the following sections.

Collinearity

One of the key assumptions of multiple regression analysis is that the independent variables are not correlated (collinear) with each other.7 If they are correlated, the predicted Y value is still unbiased, but the estimated b values (regression coefficients) will have inflated standard errors and will be inaccurate and unstable. Larger than expected coefficients for some b values are compensated for by smaller than expected coefficients for others. This is why, with collinearity, you can get sign reversals and wide variations in b values yet still produce reliable estimates of Y.

The simplest way to check for collinearity is to examine the matrix showing the correlations between each pair of variables in the analysis. One rule of thumb is to look for correlations between independent variables of .30 or greater. If correlations of this magnitude exist, then the analyst should check for distortions of the b values. One way to do this is to run regressions with the two or more collinear variables included and then run regressions again with the individual variables. The b values in the regression with all variables in the equation should be similar to the b values computed for the variables run separately.
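The correlation screen can be sketched as follows. The predictor names and values are hypothetical, and applying the .30 rule of thumb to the absolute value of the correlation (so that strong negative correlations are flagged too) is an assumption of this sketch.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def flag_collinear_pairs(variables, threshold=0.30):
    """Return (name, name, r) for every predictor pair with |r| >= threshold."""
    flagged = []
    for (name_a, values_a), (name_b, values_b) in combinations(variables.items(), 2):
        r = pearson(values_a, values_b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, round(r, 2)))
    return flagged


# Hypothetical predictor data for six sales territories.
predictors = {
    "advertising": [10, 12, 15, 18, 20, 25],
    "salespersons": [3, 4, 4, 6, 6, 8],
    "store_age": [7, 2, 9, 4, 3, 8],
}
flagged_pairs = flag_collinear_pairs(predictors)
print(flagged_pairs)
```

Any pair that is flagged would then be a candidate for the follow-up check described above, in which regressions are run with the variables together and separately.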

A number of strategies can be used to deal with collinearity. Two of the most commonly used strategies are (1) to drop one of the variables from the analysis if two variables are heavily correlated with each other and (2) to combine the correlated variables in some fashion (e.g., create an index or use factor analysis to combine related variables) to form a new composite independent variable, which can be used in subsequent regression analyses.

Causation

Although regression analysis can show that variables are associated or correlated with each other, it cannot prove causation. Causal relationships can be confirmed only by other means (see Chapter 9). A strong logical or theoretical basis must be developed to support the idea that a causal relationship exists between the independent variables and the dependent variable. However, even a strong logical base and supporting statistical results demonstrating correlation are only indicators of causation.

Standardizing Regression Coefficients

The magnitudes of the regression coefficients associated with the various independent variables can be compared directly only if the scaling of coefficients is in the same units or if the data have been standardized. Consider the following example:

At first glance, it appears that an additional dollar spent on advertising and another salesperson added to the salesforce have equal effects on sales. However, this is not true because X1 and X2 are measured in different kinds of units. Direct comparison of regression coefficients requires that all independent variables be measured in the same units (e.g., dollars or thousands of dollars) or that the data be standardized. Standardization is achieved by taking each number in a series, subtracting the mean of the series from the number, and dividing the result by the standard deviation of the series. This process converts any set of numbers to a new set with a mean of 0 and a standard deviation of 1. The formula for the standardization process is as follows:

Standardized Xi = (Xi − X̄) / S

where Xi = individual number from the series
X̄ = mean of the series
S = standard deviation of the series
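The standardization step described above can be sketched in Python. The advertising figures are hypothetical, and the use of the sample standard deviation (`statistics.stdev`) rather than the population version is an assumption; either choice yields a mean of 0 and a standard deviation of 1 under the same definition.

```python
from statistics import mean, stdev


def standardize(series):
    """Subtract the series mean from each value and divide by the standard deviation."""
    center = mean(series)
    spread = stdev(series)  # sample standard deviation (an assumption of this sketch)
    return [(value - center) / spread for value in series]


# Hypothetical advertising expenditures, in thousands of dollars.
advertising = [2, 4, 4, 4, 5, 5, 7, 9]
standardized = standardize(advertising)
print([round(z, 2) for z in standardized])
```

Once every independent variable has been converted in this way, the regression coefficients are expressed in standard-deviation units and can be compared directly.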

Sample Size

The value of R² is influenced by the number of predictor variables relative to sample size.8 Several different rules of thumb have been proposed; they suggest that the number of observations should be equal to at least 10 to 15 times the number of predictor variables. For the preceding example (sales volume as a function of advertising expenditures and number of salespersons) with two predictor variables, a minimum of 20 to 30 observations would be required.

Multiple Discriminant Analysis

Although multiple discriminant analysis is similar to multiple regression analysis,9 there are important differences. In the case of multiple regression analysis, the dependent variable must be metric; in multiple discriminant analysis, the dependent variable is nominal or categorical in nature. For example, the dependent variable might be usage status for a particular product or service. A particular respondent who uses the product or service might be assigned a code of 1 for the dependent variable, and a respondent who does not use it might be assigned a code of 2. Independent variables might include various metric measures, such as age, income, and number of years of education. The goals of multiple discriminant analysis are as follows:

  • Determine if there are statistically significant differences between the average discriminant score profiles of two (or more) groups (in this case, users and nonusers).
  • Establish a model for classifying individuals or objects into groups on the basis of their values on the independent variables. The matrix that compares predicted group membership with actual group membership is called a classification matrix.
  • Determine how much of the difference in the average score profiles of the groups is accounted for by each independent variable.

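The general discriminant analysis equation follows:

Z = b1X1 + b2X2 + ... + bnXn

where Z = discriminant score
b1 to bn = discriminant weights (coefficients)
X1 to Xn = independent variables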

The discriminant score, usually referred to as the Z score, is the score derived for each individual or object by means of the equation. This score is the basis for predicting the group to which the particular object or individual belongs. Discriminant weights, often referred to as discriminant coefficients, are computed by means of the discriminant analysis program. The size of the discriminant weight (or coefficient) associated with a particular independent variable is determined by the variance structure of the variables in the equation. Independent variables with large discriminatory power (large differences between groups) have large weights and those with little discriminatory power have small weights.

The goal of discriminant analysis is the prediction of a categorical variable. The analyst must decide which variables would be expected to be associated with the probability of a person or object falling into one of two or more groups or categories. In a statistical sense, the problem of analyzing the nature of group differences involves finding a linear combination of independent variables (the discriminant function) that shows large differences in group means. Multiple discriminant analysis outperforms multiple regression analysis in some applications where they are both appropriate.
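For the two-group case, the idea of a linear combination that separates the group means can be sketched with Fisher's linear discriminant. This is one common formulation, not necessarily what a given discriminant analysis program computes. The respondents, the two predictors (age and income), and the midpoint cutoff (which assumes equal group sizes) are all hypothetical choices for illustration.

```python
def column_means(rows):
    """Mean of each predictor across the rows of one group."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def pooled_covariance(group_a, group_b):
    """Pooled within-group covariance matrix for two predictors."""
    pooled = [[0.0, 0.0], [0.0, 0.0]]
    for group in (group_a, group_b):
        means = column_means(group)
        for row in group:
            dev = [row[0] - means[0], row[1] - means[1]]
            for i in range(2):
                for j in range(2):
                    pooled[i][j] += dev[i] * dev[j]
    dof = len(group_a) + len(group_b) - 2
    return [[value / dof for value in line] for line in pooled]


def discriminant_weights(group_a, group_b):
    """Fisher's weights: inverse pooled covariance times the mean difference."""
    (s11, s12), (s21, s22) = pooled_covariance(group_a, group_b)
    det = s11 * s22 - s12 * s21
    mean_a, mean_b = column_means(group_a), column_means(group_b)
    d1, d2 = mean_a[0] - mean_b[0], mean_a[1] - mean_b[1]
    return [(s22 * d1 - s12 * d2) / det, (-s21 * d1 + s11 * d2) / det]


def z_score(weights, row):
    """Discriminant score Z = b1*X1 + b2*X2."""
    return sum(w * x for w, x in zip(weights, row))


# Hypothetical respondents: (age in years, income in thousands of dollars).
users = [(25, 60), (30, 72), (28, 65), (35, 80), (32, 70)]
nonusers = [(45, 40), (50, 38), (42, 45), (55, 35), (48, 42)]

weights = discriminant_weights(users, nonusers)
# Cutoff halfway between the two groups' mean Z scores (assumes equal group sizes).
cutoff = (z_score(weights, column_means(users))
          + z_score(weights, column_means(nonusers))) / 2


def classify(row):
    """Users have the higher mean Z score under this weighting."""
    return "user" if z_score(weights, row) > cutoff else "nonuser"


# Classification matrix: actual group (outer key) versus predicted group (inner key).
classification_matrix = {
    "user": {"user": 0, "nonuser": 0},
    "nonuser": {"user": 0, "nonuser": 0},
}
for actual, group in (("user", users), ("nonuser", nonusers)):
    for row in group:
        classification_matrix[actual][classify(row)] += 1
print(weights, classification_matrix)
```

The diagonal of the classification matrix counts respondents assigned to their actual group. In practice the model would be checked on respondents who were not used to estimate the weights.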

Applications of Multiple Discriminant Analysis

Discriminant analysis can be used to answer many questions in marketing research:

  • How are consumers who purchase various brands different from those who do not purchase those brands?
  • How do we target likely buyers for a new product from our database of existing customers in order to conduct the most effective prelaunch marketing campaign?
  • How do consumers who frequent one fast-food restaurant differ in demographic and lifestyle characteristics from those who frequent another fast-food restaurant?
  • How do consumers who have chosen either indemnity insurance, HMO coverage, or PPO coverage differ from one another in regard to healthcare use, perceptions, and attitudes?

Cluster Analysis

The term cluster analysis generally refers to statistical procedures used to identify objects or people that are similar in regard to certain variables or measurements. The purpose of cluster analysis is to classify objects or people into some number of mutually exclusive and exhaustive groups so that those within a group are as similar as possible to one another (this is true in general, but techniques such as fuzzy clustering compute probabilities of membership rather than assigning records uniquely to a single group).10 In other words, clusters should be homogeneous internally (within cluster) and heterogeneous externally (between clusters).

Procedures for Clustering

A number of different procedures (based on somewhat different mathematical and computer routines) are available for clustering people or objects. Some examples of clustering techniques include K-means, two-stage, nearest neighbor, decision trees, ensemble analysis, random forest, BIRCH, and self-organizing neural networks. However, the general approach underlying all of these procedures involves measuring the similarities among people or objects in regard to their values on the variables used for clustering.11 Similarities among the people or objects being clustered are normally determined on the basis of some type of distance measure.

This approach is best illustrated graphically. Suppose an analyst wants to group, or cluster, consumers on the basis of two variables: monthly frequency of eating out and monthly frequency of eating at fast-food restaurants. Observations on the two variables are plotted in a two-dimensional graph in Exhibit 18.2. Each dot indicates the position of one consumer in regard to the two variables. The distance between any pair of points is inversely related to how similar the corresponding individuals are when the two variables are considered together (the closer the dots, the more similar the individuals). In Exhibit 18.2, consumer X is more like consumer Y than like either Z or W.
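The distance idea can be sketched with straight-line (Euclidean) distance, one of several possible distance measures. The coordinates assigned to consumers X, Y, Z, and W below are hypothetical, since the exhibit's plotted values are not given in the text.

```python
from math import dist  # Euclidean distance, available in Python 3.8+

# Hypothetical positions: (monthly times eating out, monthly fast-food visits).
consumers = {"X": (4, 2), "Y": (5, 3), "Z": (14, 3), "W": (15, 12)}


def nearest(name):
    """Return the consumer closest to `name`, i.e., the most similar one."""
    others = (other for other in consumers if other != name)
    return min(others, key=lambda other: dist(consumers[name], consumers[other]))


for other in ("Y", "Z", "W"):
    print("X to", other, round(dist(consumers["X"], consumers[other]), 2))
print("Most similar to X:", nearest("X"))
```

The smaller the distance, the more alike the two consumers are on the pair of variables.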


EXHIBIT 18.2 Cluster Analysis Based on Two Variables

Inspection of Exhibit 18.2 suggests that three distinct clusters emerge on the basis of simultaneously considering frequency of eating out and frequency of eating at fast-food restaurants:

  • Cluster 1 includes people who neither eat out frequently nor eat at fast-food restaurants frequently.
  • Cluster 2 includes consumers who frequently eat out but seldom eat at fast-food restaurants.
  • Cluster 3 includes people who frequently eat out and also frequently eat at fast-food restaurants.

The fast-food company can see that its best targets are to be found among those who, in general, eat out frequently and eat at fast-food restaurants specifically. To provide more insight for the client, the analyst should develop demographic, psychographic, and behavioral profiles of consumers in cluster 3.

Clustering people according to how frequently and where they eat out is a way of identifying a particular consumer base. An upscale restaurant can see that its customers fall into cluster 2 and possibly cluster 3 in Exhibit 18.2.

As shown in Exhibit 18.2, clusters can be developed from scatterplots. However, this time-consuming, trial-and-error procedure becomes more tedious as the number of variables used to develop the clusters or the number of objects or persons being clustered increases. You can readily visualize a problem with two variables and fewer than 100 objects. Once the number of variables increases to three and the number of observations increases to 500 or more, visualization becomes impossible. Fortunately, computer algorithms are available to perform this more complex type of cluster analysis. The mechanics of these algorithms are complicated and beyond the scope of this discussion. The basic idea behind most of them is to start with some arbitrary cluster boundaries and modify the boundaries until a point is reached where the average interpoint distances within clusters are as small as possible relative to average distances between clusters.
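The start-arbitrary-then-refine idea can be sketched with a bare-bones version of K-means, one of the techniques named earlier. The consumer data and the three starting centers are hypothetical, and production software adds refinements (multiple starts, variable scaling, convergence diagnostics) that are omitted here.

```python
from math import dist


def k_means(points, initial_centers, max_iterations=100):
    """Assign points to the nearest center, then move centers, until stable."""
    centers = list(initial_centers)
    assignments = None
    for _ in range(max_iterations):
        # Step 1: each point joins the cluster whose center is closest.
        new_assignments = [
            min(range(len(centers)), key=lambda c: dist(point, centers[c]))
            for point in points
        ]
        if new_assignments == assignments:
            break  # no point changed cluster, so the solution is stable
        assignments = new_assignments
        # Step 2: move each center to the mean of its current members.
        for c in range(len(centers)):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return assignments, centers


# Hypothetical consumers: (monthly times eating out, monthly fast-food visits).
consumer_points = [
    (2, 1), (3, 2), (1, 1), (2, 2),          # seldom eat out
    (12, 2), (14, 1), (13, 3), (15, 2),      # eat out often, seldom fast food
    (13, 11), (15, 12), (14, 10), (12, 12),  # eat out often, often fast food
]
arbitrary_start = [(0, 0), (10, 0), (10, 10)]  # arbitrary starting centers

assignments, centers = k_means(consumer_points, arbitrary_start)
print(assignments)
print(centers)
```

With data this cleanly separated the procedure settles after a couple of passes; real data usually require more iterations and can end in different solutions depending on the starting centers.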

Additional discussion of using cluster analysis for market segmentation is provided in the accompanying Practicing Marketing Research feature.

Factor Analysis

The purpose of factor analysis is data simplification.12 The objective is to summarize the information contained in a large number of metric measures (e.g., rating scales) with a smaller number of summary measures, called factors. As with cluster analysis, there is no dependent variable.

Many phenomena of interest to marketing researchers are actually composites, or combinations, of a number of measures. These concepts are often measured by means of rating questions. For instance, in assessing consumer response to a new automobile, a general concept such as “luxury” might be measured by asking respondents to rate different cars on attributes such as “quiet ride,” “smooth ride,” or “plush carpeting.” The product designer wants to produce an automobile that is perceived as luxurious but knows that a variety of features probably contribute to this general perception. Each attribute rated should measure a slightly different facet of luxury. The set of measures should provide a better representation of the concept than a single global rating of “luxury.”

Several measures of a concept can be added together to develop a composite score or to compute an average score on the concept. Exhibit 18.3 shows data on six consumers who each rated an automobile on four characteristics. You can see that those respondents who gave higher ratings on “smooth ride” also tended to give higher ratings on “quiet ride.” A similar pattern is evident in the ratings of “acceleration” and “handling.” These four measures can be combined into two summary measures by averaging the pairs of ratings. The resulting summary measures might be called “luxury” and “performance” (see Exhibit 18.4).

EXHIBIT 18.3 Importance Ratings of Luxury Automobile Features

Respondent   Smooth Ride   Quiet Ride   Acceleration   Handling
Bob          5             4            2              1
Roy          4             3            2              1
Hank         4             3            3              2
Janet        5             5            2              2
Jane         4             3            2              1
Ann          5             5            3              2
Average      4.50          3.83         2.33           1.50

EXHIBIT 18.4 Average Ratings of Two Factors

Respondent   Luxury   Performance
Bob          4.5      1.5
Roy          3.5      1.5
Hank         3.5      2.5
Janet        5.0      2.0
Jane         3.5      1.5
Ann          5.0      2.5
Average      4.17     1.92
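The averaging that produces the two summary measures can be reproduced from the ratings in Exhibit 18.3. The sketch below simply averages each pair of ratings; the variable names are illustrative.

```python
# Ratings from Exhibit 18.3: (smooth ride, quiet ride, acceleration, handling).
ratings = {
    "Bob": (5, 4, 2, 1),
    "Roy": (4, 3, 2, 1),
    "Hank": (4, 3, 3, 2),
    "Janet": (5, 5, 2, 2),
    "Jane": (4, 3, 2, 1),
    "Ann": (5, 5, 3, 2),
}

# "Luxury" averages the two ride ratings; "performance" averages the other two.
composites = {
    name: ((smooth + quiet) / 2, (acceleration + handling) / 2)
    for name, (smooth, quiet, acceleration, handling) in ratings.items()
}

luxury_mean = sum(luxury for luxury, _ in composites.values()) / len(composites)
performance_mean = sum(perf for _, perf in composites.values()) / len(composites)

for name, (luxury, performance) in composites.items():
    print(name, luxury, performance)
print("Average", round(luxury_mean, 2), round(performance_mean, 2))
```

Each respondent's four ratings collapse into two scores, which is the same simplification factor analysis performs, except that factor analysis weights the measures rather than averaging them equally.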

Factor Scores

Factor analysis produces one or more factors, or composite variables, when applied to a number of variables. A factor, technically defined, is a linear combination of variables. It is a weighted summary score of a set of related variables, similar to the composite derived by averaging the measures. However, in factor analysis, each measure is first weighted according to how much it contributes to the variation of each factor.

In factor analysis, a factor score is calculated on each factor for each subject in the data set. For example, in a factor analysis with two factors, the following equations might be used to determine factor scores:

With these formulas, two factor scores can be calculated for each respondent by substituting the ratings she or he gave on variables A1 through A4 into each equation. The coefficients in the equations are the factor scoring coefficients to be applied to each respondent’s ratings. For example, Bob’s factor scores (see Exhibit 18.4) are computed as follows:

In the first equation, the factor scoring coefficients, or weights, for A1 and A2 (.40 and .30) are large, whereas the weights for A3 and A4 are small. The small weights on A3 and A4 indicate that these variables contribute little to score variations on factor 1 (F1). Regardless of the ratings a respondent gives to A3 and A4, they have little effect on his or her score on F1. However, variables A3 and A4 make a large contribution to the second factor score (F2), whereas A1 and A2 have little effect. These two equations show that variables A1 and A2 are relatively independent of A3 and A4 because each variable takes on large values in only one scoring equation.
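The weighted-sum idea behind factor scores can be sketched as follows. Only the weights of .40 and .30 for A1 and A2 on factor 1 come from the discussion above; every other weight in the sketch is a hypothetical placeholder chosen to follow the described pattern (small weights for A3 and A4 on factor 1, large weights for A3 and A4 on factor 2), so the resulting scores are illustrative only.

```python
# Factor scoring coefficients for variables A1..A4.
# Only .40 and .30 (A1 and A2 on factor 1) are taken from the text;
# all other values are HYPOTHETICAL placeholders for illustration.
factor_1_weights = (0.40, 0.30, 0.02, 0.05)
factor_2_weights = (0.01, 0.04, 0.45, 0.37)


def factor_score(weights, respondent_ratings):
    """Weighted sum of a respondent's ratings on A1 through A4."""
    return sum(w * rating for w, rating in zip(weights, respondent_ratings))


bob = (5, 4, 2, 1)  # Bob's ratings on A1..A4, from Exhibit 18.3
f1 = factor_score(factor_1_weights, bob)
f2 = factor_score(factor_2_weights, bob)
print(round(f1, 2), round(f2, 2))

# Changing A3 and A4 barely moves F1 because their F1 weights are small.
bob_high_performance = (5, 4, 5, 5)
f1_shifted = factor_score(factor_1_weights, bob_high_performance)
print(round(f1_shifted - f1, 2))
```

The last two lines illustrate the point made above: large changes in the ratings on A3 and A4 produce only a small change in the score on factor 1.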

The relative sizes of the scoring coefficients are also of interest. Variable A1 (with a weight of .40) is a more important contributor to factor 1 variation than is A2 (with a smaller weight of .30). This finding may be very important to the product designer when evaluating the implications of various design changes. For example, the product manager might want to improve the perceived luxury of the car through product redesign or advertising. The product manager may know, based on other research, that a certain expenditure on redesign will result in an improvement of the average rating on “smooth ride” from 4.3 to 4.8. This research may also show that the same expenditure will produce a half-point improvement in ratings on “quiet ride.” The factor analysis shows that perceived luxury will be enhanced to a greater extent by increasing ratings on “smooth ride” than by increasing ratings on “quiet ride” by the same amount.

Factor Loadings

The nature of the factors derived can be determined by examining the factor loadings. Using the scoring equations presented earlier, a pair of factor scores (F1 and F2) is calculated for each respondent. Factor loadings are determined by calculating the correlation (from −1 to +1) between each factor score (F1 and F2) and each of the original ratings variables. Each correlation coefficient represents the loading of the associated variable on the particular factor. If A1 is closely associated with factor 1, the loading or correlation will be high, as shown for the sample problem in Exhibit 18.5. Because the loadings are correlation coefficients, values near −1 or +1 indicate a close negative or positive association, respectively. Variables A1 and A2 are closely associated (highly correlated) with scores on factor 1, and variables A3 and A4 are closely associated with scores on factor 2.

EXHIBIT 18.5 Factor Loadings for Two Factors

            Correlation with
Variable    Factor 1    Factor 2
A1          .85         .10
A2          .76         .06
A3          .06         .89
A4          .04         .79

Stated another way, variables A1 and A2 have high loadings on factor 1 and serve to define the factor; variables A3 and A4 have high loadings on and define factor 2.
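The notion of a loading as a correlation can be sketched with the small data set in Exhibit 18.3. The sketch uses the simple pair averages as stand-ins for factor scores, which is an assumption made for illustration; real factor scores come from the factor analysis itself, so these correlations will not match Exhibit 18.5.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Ratings from Exhibit 18.3, in respondent order: Bob, Roy, Hank, Janet, Jane, Ann.
ratings = {
    "smooth ride": [5, 4, 4, 5, 4, 5],
    "quiet ride": [4, 3, 3, 5, 3, 5],
    "acceleration": [2, 2, 3, 2, 2, 3],
    "handling": [1, 1, 2, 2, 1, 2],
}

# Stand-in factor scores: simple pair averages, NOT real factor-analysis output.
luxury = [(s + q) / 2 for s, q in zip(ratings["smooth ride"], ratings["quiet ride"])]
performance = [(a + h) / 2 for a, h in zip(ratings["acceleration"], ratings["handling"])]

# "Loading" = correlation of each original variable with each composite score.
loadings = {
    name: (pearson(values, luxury), pearson(values, performance))
    for name, values in ratings.items()
}
for name, (on_luxury, on_performance) in loadings.items():
    print(name, round(on_luxury, 2), round(on_performance, 2))
```

The pattern, rather than the exact values, is what matters: each variable correlates strongly with one composite and weakly with the other.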

Naming Factors

Once each factor’s defining variables have been identified, the next step is to name the factors. This is a somewhat subjective step, combining intuition and knowledge of the variables with an inspection of the variables that have high loadings on each factor. Usually, a certain consistency exists among the variables that load highly on a given factor. For instance, it is not surprising to see that the ratings on “smooth ride” and “quiet ride” both load on the same factor. Although we have chosen to name this factor “luxury,” another analyst, looking at the same result, might decide to name the factor “prestige.”

Number of Factors to Retain

In factor analysis, the analyst is confronted with a decision regarding how many factors to retain. The final result can include from one factor to as many factors as there are variables. The decision is often made by looking at the percentage of the variation in the original data that is explained by each factor.

There are many different decision rules for choosing the number of factors to retain. Probably the most appropriate decision rule is to stop factoring when additional factors no longer make sense. The first factors extracted are likely to exhibit logical consistency; later factors are usually harder to interpret, for they are more likely to contain a large amount of random variation.

Conjoint Analysis

Conjoint analysis is a popular multivariate procedure used by marketers to help determine what features a new product or service should include and how it should be priced. It can be argued that conjoint analysis has become popular because it is a more powerful, more flexible, and often less expensive way to address these important issues than is the traditional concept testing approach.13

Conjoint analysis is not a completely standardized procedure.14 A typical conjoint analysis application involves presenting various product or service combinations in a carefully controlled exercise, then estimating the relative value of each feature tested based on how people reacted to the different combinations presented. “Reactions” may be captured as rankings, ratings, likelihood to purchase, or by some other means, depending on the approach being used. The type of conjoint approach (e.g., ratings-based, discrete choice, graded pairs, dual choice, full profile, partial profile, adaptive choice) affects how the exercise is presented and what statistical procedures are most appropriate for analyzing the results. Fortunately, conjoint analysis is not difficult to understand conceptually, as we demonstrate in the following example concerning the attributes of golf balls.

Example of Conjoint Analysis

Put yourself in the position of a product manager for Titleist, a major manufacturer of golf balls. From focus groups recently conducted, past research studies of various types, and your own personal experience as a golfer, you know that golfers tend to evaluate golf balls in terms of three important features or attributes: average driving distance, average ball life, and price.

You also recognize a range of feasible possibilities for each of these features or attributes, as follows:

  1. Average driving distance
    • 10 yards more than the golfer’s average
    • Same as the golfer’s average
    • 10 yards less than the golfer’s average
  2. Average ball life
    • 54 holes
    • 36 holes
    • 18 holes
  3. Price per ball
    • $2.00
    • $2.50
    • $3.00

From the perspective of potential purchasers, these attributes have a natural order (i.e., longer distance and longer ball life are always preferred over shorter options), so we can easily identify the ideal configuration. This is not always the case when dealing with attributes such as brand, physical appearance, or color. For this example, the consumer’s ideal golf ball would have the following characteristics:

  • Average driving distance—10 yards above average
  • Average ball life—54 holes
  • Price—$2.00

From the manufacturer’s perspective, which is based on manufacturing cost, the ideal golf ball would probably have these characteristics:

  • Average driving distance—10 yards below average
  • Average ball life—18 holes
  • Price—$3.00

Photo illustration of a golf ball near the hole. A golfer, arms raised and holding a golf club in his right hand, is seen in the background.

Conjoint analysis could be used by a manufacturer of golf balls to determine the relative importance of these three features of a golf ball and to see which ball meets the most needs of both consumer and manufacturer.

This golf ball profile is based on the fact that it costs less to produce a ball that travels a shorter distance and has a shorter life. The company confronts the eternal marketing dilemma: the company would sell a lot of golf balls, but would go broke if it produced and sold the ideal ball from the golfer’s perspective. However, the company would sell very few balls if it produced and sold the ideal ball from the manufacturer’s perspective. As always, the “best” golf ball from a business perspective lies somewhere between the two extremes.

A traditional approach to this problem might produce information of the type displayed in Exhibit 18.6. As you can see, this information does not provide new insights regarding which ball should be produced. The preferred driving distance is 10 yards above average and the preferred average ball life is 54 holes. These results are obvious without any additional research.

EXHIBIT 18.6 Traditional Nonconjoint Rankings of Distance and Ball Life Attributes

Average Driving Distance Average Ball Life
Rank Level Rank Level
1 275 yards 1 54 holes
2 250 yards 2 36 holes
3 225 yards 3 18 holes

Considering Features Conjointly

In conjoint analysis, rather than having respondents evaluate features individually, the analyst asks them to evaluate features conjointly or in combination so that advantages for one attribute can only be chosen at the expense of another attribute. The results of asking two different golfers to rank different combinations of “average driving distance” and “average ball life” conjointly are shown in Exhibits 18.7 and 18.8.

EXHIBIT 18.7 Conjoint Rankings of Combinations of Distance and Ball Life for Golfer 1

Ball Life
Distance 54 holes 36 holes 18 holes
275 yards 1 2 4
250 yards 3 5 7
225 yards 6 8 9

EXHIBIT 18.8 Conjoint Rankings of Combinations of Distance and Ball Life for Golfer 2

Ball Life
Distance 54 holes 36 holes 18 holes
275 yards 1 3 6
250 yards 2 5 8
225 yards 4 7 9

As expected, both golfers agree on the most and least preferred balls. However, analysis of their second through eighth rankings makes it clear that the first golfer is willing to trade off ball life for distance (accept a shorter ball life for longer distance), while the second golfer is willing to trade off distance for longer ball life (accept shorter distance for a longer ball life).
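The contrast between the two golfers can be checked directly from the rankings. The short sketch below is only an illustration: the dictionary layout and the function name `favored_attribute` are assumptions made for this example, while the ranks are copied from Exhibits 18.7 and 18.8. It compares how each golfer ranks the long-distance, short-life ball against the short-distance, long-life ball.

```python
# Ranks from Exhibits 18.7 and 18.8 (1 = most preferred).
# Keys are (distance in yards, ball life in holes).
golfer_1 = {
    (275, 54): 1, (275, 36): 2, (275, 18): 4,
    (250, 54): 3, (250, 36): 5, (250, 18): 7,
    (225, 54): 6, (225, 36): 8, (225, 18): 9,
}
golfer_2 = {
    (275, 54): 1, (275, 36): 3, (275, 18): 6,
    (250, 54): 2, (250, 36): 5, (250, 18): 8,
    (225, 54): 4, (225, 36): 7, (225, 18): 9,
}


def favored_attribute(ranks):
    """Return the attribute a golfer protects when forced to give one up.

    Compares the ball with the best distance but worst life against the
    ball with the worst distance but best life (hypothetical helper).
    """
    long_distance_ball = ranks[(275, 18)]
    long_life_ball = ranks[(225, 54)]
    # A lower rank number means the ball is preferred.
    return "distance" if long_distance_ball < long_life_ball else "ball life"


print(favored_attribute(golfer_1))  # distance
print(favored_attribute(golfer_2))  # ball life
```

A single pairwise comparison is a crude summary; the utilities estimated in the next section use all nine rankings.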

This type of information is the essence of the special insight offered by conjoint analysis. The technique permits marketers to see which product attribute or feature potential customers are willing to trade off (accept less of) to obtain more of another attribute or feature. People make these kinds of purchasing decisions every day (e.g., they may choose to pay a higher price for a product at a local market for the convenience of shopping there).

Estimating Utilities

The next step is to calculate a set of values, or utilities, for the three levels of price, the three levels of driving distance, and the three levels of ball life in such a way that, when they are combined in a particular mix of price, ball life, and driving distance, they predict each golfer’s rank order for that combination. Estimated utilities for golfer 1 are shown in Exhibit 18.9. As you can readily see, this set of numbers perfectly predicts the original rankings. The relationship among these numbers or utilities is fixed, though there is some arbitrariness in their magnitude or scale. In other words, the utilities shown in Exhibit 18.9 can be multiplied or divided by any positive constant and the same relative results will be obtained. Utilities for this simple example can be computed using ordinary least squares regression, but the exact procedures for estimating utilities of more complex exercises are beyond the scope of this discussion. They are normally calculated by using procedures related to regression, analysis of variance, linear programming, logit, or hierarchical Bayes analysis.

EXHIBIT 18.9 Ranks (in parentheses) and Combined Metric Utilities for Golfer 1—Distance and Ball Life

Ball Life
Distance 54 holes 36 holes 18 holes
275 yards (1) 150 (2) 125 (4) 100
250 yards (3) 110 (5) 85 (7) 60
225 yards (6) 50 (8) 25 (9) 0
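To make the estimation step less abstract, the following sketch derives one set of utilities for golfer 1 from the rankings in Exhibit 18.7 and checks that they reproduce the original rank order. It rests on stated assumptions: ranks are reversed into preference scores (10 minus the rank), and each level's utility is taken as the difference in average score from the least attractive level, which for a complete, balanced grid such as this one gives the same main-effect estimates as an ordinary least squares regression on dummy-coded levels. The variable names are illustrative. The resulting numbers are on a different scale than those in Exhibit 18.9, yet they yield the same ranking.

```python
distances = [275, 250, 225]   # yards
lives = [54, 36, 18]          # holes

# Golfer 1's ranks from Exhibit 18.7; rows follow `distances`, columns follow `lives`.
ranks = [
    [1, 2, 4],
    [3, 5, 7],
    [6, 8, 9],
]

# Reverse the ranks so that a larger score means a more preferred ball.
scores = [[10 - r for r in row] for row in ranks]


def mean(values):
    return sum(values) / len(values)


row_means = [mean(row) for row in scores]
col_means = [mean([row[j] for row in scores]) for j in range(len(lives))]

# Utilities measured relative to the least attractive level of each attribute.
distance_utility = {d: m - row_means[-1] for d, m in zip(distances, row_means)}
life_utility = {life: m - col_means[-1] for life, m in zip(lives, col_means)}

# Add the two utilities for every ball, then convert the totals back to ranks.
predicted = {
    (d, life): distance_utility[d] + life_utility[life]
    for d in distances
    for life in lives
}
ordered = sorted(predicted, key=predicted.get, reverse=True)
predicted_ranks = {ball: position + 1 for position, ball in enumerate(ordered)}

original_ranks = {
    (d, life): ranks[i][j]
    for i, d in enumerate(distances)
    for j, life in enumerate(lives)
}

print(predicted_ranks == original_ranks)  # True
```

The spread between the best and worst level is larger for distance than for ball life, which agrees with golfer 1's willingness to give up ball life for distance.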

The trade-offs that golfer 1 is willing to make between “ball life” and “price” are shown in Exhibit 18.10. This information can be used to estimate a set of utilities for “price” that can be added to those for “ball life” to predict the rankings for golfer 1, as shown in Exhibit 18.11.

EXHIBIT 18.10 Conjoint Rankings of Combinations of Price and Ball Life for Golfer 1

Ball Life
Price 54 holes 36 holes 18 holes
$2.00 1 4 7
$2.50 2 5 8
$3.00 3 6 9

EXHIBIT 18.11 Ranks (in parentheses) and Combined Metric Utilities for Golfer 1—Price and Ball Life

Ball Life
Price 54 holes 36 holes 18 holes
$2.00 (1) 70 (4) 45 (7) 20
$2.50 (2) 55 (5) 30 (8) 5
$3.00 (3) 50 (6) 25 (9) 0

This step produces a complete set of utilities for all levels of the three features or attributes that successfully capture golfer 1’s trade-offs. These utilities are shown in Exhibit 18.12.

EXHIBIT 18.12 Complete Set of Estimated Utilities for Golfer 1

Distance Ball Life Price
Level Utility Level Utility Level Utility
275 yards 100 54 holes 50 $2.00 20
250 yards  60 36 holes 25 $2.50  5
225 yards  0 18 holes  0 $3.00  0

Simulating Buyer Choice

For various reasons, the firm might be in a position to produce only 2 of the 27 golf balls that are possible with each of the three levels of the three attributes. The possibilities are shown in Exhibit 18.13. If the calculated utilities for golfer 1 are applied to the two golf balls the firm is able to make, then the results are the total utilities shown in Exhibit 18.14. These results indicate that golfer 1 will prefer the ball with the longer life over the one with the greater distance because it has a higher total utility. The analyst need only repeat this process for a representative sample of golfers to estimate potential market shares for the two balls. In addition, the analysis can be extended to cover other golf ball combinations.

EXHIBIT 18.13 Ball Profiles for Simulation

Attribute Distance Ball Long-Life Ball
Distance  275  250
Life   18   54
Price $2.50 $3.00

EXHIBIT 18.14 Estimated Total Utilities for the Two Sample Profiles

Attribute Distance Ball Long-Life Ball
Level Utility Level Utility
Distance 275 100 250  60
Life 18  0 54  50
Price $2.50  5 $3.00  0
Total utility 105 110
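The arithmetic behind Exhibit 18.14 can be sketched in a few lines. The example below is a minimal illustration, not a full market simulator: the function names are assumptions, the utilities come from Exhibit 18.12, the profiles come from Exhibit 18.13, and each respondent is assumed to choose the ball with the highest total utility (a "first choice" rule).

```python
# Golfer 1's utilities from Exhibit 18.12.
golfer_1 = {
    "distance": {275: 100, 250: 60, 225: 0},
    "life": {54: 50, 36: 25, 18: 0},
    "price": {"$2.00": 20, "$2.50": 5, "$3.00": 0},
}

# Ball profiles from Exhibit 18.13.
profiles = {
    "Distance Ball": {"distance": 275, "life": 18, "price": "$2.50"},
    "Long-Life Ball": {"distance": 250, "life": 54, "price": "$3.00"},
}


def total_utility(profile, utilities):
    """Sum the utility of each attribute level in the profile."""
    return sum(utilities[attribute][level] for attribute, level in profile.items())


def first_choice(profiles, utilities):
    """Name of the profile with the highest total utility for one respondent."""
    return max(profiles, key=lambda name: total_utility(profiles[name], utilities))


def simulated_shares(profiles, respondents):
    """Share of respondents whose first choice is each profile."""
    counts = {name: 0 for name in profiles}
    for utilities in respondents:
        counts[first_choice(profiles, utilities)] += 1
    return {name: count / len(respondents) for name, count in counts.items()}


for name, profile in profiles.items():
    print(name, total_utility(profile, golfer_1))
# Distance Ball 105
# Long-Life Ball 110

print(first_choice(profiles, golfer_1))  # Long-Life Ball
```

In an actual study, `simulated_shares` would be called with one utilities dictionary per sampled golfer; only golfer 1's utilities are available in this example.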

The three steps discussed here—collecting trade-off data, using the data to estimate buyer preference structures, and predicting choice—are the basis of any conjoint analysis application. Although the trade-off matrix approach is simple, useful for explaining conjoint analysis, and effective for problems with small numbers of attributes, it is seldom used in the real world.

One of the most common approaches to conducting conjoint analysis is the use of a discrete choice or choice-based conjoint exercise. Two or more products are shown side-by-side with details provided on each key attribute being tested. Respondents are asked to select a single product from among the options shown. The exercise is repeated multiple times in order to present a wide variety of product designs, but no individual sees more than a fraction of the sometimes thousands or even millions of possible product combinations.
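To see why no respondent can evaluate every combination, it helps to enumerate them. The sketch below lists all profiles for the golf ball example and draws a handful of two-option choice tasks. It is a simplified illustration under stated assumptions: the function names are hypothetical, and the tasks are drawn at random, whereas commercial studies typically rely on carefully balanced experimental designs.

```python
import itertools
import random

# Attribute levels from the golf ball example.
attributes = {
    "distance": ["10 yards more", "same as average", "10 yards less"],
    "life": ["54 holes", "36 holes", "18 holes"],
    "price": ["$2.00", "$2.50", "$3.00"],
}


def all_profiles(attributes):
    """Every possible combination of one level per attribute."""
    names = list(attributes)
    return [
        dict(zip(names, combination))
        for combination in itertools.product(*attributes.values())
    ]


def choice_tasks(profiles, n_tasks, options_per_task=2, seed=0):
    """Draw n_tasks screens, each showing distinct profiles side by side."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [rng.sample(profiles, options_per_task) for _ in range(n_tasks)]


profiles = all_profiles(attributes)
print(len(profiles))  # 27

tasks = choice_tasks(profiles, n_tasks=5)
for left, right in tasks:
    print(left, "vs.", right)
```

With three attributes at three levels each there are only 27 profiles, but the count multiplies with every attribute added, which is how realistic studies reach thousands or millions of combinations.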

Computer-driven exercises might further adapt the exercise to each respondent, based on prior answers and personal demographics, to spend more time on the factors that seem to be driving product choice. Menu-based conjoint analysis can replicate the choices consumers make when choosing between “value meals” and à la carte items from a restaurant menu. Other computer-driven exercises allow respondents to design their own product with appropriate design constraints and pricing factored into each option chosen, much the way consumers configure their own Dell computer online or select upgrades for a new car. These and many other approaches can be used to capture the information needed for estimating respondent utilities when designed, executed, and analyzed properly.

As suggested earlier, there is much more to conjoint analysis than has been discussed in this section. However, if you understand this simple example, then you understand the basic concepts that underlie conjoint analysis.

Limitations of Conjoint Analysis

Like many research techniques, conjoint analysis suffers from a certain degree of artificiality. Respondents may be more deliberate in their choice processes in this context than in a real situation. The survey may provide more product information than respondents would get in a real market situation. If key attributes or popular options within key attributes are excluded from the study, demand estimates could be severely impacted. Testing too many attributes or features will diminish the amount of attention that can be given to each individual’s most desired features, reducing measurement precision. The presentation of information (e.g., the order in which attributes are listed; whether pictures are used for some attributes, but not others; how price is displayed; etc.) can greatly impact what features respondents focus on and, ultimately, how they make their decisions. It is important to either be as neutral as possible in the presentation of a conjoint exercise or else try to replicate how the product or service is actually evaluated and compared in the marketplace in order to avoid biasing results.

Finally, it is important to remember that the advertising and promotion of any new product or service can lead to consumer perceptions that are very different from those created via factual descriptions used in a survey. Also keep in mind that consumers can’t purchase something they don’t know exists, so conjoint analysis operates under the assumptions of full awareness, unrestricted access, and complete knowledge of all product features.

Big Data and Hadoop

Big Data is the term used to describe very large and complex data sets. Companies have been collecting transaction-based information since the beginning of the computer age. However, the sheer volume of information has grown exponentially in recent years and the types of information now being generated do not easily fit into traditional hierarchical database structures. Big Data describes the new data capture and management approaches that are designed to handle the higher volume, faster acquisition rates, and broader array of data types. Most of the tools for Big Data are still evolving, and individuals with the skills to capitalize on them are in short supply.

Hadoop is an open-source platform distributed by Apache for managing large amounts of information across hundreds or thousands of networked computers. Each computer works independently on a small portion of the total dataset so that a task such as clustering several billion records can be handled in a fraction of the time taken for more conventional database structures. There are numerous backup copies of each data chunk, so that any failure can be immediately picked up by another computer with access to the same information. Google and Yahoo have had a hand in developing the platform and underlying technology for Hadoop as they sought ways to store and access the vast array of search information they were collecting.
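The idea of having each computer work on its own portion of the data and then combining the partial results can be illustrated without Hadoop itself. The sketch below is plain Python and does not use any Hadoop interface: the function names and the purchase records are hypothetical, and the "workers" run one after another on a single machine rather than in parallel across a network.

```python
from collections import Counter
from functools import reduce

# Hypothetical transaction records: (customer, item purchased).
records = [
    ("ann", "golf balls"), ("bob", "tees"), ("ann", "glove"),
    ("cara", "golf balls"), ("bob", "golf balls"), ("ann", "tees"),
]


def split_into_chunks(records, chunk_size):
    """Divide the data set into small portions, one per worker."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def count_purchases(chunk):
    """Work done independently on one chunk: purchases per customer."""
    return Counter(customer for customer, _item in chunk)


def combine(left, right):
    """Merge two partial results into one."""
    return left + right


chunks = split_into_chunks(records, chunk_size=2)
partial_counts = [count_purchases(chunk) for chunk in chunks]
totals = reduce(combine, partial_counts, Counter())

print(dict(totals))  # {'ann': 3, 'bob': 2, 'cara': 1}
```

Because each chunk is processed without reference to the others, the middle step could be handed to as many machines as there are chunks, which is the property that lets a distributed platform scale.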

Today, many companies that deal with Big Data—such as Amazon, eBay, Facebook, Google, IBM, LinkedIn, Spotify, Twitter, and Yahoo—use Hadoop to manage their information.

Predictive Analytics15

Predictive analytics describes a wide array of tools and techniques that are used to extract and analyze information from data sets. Statistics, machine learning, database management, and computer programming all play a part in identifying patterns and transforming data into insights. It is an increasingly important set of tools for businesses to transform the exponentially growing quantities of digital data into business intelligence as firms seek informational advantages to improve efficiency and effectiveness. Predictive analytics can apply to Big Data or traditional databases, observational data like loyalty card usage, Internet sources like social media text, and Web tracking data or primary survey research results. Fraud detection, trend analyses, targeted direct marketing, predicting heavy users, and identifying likely buyers are just some of the applications for predictive analytics.

Using Predictive Analytics

Acquiring a Data Set

Before applying predictive analytics, an organization must assemble a target data set relevant to the problem of interest. Predictive analytics can only uncover patterns and relationships that exist in the available data. Typically, the data set must be large enough to include all the patterns and combinations that are likely to be found in the real world.

In the past, assembling such large data sets was very costly and time consuming. Today, most companies capture terabytes of information on their customers in the normal course of business, and many social media companies provide access to massive amounts of data in real time for anyone to tap into. In addition, third-party vendors provide a wide variety of data elements that can be purchased for just about any household or company in the United States.

Pre-processing

Once assembled, the data set must be cleaned in a process where observations that contain excessive noise, errors, and missing data are edited or excluded. Data transformations may be used to smooth out irregular distributions and minimize extreme values. Imputing missing values from comparable records and building predictive models to fill in missing information are often used. Linking multiple data sets is also part of pre-processing available data.
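A few of these cleaning steps can be sketched with the Python standard library. The example is illustrative only: the spending figures are hypothetical, the function names are assumptions, and filling gaps with the median of the observed values is a simple stand-in for the more refined imputation methods mentioned above.

```python
import math
import statistics

# Hypothetical annual spending per customer; None marks a missing value.
spend = [120.0, None, 95.0, 15000.0, 110.0]


def impute_missing(values):
    """Replace missing values with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = statistics.median(observed)
    return [fill if v is None else v for v in values]


def cap_extremes(values, lower, upper):
    """Pull extreme values back to chosen limits."""
    return [min(max(v, lower), upper) for v in values]


def log_transform(values):
    """Compress a skewed distribution; log1p(x) is log(1 + x)."""
    return [math.log1p(v) for v in values]


imputed = impute_missing(spend)
capped = cap_extremes(imputed, lower=0.0, upper=1000.0)
cleaned = log_transform(capped)

print(imputed)  # [120.0, 115.0, 95.0, 15000.0, 110.0]
print(capped)   # [120.0, 115.0, 95.0, 1000.0, 110.0]
```

The limits passed to `cap_extremes` are a judgment call that depends on the variable being cleaned; the values used here are arbitrary.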

Modeling

A variety of techniques may be employed as part of the modeling process:

  • Clustering.  This is the task of discovering groups and structures in the data that are similar on selected sets of variables. These are groupings that are not obvious and are not based on a single set of variables or small number of items. Clustering normally requires evaluating numerous solutions before finding the best option. Cluster analysis, one of the techniques discussed earlier in this chapter, is commonly used to reveal hidden groupings or identify unexpected associations.
  • Classification.  Readily available information such as demographics and geography might be used to classify individuals on key behaviors such as purchase frequency or product preference. Proprietary information such as online ads viewed or previous products purchased can be very effective at predicting future behaviors whenever such information is available. Customer segments identified through clustering might also be modeled in order to predict the segment to which new customers and prospects belong. Successful models can be applied to new customers and records that could not be processed directly due to missing data.
  • Estimation.  Measures such as risk scores, fraud likelihood, retention rates, lifetime value, and likelihood to purchase may be calculated for individuals or groups. These measures can be used to predict future outcomes based on limited present-day data. They can also be used to monitor individuals or groups in order to detect changes in behavior that allow the organization to react before customers or revenues are lost.
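The first two tasks can be illustrated together with a small K-means sketch: customers are grouped into segments, and a new customer is then assigned to the nearest segment. This is a bare-bones version under stated assumptions: the customer data and starting centroids are hypothetical, the function names are illustrative, and the variables are left unscaled, whereas real applications usually standardize the inputs and compare several solutions.

```python
# Hypothetical customers: (purchases per year, average order value in dollars).
customers = [(1, 20), (2, 25), (1, 22), (10, 200), (12, 210), (11, 190)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Index of the nearest centroid for each point."""
    return [
        min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
        for p in points
    ]


def update(points, labels, centroids):
    """Move each centroid to the mean of its members (unchanged if it has none)."""
    moved = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        if members:
            moved.append(tuple(sum(col) / len(members) for col in zip(*members)))
        else:
            moved.append(old)
    return moved


def kmeans(points, centroids, iterations=10):
    for _ in range(iterations):
        labels = assign(points, centroids)
        centroids = update(points, labels, centroids)
    return centroids, assign(points, centroids)


# Start from two of the observed customers (an arbitrary choice for this sketch).
centroids, segments = kmeans(customers, [customers[0], customers[3]])
print(segments)  # [0, 0, 0, 1, 1, 1]

# Classification: place a new customer in the nearest existing segment.
new_customer = (9, 180)
print(assign([new_customer], centroids)[0])  # 1
```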

Validating Results

A final step of knowledge discovery is to verify that the patterns produced by the predictive modeling algorithms also hold in a wider data set. Not every pattern and relationship identified in previous steps turns out to be valid in the real world. In the evaluation process, the patterns or models identified in the target data set are applied to a test data set that was not used to develop the predictive modeling algorithm. The resulting output is compared to the desired output.

For example, an algorithm developed to predict those most likely to respond to a mail offer would be developed or trained on certain past mail offers. Once trained, the algorithm would be applied to other mailings not used in its development or to actual results from a recently completed mailing. If the predictive model does not meet the desired accuracy standards, then it is necessary to go through the previous steps again in order to develop an algorithm or model with the desired level of accuracy.
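The mail offer example can be sketched as a train-then-test exercise. Everything in the block below is hypothetical: the response records, the deliberately simple rule (predict a response when prior purchases reach a threshold), the function names, and the 75 percent accuracy standard are all assumptions chosen to keep the illustration short.

```python
# Hypothetical records: (prior purchases, responded to the mail offer?).
training = [
    (0, False), (1, False), (2, False), (3, True),
    (4, True), (5, True), (1, True), (4, False),
]
holdout = [(0, False), (2, False), (3, True), (5, True), (2, True)]


def predict(threshold, prior_purchases):
    """Predict a response when prior purchases reach the threshold."""
    return prior_purchases >= threshold


def accuracy(threshold, records):
    """Share of records where the prediction matches the actual response."""
    hits = sum(predict(threshold, x) == responded for x, responded in records)
    return hits / len(records)


def train(records):
    """Pick the threshold that is most accurate on the training records."""
    candidates = sorted({x for x, _ in records})
    return max(candidates, key=lambda t: accuracy(t, records))


threshold = train(training)
print(threshold)                      # 3
print(accuracy(threshold, training))  # 0.75
print(accuracy(threshold, holdout))   # 0.8

# Compare against an assumed accuracy standard before putting the model to use.
required_accuracy = 0.75
print(accuracy(threshold, holdout) >= required_accuracy)  # True
```

The figure that matters is the accuracy on `holdout`, because those records played no part in choosing the threshold.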

Applying the Results

Once the models and calculations are in place and have been validated, they are applied to existing and future customer records to improve the efficiency and effectiveness of marketing efforts. For example, specific information captured from a new sales inquiry can be used to classify an individual into the correct market segment. Based on their market segment, the most appropriate product offering can be prepared and the marketing messages can be adjusted to most resonate with that individual. Purchasing prospect lists with specific information appended to each record allows an organization to avoid wasting marketing dollars on unlikely purchasers (based on applied predictive models) and focus resources on the most likely buyers and those with the greatest potential lifetime value.

The Practicing Marketing Research feature below provides an example of how predictive modeling is used by a major retailer and also touches on the privacy issues discussed in the next section.

Duhigg suggests that Target’s gangbusters revenue growth in the 2000s—$44 billion in 2002, when Pole was hired, to $67 billion in 2010—is attributable to Pole’s helping the retail giant corner the baby-on-board market, citing company president Gregg Steinhafel boasting to investors about the company’s “heightened focus on items and categories that appeal to specific guest segments such as mom and baby.”

Privacy Concerns and Ethics

Most believe that predictive modeling is ethically neutral. However, the ways in which data are collected for predictive modeling and the types of data acquired can raise questions regarding privacy, legality, and ethics. For example, monitoring telephone calls and Internet usage for national security or law enforcement purposes has raised privacy concerns.

Commercial Predictive Modeling Software and Applications

Database providers such as Oracle and Microsoft provide tools optimized for their platforms. Hadoop, a popular Big Data platform, has a variety of open-source and commercial tools available. There is an increasing number of highly integrated packages for predictive modeling, including:

  • Angoss KnowledgeSTUDIO
  • Clarabridge
  • RapidMiner
  • SAS Enterprise Miner
  • SPSS Modeler
  • STATISTICA Data Miner

Summary

Multivariate analysis refers to the simultaneous analysis of multiple measurements on each individual or object being studied. Some of the more popular multivariate techniques include multiple regression analysis, multiple discriminant analysis, cluster analysis, factor analysis, and conjoint analysis.

Multiple regression analysis enables the researcher to predict the magnitude of a dependent variable based on the levels of more than one independent variable. Multiple regression fits a plane to observations in a multidimensional space. One statistic that results from multiple regression analysis is called the coefficient of determination, or R². The value of this statistic ranges from 0 to 1. It provides a measure of the percentage of the variation in the dependent variable that is explained by variation in the independent variables. The b values, or regression coefficients, indicate the effect of the individual independent variables on the dependent variable.

Whereas multiple regression analysis requires that the dependent variable be metric, multiple discriminant analysis uses a dependent variable that is nominal or categorical in nature. Discriminant analysis can be used to determine if statistically significant differences exist between the average discriminant score profiles of two (or more) groups. The technique can also be used to establish a model for classifying individuals or objects into groups on the basis of their scores on the independent variables. Finally, discriminant analysis can be used to determine how much of the difference in the average score profiles of the groups is accounted for by each independent variable. The discriminant score, called a Z score, is derived for each individual or object by means of the discriminant equation.

Cluster analysis enables a researcher to identify subgroups of individuals or objects that are homogeneous within the subgroup, yet different from other subgroups. Cluster analysis requires that all independent variables be metric, but there is no specification of a dependent variable. Cluster analysis is an excellent means for operationalizing the concept of market segmentation.

The purpose of factor analysis is to simplify massive amounts of data. The objective is to summarize the information contained in a large number of metric measures such as rating scales with a smaller number of summary measures called factors. As in cluster analysis, there is no dependent variable in factor analysis. Factor analysis produces factors, each of which is a weighted composite of a set of related variables. Each measure is weighted according to how much it contributes to the variation of each factor. Factor loadings are determined by calculating the correlation coefficient between factor scores and the original input variables. By examining which variables load heavily on a given factor, the researcher can subjectively name that factor.

Perceptual maps can be produced by means of factor analysis, multidimensional scaling, discriminant analysis, or correspondence analysis. The maps provide a visual representation of how brands, products, companies, and other objects are perceived relative to each other on key features such as quality and value. All the approaches require, as input, consumer evaluations or ratings of the objects in question on some set of key characteristics.

Conjoint analysis is a technique that can be used to measure the trade-offs potential buyers make on the basis of the features of each product or service available to them. The technique permits the researcher to determine the relative value of each level of each feature. These estimated values are called utilities and can be used as a basis for simulating consumer choice.

Predictive modeling draws on statistics, machine learning, artificial intelligence, and computer programming to identify patterns in market data sets. It is becoming increasingly important as the available data grow exponentially.

Key Terms

causation

classification matrix

cluster analysis

coefficient of determination

collinearity

conjoint analysis

discriminant coefficient

discriminant score

dummy variables

factor

factor analysis

factor loading

K-means cluster analysis

metric scale

multiple discriminant analysis

multiple regression analysis

multivariate analysis

nominal or categorical

regression coefficients

scaling of coefficients

utilities

Questions for Review & Critical Thinking

  1. Distinguish between multiple discriminant analysis and cluster analysis. Give several examples of situations in which each might be used.
  2. What purpose does multiple regression analysis serve? Give an example of how it might be used in marketing research. How is the strength of multiple regression measures of association determined?
  3. What is a dummy variable? Give an example using a dummy variable.
  4. Describe the potential problem of collinearity in multiple regression. How might a researcher test for collinearity? If collinearity is a problem, what should the researcher do?
  5. A sales manager examined age data, education level, a personality factor that indicated level of introvertedness/extrovertedness, and level of sales attained by the company’s 120-person salesforce. The technique used was multiple regression analysis. After analyzing the data, the sales manager said, “It is apparent to me that the higher the level of education and the greater the degree of extrovertedness a salesperson has, the higher will be an individual’s level of sales. In other words, a good education and being extroverted cause a person to sell more.” Would you agree or disagree with the sales manager’s conclusions? Why?
  6. The factors produced and the results of the factor loadings from factor analysis are mathematical constructs. It is the task of the researcher to make sense out of these factors. The following table lists four factors produced from a study of cable TV viewers. What label would you put on each of these four factors? Why?
    Factor Loading
    Factor 1
    I don’t like the way cable TV movie channels repeat the movies over and over. .79
    The movie channels on cable need to spread their movies out (longer times between repeats). .75
    I think the cable movie channels just run the same things over and over and over. .73
    After a while, you’ve seen all the pay movies, so why keep cable service. .53
    Factor 2
    I love to watch love stories. .76
    I like a TV show that is sensitive and emotional. .73
    Sometimes I cry when I watch movies on TV. .65
    I like to watch “made for TV” movies. .54
    Factor 3
    I like the religious programs on TV (negative correlation). −.76
    I don’t think TV evangelism is good. .75
    I do not like religious programs. .61
    Factor 4
    I would rather watch movies at home than go to the movies. .63
    I like cable because you don’t have to go out to see the movies. .55
    I prefer cable TV movies because movie theaters are too expensive. .46
  7. The following table is a discriminant analysis that examines responses to various attitudinal questions from cable TV users, former cable TV users, and people who have never used cable TV. Looking at the various discriminant weights, what can you say about each of the three groups?
    Discriminant Weights
    Users Formers Nevers
    Users
    A19 Easygoing on repairs −.40
    A18 No repair service −.34
    A7 Breakdown complainers +.30
    A5 Too many choices −.27
    A13 Antisports −.24
    A10 Antireligious +.17
    Formers
    A4 Burned out on repeats +.22
    A18 No repair service +.19
    H12 Card/board game player +.18
    H1 High-brow −.18
    H3 Party hog +.15
    A9 DVD preference +.16
    Nevers
    A7 Breakdown complainer −.29
    A19 Easygoing on repairs +.26
    A5 Too many choices +.23
    A13 Antisports +.21
    A10 Antireligious −.19
  8. The following table shows regression coefficients for two dependent variables. The first dependent variable is willingness to spend money for cable TV. The independent variables are responses to attitudinal statements. The second dependent variable is stated desire never to allow cable TV in their homes. By examining the regression coefficients, what can you say about persons willing to spend money for cable TV and those who will not allow cable TV in their homes?
    Regression Coefficients
    Willing to Spend Money for Cable TV
    Easygoing on cable repairs −3.04
    Cable movie watcher 2.81
    Comedy watcher 2.73
    Early to bed −2.62
    Breakdown complainer 2.25
    Lovelorn 2.18
    Burned out on repeats −2.06
    Never Allow Cable TV in Home
    Antisports 0.37
    Object to sex 0.47
    Too many choices 0.88
  9. Explain what predictive analytics encompasses. Provide examples of some marketing problems to which you might apply predictive analytics.
  10. Describe the steps in the predictive analytics process.
  11. What is Hadoop? How does it relate to Big Data?

Working the Net

  1. A good discussion of cluster analysis can be found at http://faculty.darden.virginia.edu/GBUS8630/doc/M-0748.pdf.
  2. For some easy to digest and comprehensive information on multivariate analysis, including how to run these analyses in SPSS, visit http://core.ecu.edu/psyc/wuenschk/spss/SPSS-MV.htm.