CHAPTER EIGHT

Segmentation for Retailers

SEGMENTATION IN THE RETAIL INDUSTRY

Retail enterprises offer a large variety of products through different channels to customers with diverse needs. The lack of a formal commitment and the ease with which shoppers can switch to competitors make building a loyal customer base harder for retailers. Quality of commodities and competitive pricing are often not enough to stand out from the competition.

In such a dynamic environment a competitive enterprise should try to understand its customers by gaining insight into their needs, attitudes, and behaviors. Ideally each customer should be treated as an individual and the enterprise should operate on a one-to-one basis. Since this approach is obviously not possible, an efficient alternative is to segment the customers into groups with different characteristics and develop differentiated strategies that best address their specific features.

As mentioned in a previous chapter, different segmentation schemes can be developed according to the specific business objectives of the organization.

Needs/attitudinal segmentation is commonly employed through market research data in the retail industry to gain insight into the customer attitudes, wants, views, preferences, and opinions about the enterprise and the competition. In addition to external/market research data, transactional data can also be used for the development of effective segmentation solutions. A value-based segmentation scheme allocates customers to groups according to their spending amount. It can be used to identify high-value customers and to prioritize their handling according to their measured importance.

Behavioral segmentation is also based on transactional data and separates customers according to attributes that summarize their shopping habits, such as:

  • Frequency and recency of purchases
  • Total spending amount
  • Relative spending amount per product group/subgroup
  • Size of basket (spending amount and number of items per visit or transaction)
  • Preferred payment method
  • Preferred period/day/time of purchases
  • Preferred store and channel, and so on.

The derived segments can be used for the “personalized” handling of segmented customers through the development of differentiated sales and marketing strategies, tailored to their recognized consumption habits.

Transactional data are logged at the point of sale and typically record the detailed information of every transaction, including the universal product code (UPC) of each purchased item, which allows detailed monitoring of the groups and subgroups of products that each customer tends to buy. A prerequisite of behavioral segmentation is that every transaction is identified with a customer. This issue is usually tackled by introducing a loyalty program which assigns an identification field (card ID) to each transaction and permits the tracking of the purchase history of each customer and aggregation of the transactional information at a customer level.

In this chapter we focus on the efforts of a retailer to segment its customers according to their consumption habits and, more specifically, according to the product mix they buy. A high-level grouping of products was selected for this first segmentation attempt. The relevant data were readily available within the organization’s mining data mart and MCIF, which stored all the processed transactional information. In addition, the marketers of the organization decided to employ a recency, frequency, monetary (RFM) analysis to examine and group their customers according to their purchase frequency, recency, and value. These applications are described in the following sections.

THE RFM ANALYSIS

RFM analysis is a common approach for understanding customer purchase behavior. It is quite popular, especially in the retail industry. As its name implies, it involves the calculation and the examination of three KPIs – recency, frequency, and monetary – that summarize the corresponding dimensions of the customer relationship with the organization. The recency measurement indicates the time since the last purchase transaction of the customer. Frequency denotes the number and rate of purchase transactions. Monetary indicates the value of the purchases. These indicators are typically calculated at a customer (card ID) level through simple data processing of the available transactional data.

RFM analysis can be used to identify good customers with the best scores in the relevant KPIs, who generally tend to be good prospects for additional purchases. It can also identify other purchasing patterns and respective customer types of interest, such as infrequent big-spenders or customers with small but frequent purchases who might also have sales potential, depending on the market and the specific product promoted.

In the retail industry, the RFM dimensions are usually defined as follows:

  • Recency: The time (in units such as days/months/years) since the most recent purchase transaction or shopping visit.
  • Frequency: The total number of purchase transactions or shopping visits in the period examined. An alternative, and probably a better defined, approach that also takes into account the tenure of the customer calculates frequency as the average number of transactions per unit of time, for instance the monthly average number of transactions.
  • Monetary: The total value of the purchases within the period examined or the average value (e.g., monthly average value) per time unit. According to an alternative, but not so popular, definition, the monetary indicator is defined as the average transaction value (average value per purchase transaction). Since the total value tends to be correlated with the frequency of the transactions, the reasoning behind this alternative definition is to capture a different and supplementary aspect of purchase behavior.

The construction of the RFM indicators is a simple data management task which does not involve any data mining modeling. It does, however, involve a series of aggregations and simple computations that transform the raw purchase records into meaningful scores. In order to perform RFM analysis each transaction should be linked with a specific customer (card ID) so that the customer’s purchase history can be tracked and investigated over time. Fortunately, in most situations, the use of a loyalty program makes the collection of “personalized” transactional data possible.
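The aggregations described above can be sketched in a few lines of plain Python. This is only an illustration: the record layout, the sample values, and the reference date are assumptions, and frequency and monetary are computed here in their simplest form (transaction count and total value over the period).

```python
from collections import defaultdict
from datetime import date

# Hypothetical point-of-sale records: (card_id, transaction_id, purchase_date, amount).
transactions = [
    ("C1", "T1", date(2024, 1, 10), 40.0),
    ("C1", "T2", date(2024, 3, 5), 60.0),
    ("C2", "T3", date(2024, 2, 20), 15.0),
]

def rfm_components(records, as_of):
    """Return {card_id: (recency_days, frequency, monetary)} per customer."""
    last_purchase = {}
    visits = defaultdict(set)
    spend = defaultdict(float)
    for card_id, txn_id, day, amount in records:
        if card_id not in last_purchase or day > last_purchase[card_id]:
            last_purchase[card_id] = day
        visits[card_id].add(txn_id)   # count distinct transactions only
        spend[card_id] += amount
    return {
        card: ((as_of - last_purchase[card]).days, len(visits[card]), spend[card])
        for card in last_purchase
    }

rfm = rfm_components(transactions, as_of=date(2024, 3, 31))
```

Each customer ends up with one row of three measures, which is the form needed for the binning step that follows.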

RFM components should be calculated on a regular basis and stored along with the other behavioral indicators in the organization’s mining data mart and MCIF table. They can be used as individual fields in subsequent tasks, for instance as inputs, along with other predictors, in upcoming supervised cross-selling models. They can also be included as clustering fields for the development of a multiattribute behavioral segmentation scheme. Usually they are simply combined to form a single RFM measure and a respective cell-based segmentation scheme.

Typically, RFM analysis involves the grouping (binning) of customers into chunks of equal size, or quantiles, in a way similar to the one presented for value-based segmentation. This binning procedure is applied independently to the three RFM component measures. Customers are sorted according to the respective measure and are grouped in classes of equal size. For instance, the breakdown into four groups results in quartiles of 25%, and the breakdown into five groups, in quintiles of 20%. As a consequence the RFM measures are transformed into ordinal scores. In the case of binning into quintiles, for example, the RFM measures are converted into rank scores ranging from 1 to 5. Group 1 includes the 20% of customers with the lowest values and group 5 the 20% of customers with the top values in the corresponding measure. Especially for the recency measure, the scale of the derived ordinal score should be reversed so that larger scores represent the most recent buyers.
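A minimal sketch of this binning, assuming a simple rank-based split in which ties are broken by input order (one convention among several; packaged tools may handle ties differently). The function name and the sample values are illustrative.

```python
def quantile_scores(values, n_bins=5, reverse=False):
    """Rank-based binning: return a score 1..n_bins for each value.

    Group 1 holds the lowest values and group n_bins the highest.
    With reverse=True the scale is flipped (used for recency, where
    fewer days since the last purchase should earn a higher score).
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    scores = [0] * len(values)
    for rank, i in enumerate(order):
        score = rank * n_bins // len(values) + 1
        scores[i] = n_bins + 1 - score if reverse else score
    return scores

# Hypothetical measures for ten customers.
recency_days = [3, 40, 12, 90, 25, 7, 60, 150, 18, 33]
monetary = [500, 80, 220, 30, 150, 900, 60, 10, 300, 120]

r_scores = quantile_scores(recency_days, reverse=True)  # most recent buyers get 5
m_scores = quantile_scores(monetary)                    # biggest spenders get 5
```

With ten customers and five bins, each quintile receives exactly two customers.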

The derived R, F, and M bins become the components for the RFM cell assignment. These bins are combined with a simple concatenation to provide the cell assignment. Customers with the top RFM values and quintile values of 5 are assigned to cell 555. Similarly, customers with the average recency (quintile 3), top frequency (quintile 5), and lowest monetary values (quintile 1) form cell 351, and so on.

This procedure for constructing the RFM cells is illustrated in Figure 8.1.

Figure 8.1 Assignment to the RFM cells.

c08_image001.jpg

Figure 8.2 The total RFM cells in the case of binning into quintiles (groups of 20%).

c08_image002.jpg

When grouping customers in quintiles (groups of 20%), the procedure results in a total of 5 × 5 × 5 = 125 RFM cells as displayed in Figure 8.2.
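The concatenation of the three bins and the resulting number of cells can be checked with a short sketch; the function name is illustrative.

```python
from itertools import product

def rfm_cell(r, f, m):
    """Concatenate the three ordinal scores into a cell label such as '351'."""
    return f"{r}{f}{m}"

# Every combination of quintile scores (1 to 5) for R, F, and M.
all_cells = [rfm_cell(r, f, m) for r, f, m in product(range(1, 6), repeat=3)]
top_cell = rfm_cell(5, 5, 5)
mixed_cell = rfm_cell(3, 5, 1)
```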

This combination of the R, F, and M components into cells is widely used, though it does have a certain disadvantage. The large number of derived cells makes the procedure quite cumbersome and hard to manage. An alternative method for segmenting customers according to their RFM patterns is to use the respective components as inputs in a clustering model and let the algorithm reveal the underlying natural groupings of customers.

The marketers of the retail enterprise decided to perform RFM analysis, before proceeding to the development of a more general multi-attribute segmentation scheme. The procedure followed is presented in “The RFM Segmentation Procedure”.

Combining R, F, and M Components to Derive a Continuous RFM Score

An alternative approach treats the binned R, F, and M components as continuous measures. According to this approach, the R, F, and M bins are summed, with appropriate user-defined weights, in order to provide a continuous RFM score. The RFM score is the weighted sum of its individual components and is calculated as follows:

c08_image003.jpg

The weights assigned to each RFM component designate its significance and can be specified by the analysts according to prior knowledge of the particular industry and enterprise.
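Written out, and reading the description above as a plain (non-normalized) weighted sum, which is what the chapter's worked example with equal weights implies, the score takes the form below. This is a reconstruction from the text, not a copy of the original formula image; the symbols $w_R$, $w_F$, and $w_M$ are introduced here for the user-defined weights, and $R$, $F$, and $M$ stand for the binned scores.

```latex
\text{RFM score} = w_R \cdot R + w_F \cdot F + w_M \cdot M
```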

As an example let us consider the case of the customer previously assigned to RFM cell 351. With equal weights of 10.0 in all the RFM individual components, this customer would receive a score of 90, in a scale ranging from 30 to 150. These scores can be rescaled to the 0–1 range according to the following formula:

c08_image003.jpg
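The worked example can be reproduced with the sketch below. The rescaling is assumed to be a standard min-max transformation, (score − minimum) / (maximum − minimum), since the original formula is only available as an image; the function names are illustrative and the weights are assumed to be positive.

```python
def rfm_score(r, f, m, weights=(10.0, 10.0, 10.0)):
    """Weighted sum of the binned R, F, and M scores."""
    w_r, w_f, w_m = weights
    return w_r * r + w_f * f + w_m * m

def rescale(score, weights=(10.0, 10.0, 10.0), n_bins=5):
    """Min-max rescaling to the 0-1 range (assumed form of the formula)."""
    lowest = sum(weights) * 1        # every component in bin 1
    highest = sum(weights) * n_bins  # every component in the top bin
    return (score - lowest) / (highest - lowest)

score_351 = rfm_score(3, 5, 1)    # cell 351 with equal weights of 10.0
scaled_351 = rescale(score_351)   # position on the 30-150 scale, mapped to 0-1
```

Under these assumptions the customer in cell 351 scores 90 and sits exactly halfway along the rescaled range.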

Unlike the RFM cells, the continuous form of the RFM unifies the information in the relevant components without preserving their distinct information. Nevertheless, this can prove to be simpler and more convenient since marketers will have to monitor just a single scale measurement instead of deciphering a rather complex combination of numbers.

THE RFM SEGMENTATION PROCEDURE

Since the goal was to investigate current and not past behaviors, the RFM analysis was based on purchase records of the last six months. The procedure involved the calculation of the RFM components and the assignment of customers to RFM segments. The implementation steps for the calculation of the RFM components and cells are briefly as follows:

1. Data acquisition: Transactional data of the last six months were retrieved, audited, cleaned, and prepared for subsequent operations. The organization’s loyalty program made it possible to link customers to specific transactions and to track each customer’s purchase trail.

2. Selection of the population to be segmented: Only active customers were included in the RFM segmentation. Inactive customers with no purchases during the last six months were identified and a relevant list was constructed and sent to the marketing department for inclusion in upcoming retention and reactivation campaigns.

Furthermore, new customers with a relationship with the retail enterprise of less than three months were also excluded from the RFM analysis since it was considered that their relationship with the enterprise was relatively short and the available data were not sufficient to outline a reliable purchase profile.

Finally, after preliminary data analysis and discussions with the marketers of the organization, customers with total purchases lower than a specific threshold value were considered as dormant and were also excluded from the subsequent segmentation.
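The three exclusion rules of this step could be expressed as a simple filter. The sketch below is illustrative only: the field names are assumptions, and the spending threshold is a placeholder because the chapter does not state the value that was used.

```python
def select_population(customers, min_tenure_months=3, min_total_spend=50.0):
    """Split customers into those kept for RFM segmentation and the inactive list.

    `customers` maps card_id -> dict with keys 'purchases_6m' (count),
    'tenure_months', and 'total_spend_6m'. The spend threshold is a
    placeholder, not the retailer's actual value.
    """
    kept, inactive = [], []
    for card_id, c in customers.items():
        if c["purchases_6m"] == 0:
            inactive.append(card_id)   # list sent to retention/reactivation campaigns
        elif c["tenure_months"] < min_tenure_months:
            continue                   # too new for a reliable purchase profile
        elif c["total_spend_6m"] < min_total_spend:
            continue                   # considered dormant
        else:
            kept.append(card_id)
    return kept, inactive

customers = {
    "C1": {"purchases_6m": 12, "tenure_months": 30, "total_spend_6m": 640.0},
    "C2": {"purchases_6m": 0, "tenure_months": 48, "total_spend_6m": 0.0},
    "C3": {"purchases_6m": 4, "tenure_months": 2, "total_spend_6m": 210.0},
    "C4": {"purchases_6m": 1, "tenure_months": 20, "total_spend_6m": 9.5},
}
kept, inactive = select_population(customers)
```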

3. Data preparation and computation of the R, F, and M measurements: Transactional data were aggregated (grouped by) at a customer (card ID) level.

A two-fold aggregation procedure was followed. Initially, records were grouped by card ID and transaction ID. Then they were further grouped by card ID as shown in the IBM SPSS Modeler Aggregate screenshot in Figure 8.3.
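Outside IBM SPSS Modeler, the same two-fold aggregation can be sketched in plain Python. The line-item layout and the sample values are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical line items: (card_id, transaction_id, amount), one row per purchased item.
line_items = [
    ("C1", "T1", 12.0), ("C1", "T1", 8.0),
    ("C1", "T2", 30.0),
    ("C2", "T3", 5.0), ("C2", "T3", 5.0), ("C2", "T3", 10.0),
]

# Step 1: group by (card ID, transaction ID) to get one row per transaction.
per_transaction = defaultdict(float)
for card_id, txn_id, amount in line_items:
    per_transaction[(card_id, txn_id)] += amount

# Step 2: group the transaction rows by card ID.
per_customer = defaultdict(lambda: {"transactions": 0, "total_spend": 0.0})
for (card_id, _txn_id), basket_value in per_transaction.items():
    per_customer[card_id]["transactions"] += 1
    per_customer[card_id]["total_spend"] += basket_value
```

Aggregating to the transaction level first ensures that a basket with several items is counted as one visit rather than several.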

The information summarized for each customer included:

(a) Date of the latest (maximum in the case of a date or timestamp field) purchase transaction. This information was then used to derive recency as the number of days since the most recent purchase transaction. The IBM SPSS Modeler Derive node and a date function were used to return the number of days from the last transaction to the current date (represented by IBM SPSS Modeler’s “@TODAY” function), as displayed in the screenshot in Figure 8.4.

Figure 8.3 The IBM SPSS Modeler Aggregate node for summarizing purchase data at a customer level.

c08_image004.jpg

(b) Monthly average number of distinct purchase transactions. This information defined the frequency component of the RFM analysis. As shown in Figure 8.5, a conditional derive node in IBM SPSS Modeler was used to divide the total number of transactions by the appropriate number of months: six months for old customers and the number of months as registered customers (tenure) for new customers.

Figure 8.4 Deriving the recency component of the RFM score with a date function in IBM SPSS Modeler.

c08_image005.jpg

Figure 8.5 Deriving the frequency component with a conditional derive node in IBM SPSS Modeler.

c08_image005.jpg

(c) Monthly average amount spent defined the monetary component. This component was calculated with a formula similar to the one used for the frequency measure.
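The conditional logic behind the frequency and monetary components can be sketched as follows. The function names are illustrative, and tenure is assumed to be at least three months, since newer customers were excluded in step 2.

```python
def months_observed(tenure_months, window_months=6):
    """Divisor for the monthly averages: the full window for established
    customers, the tenure for customers registered within the window."""
    return window_months if tenure_months >= window_months else tenure_months

def monthly_averages(n_transactions, total_spend, tenure_months):
    """Return (monthly average transactions, monthly average amount spent)."""
    months = months_observed(tenure_months)
    return n_transactions / months, total_spend / months

# Hypothetical customers: one established, one registered four months ago.
freq_old, mon_old = monthly_averages(18, 900.0, tenure_months=40)
freq_new, mon_new = monthly_averages(8, 200.0, tenure_months=4)
```

Dividing by tenure rather than by the full six months avoids understating the activity of recently registered customers.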

The IBM SPSS Modeler (Formerly Clementine) RFM Aggregate Node

IBM SPSS Modeler also includes a data preparation tool, named the “RFM Aggregate” node (Figure 8.6), that can simplify the computation of the individual RFM components. Users only have to designate the required source fields, specifically the customer ID (card ID) field for the aggregation and the fields indicating each transaction’s value and date.

Figure 8.6 The IBM SPSS Modeler RFM Aggregate node.

c08_image005.jpg

4. Development of the RFM cells through binning: Customers were sorted independently according to each of the individual RFM components and then binned into five groups of 20%. The resulting bins (quintiles) were then combined (concatenated) to form the RFM cell assignment.

The IBM SPSS Modeler (Formerly Clementine) RFM Analysis Node

IBM SPSS Modeler also offers a tool, named the “RFM Analysis” node, that can directly group the R, F, and M measures into the selected number of quantiles. Users can then collate the derived ordinal scores to form the corresponding cell-based segmentation. Additionally, this node also sums the individual components by using user-defined weights to produce a continuous RFM score. The IBM SPSS Modeler “RFM Analysis” node is illustrated in Figure 8.7.

Figure 8.7 The IBM SPSS Modeler RFM Analysis node.

c08_image005.jpg

5. Development of the RFM segments through clustering: The marketers of the enterprise decided to also apply a clustering model to analyze the RFM components. They also decided to enrich the segmentation criteria with information concerning the purchases of private label products and the average basket value. This value denotes the average amount spent on each transaction. Private label products are goods which are only sold in the retailer’s stores under the retailer’s brand name and are often positioned as low-cost alternatives to “named” brands. The retailer was interested in investigating the favoring of these special products. Therefore the percentage (%) of the total purchase amount accounted for by private label products was included in the clustering fields along with the RFM individual components and the basket value. The clustering model identified six RFM segments, which are listed in Table 8.1 along with their identified characteristics.
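The chapter does not name the clustering algorithm that was used. As a stand-in, the sketch below applies a plain k-means to standardized clustering fields. The field order, the sample values, the deterministic initialization, and the choice of two clusters for this toy data set are all assumptions made for illustration; the retailer's model produced six segments.

```python
import math

def standardize(rows):
    """Z-score each column so that no field dominates the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def kmeans(points, k, n_iter=20):
    """Plain k-means with a deterministic start (the first k points)."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest center by Euclidean distance.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical rows: (recency days, monthly frequency, monthly spend,
# private label % of spend, average basket value).
rows = [
    (5, 8.0, 400.0, 35.0, 50.0),
    (80, 1.0, 30.0, 5.0, 30.0),
    (7, 7.5, 380.0, 30.0, 51.0),
    (90, 0.5, 20.0, 8.0, 40.0),
    (4, 9.0, 450.0, 40.0, 50.0),
    (75, 1.5, 45.0, 4.0, 30.0),
]
labels = kmeans(standardize(rows), k=2)
```

Standardizing first matters here because the fields are on very different scales (days, counts, currency amounts, percentages).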

The majority of customers with average RFM patterns were assigned to the “Typical” segment. The “Dormant” segment contained customers on the verge of inactivity with very low purchase rates and the worst RFM profile. At the other end stood the “Superstars” and the “Golden customers”. These were high-value customers with increased frequency of transactions. “Superstars” in particular seemed to be the most loyal ones, with an increased number of visits/transactions and a high preference for private label products. “Everyday shoppers” made frequent but low-value transactions, probably to cover their daily needs. They also showed increased preference for private label brands. “Exceptional occasions” customers, on the other hand, made infrequent visits to the store branches, but with a high average basket value.

Table 8.1 The RFM segments revealed through clustering.

Superstars
  • The most loyal customers
  • Highest value
  • Highest frequency
  • High spending on private labels

Golden customers
  • Second highest value
  • High frequency
  • Average spending on private labels

Typical customers
  • Average value and frequency
  • Average spending on private labels

“Exceptional occasions” customers
  • The second lowest frequency after “Dormant customers”
  • Large basket
  • Low recency values (long time since their last visit)

“Everyday” shoppers
  • Increased frequency of transactions
  • Small basket
  • Private labels
  • Medium to low value

Dormant customers
  • Lowest frequency and value
  • Long time since their last visit (lowest recency values)

A deployment procedure was also developed to support future updating of the segments. The clustering model was supplemented with a classification model, a decision tree in particular, which identified the input patterns associated with each revealed RFM segment. The tree rules were saved for the segment assignment of new records. The deployment plan also included a periodic cohort analysis, a type of “before–after” examination of the customer base, with simple reports that could identify the migrations of customers across segments over time.
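The “before–after” report on segment migrations could be as simple as a cross-count of two snapshots. The segment assignments below are made up for illustration.

```python
from collections import Counter

# Hypothetical segment assignments at two snapshot dates.
before = {"C1": "Typical", "C2": "Golden", "C3": "Typical", "C4": "Dormant"}
after = {"C1": "Golden", "C2": "Golden", "C3": "Dormant", "C4": "Dormant"}

def migration_counts(before, after):
    """Count customers moving from each 'before' segment to each 'after' segment."""
    return Counter((before[c], after[c]) for c in before if c in after)

migrations = migration_counts(before, after)
# Keep only the pairs where the segment actually changed.
movers = {pair: n for pair, n in migrations.items() if pair[0] != pair[1]}
```

Only customers present in both snapshots are counted; new and lost customers would need separate reporting.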

RFM: BENEFITS, USAGE, AND LIMITATIONS

The individual RFM components and the derived segments convey useful information with respect to the purchasing habits of consumers. Undoubtedly, any retail enterprise should monitor the purchase frequency, intensity, and recency as they represent significant dimensions of the customer’s relationship with the enterprise.

Moreover, by following RFM transitions over time an organization can keep track of changes in the purchasing habits of each customer and use this information to proactively trigger appropriate marketing actions. For instance, specific events, like the decline in the total value of purchases, a sudden drop in the frequency of visits, or no-shows for an unusually long period of time, may indicate the beginning of the end of the relationship with the organization. These signals, if recognized in time, should initiate event-triggered retention and reactivation campaigns.
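A rule-based sketch of such event triggers is shown below. The thresholds (a 50% drop and 60 days of absence) and the event names are illustrative placeholders, not values taken from the case study.

```python
def retention_triggers(previous, current, drop_ratio=0.5, max_silence_days=60):
    """Return the warning events raised by comparing two RFM snapshots.

    `previous` and `current` are dicts with 'recency_days', 'frequency',
    and 'monetary'. The thresholds are illustrative placeholders.
    """
    events = []
    if current["monetary"] < drop_ratio * previous["monetary"]:
        events.append("value_decline")
    if current["frequency"] < drop_ratio * previous["frequency"]:
        events.append("frequency_drop")
    if current["recency_days"] > max_silence_days:
        events.append("long_absence")
    return events

events = retention_triggers(
    {"recency_days": 6, "frequency": 4.0, "monetary": 220.0},
    {"recency_days": 75, "frequency": 1.5, "monetary": 130.0},
)
```

In this example the spending has fallen, but not by more than half, so only the frequency and absence events are raised.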

RFM analysis was originally developed for retailers, but with proper modifications it can also be applied in other industries. It originated from the catalogue industry in the 1980s and proved quite useful in targeting the right customers in direct marketing campaigns. The response rates of the RFM cells in past campaigns were recorded and the best-performing cells were targeted in the next campaigns. An obvious drawback of this approach is that it usually ends up with almost the same target list of good customers, who could become annoyed with repeated contacts. Although useful, the RFM approach, when not combined with other important customer attributes such as product preferences, fails to provide a complete understanding of customer behavior. An enterprise should have a complete view of the customer and use all the available information to guide its business decisions.
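The cell-level targeting described above amounts to computing a response rate per RFM cell and ranking the cells. A minimal sketch, with a made-up campaign log:

```python
from collections import defaultdict

# Hypothetical campaign log: (rfm_cell, responded).
campaign_log = [
    ("555", True), ("555", True), ("555", False),
    ("351", True), ("351", False), ("351", False), ("351", False),
    ("111", False), ("111", False),
]

def response_rates(log):
    """Response rate per RFM cell from past campaign contacts."""
    contacted = defaultdict(int)
    responded = defaultdict(int)
    for cell, did_respond in log:
        contacted[cell] += 1
        responded[cell] += int(did_respond)
    return {cell: responded[cell] / contacted[cell] for cell in contacted}

rates = response_rates(campaign_log)
# The two best-performing cells would be targeted in the next campaign.
best_cells = sorted(rates, key=rates.get, reverse=True)[:2]
```

Because the ranking depends only on past responses, the same cells tend to be selected again and again, which is the drawback noted in the text.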

These limitations led the marketers of the organization to develop an additional segmentation scheme that would separate customers with respect to the products that they tend to buy. Their purchases by product category were analyzed and the customers were grouped accordingly. This procedure is outlined in the next section. The revealed segments also reflected the lifecycle stage of the customers and provided valuable information for the development of tailored marketing activities.

GROUPING CUSTOMERS ACCORDING TO THE PRODUCTS THEY BUY

The next step was to reveal the customer types with respect to the mix of products that they tend to buy. The objective was to use the results to optimize and adapt the offers, rewards, and incentives received by customers according to their identified needs and preferences. The segmentation process involved the application of a clustering model to the purchase records. Relevant data from the last six months were aggregated at a customer (card ID) level. A high hierarchy level in the existing product taxonomy was chosen for grouping the product codes, as follows:

  • Apparel/shoes/jewelry
  • Baby
  • Electronics
  • Computers
  • Food and wine
  • Health and beauty
  • Pharmacy
  • Sports and outdoors
  • Books/press
  • Music/movies/videogames
  • Toys
  • Home.

The segmentation fields summarized the relative spending (percentage of total spending) of each active customer in the above product categories. Demographic data, including age, gender, and marital status of the customers, were not included in the model training; however, they contributed to the profiling of the clusters generated. The revealed segments are presented in Table 8.2 along with a brief behavioral and demographic profile.
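The relative spending fields can be derived as sketched below; the purchase records are hypothetical, and the function assumes active customers with a positive spending total.

```python
from collections import defaultdict

# Hypothetical six-month purchases: (card_id, product_category, amount).
purchases = [
    ("C1", "Electronics", 300.0), ("C1", "Computers", 150.0), ("C1", "Food and wine", 50.0),
    ("C2", "Baby", 120.0), ("C2", "Food and wine", 240.0), ("C2", "Toys", 40.0),
]

def spending_shares(rows):
    """Percentage of each customer's total spending per product category."""
    totals = defaultdict(float)
    by_category = defaultdict(lambda: defaultdict(float))
    for card_id, category, amount in rows:
        totals[card_id] += amount
        by_category[card_id][category] += amount
    return {
        card: {cat: 100.0 * amt / totals[card] for cat, amt in cats.items()}
        for card, cats in by_category.items()
    }

shares = spending_shares(purchases)
```

Using percentages rather than absolute amounts lets the clustering focus on the product mix instead of the overall spending level.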

Customers were assigned to six groups. Although the segmentation criteria only involved purchasing preferences, the demographic profile of the clusters also revealed a clear separation by age and marital status. This finding confirmed the initial belief that the product mix would be strongly associated with the family lifecycle stage of the customers.

Table 8.2 Customer segments with respect to the product mix.

Segment | Purchase patterns | Demographics
1. Hobby shoppers – single men | Electronics, computers, movies/music/games, sports and outdoors | Young men, single
2. Fashion shoppers – single women | Apparel/shoes/jewelry, beauty, books/press, movies/music/games | Young women, single
3. Average shoppers | Home, food and wine, apparel/shoes/jewelry, electronics, movies/music/games, health and beauty | Young, married, no children
4. Family shoppers – full nest I | Food and wine, home, baby, toys | Young, married, children under 5
5. Family shoppers with children – full nest II | Food and wine, home, toys, books/press, movies/music/games | Middle aged, married with children aged 5 or over
6. Older families/retired | Home, food and wine, pharmacy, books/press | Older, no dependents

The first cluster was dominated by single young men. “Hobby shoppers” were characterized by increased relative spending on recreation products such as electronics, computers, movies/music/games, and sports and outdoors products, revealing a behavior typical of lively young men with free time and many leisure activities. “Fashion shoppers,” the female counterpart of the first cluster, showed a similar purchasing profile, with an increased preference for clothes, accessories, and beauty products.

The segment of “Average shoppers” had diverse preferences and spent on many product categories, including grocery, home, and leisure products. Young, married customers formed the majority in this cluster. It seems that customers at this life-stage start to spend on their home and family; however, they still spend money on themselves and their hobbies.

Different needs and new priorities arise with the birth of a baby. Customers with high spending on baby products formed a group of their own, the segment of “Family shoppers,” mainly consisting of young married people with a baby. As babies grow into children, the purchase baskets change once again. Grocery and home products can still be found in the family’s basket but the baby products are replaced by toys, books, movies/music/games; that is, leisure time products, but for the children this time, not for the parents. This purchasing profile characterized the segment of “Family shoppers with children,” a group mainly composed of middle-aged married customers. The last cluster included customers with low total purchases, mainly for home, food, and pharmacy products. It was labeled as “Older families/retired” since it turned out to mostly contain older people with no dependents.

Once the product-based segments had been revealed and their differentiating characteristics recognized, it was time for the marketers at the retailer to act on them and use them to customize the loyalty program’s offerings. From that point on, customers were presented with rewards, offers, and incentives that matched their specific profile. The reward type was determined by the identified customer profile, but its value depended on the RFM segment of the customer.

Finally, the enterprise also decided to differentiate its communication strategy, according to the identified customer typologies. An initiative in this direction included the replacement of general interest brochures and newsletters with specialized but more detailed ones which better addressed the specific requirements of each revealed segment.

SUMMARY

In this chapter we presented a segmentation example from the retail industry. The retailer’s loyalty program made it possible to track each customer’s purchases over time. Purchase data of the last six months were mined, revealing the purchasing preferences of the customers and enabling their product-based segmentation. A clustering model revealed six segments, clearly differentiated in terms of their consumption preferences. The purchasing patterns identified also reflected the different family lifecycle stages of the customers and their differentiated needs, wants, and priorities. Moreover, this allowed the marketers of the organization to customize their loyalty program’s offers and rewards and the communication strategies of the enterprise, according to the distinct profile of each segment.

Customers were also segmented according to their purchase recency, frequency, and monetary value. This RFM segmentation was also taken into account in prioritizing the customer handling according to the importance of each customer.
