In relation to the first reason, we can consider studies of customer behavior and, in particular, buying habits. I should have a spreadsheet somewhere on my desktop that looks similar to the following table:
Customer | June | July | August | September |
1 | 200 | 150 | 190 | 195 |
2 | 1050 | 1101 | 975 | 1095 |
3 | 1300 | 1130 | 1340 | 1315 |
4 | 400 | 410 | 395 | 360 |
5 | 450 | 400 | 378 | 375 |
6 | 1125 | 1050 | 1125 | 1115 |
7 | 1090 | 1070 | 1110 | 1190 |
8 | 980 | 990 | 1200 | 1001 |
9 | 600 | 540 | 330 | 220 |
10 | 1130 | 1290 | 1170 | 1310 |
From this table, you can compute the following averages by month:
Statistic | June | July | August | September |
average | 832.5 | 813.1 | 821.3 | 817.6 |
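As a quick check, the monthly averages can be reproduced with a few lines of Python. This is only a minimal sketch: the `sales` dictionary and the `monthly_average` name are assumptions made for illustration, with the figures copied by hand from the customer table.

```python
from statistics import mean

# Monthly sales per customer (customers 1 to 10), copied from the table above.
sales = {
    "June":      [200, 1050, 1300, 400, 450, 1125, 1090, 980, 600, 1130],
    "July":      [150, 1101, 1130, 410, 400, 1050, 1070, 990, 540, 1290],
    "August":    [190, 975, 1340, 395, 378, 1125, 1110, 1200, 330, 1170],
    "September": [195, 1095, 1315, 360, 375, 1115, 1190, 1001, 220, 1310],
}

# Arithmetic mean of the ten customers for each month.
monthly_average = {month: mean(values) for month, values in sales.items()}

for month, average in monthly_average.items():
    print(f"{month}: {average:.1f}")
```

The four printed values differ by less than 20 units from one month to the next.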
Looking at these averages alone, you would probably conclude that your sales figures are not changing much. But if we add one more attribute, the customer's age cluster, we gain a better understanding of the numbers:
Customer | Age cluster | June | July | August | September |
1 | young | 200 | 150 | 190 | 195 |
2 | adult | 1050 | 1101 | 975 | 1095 |
3 | adult | 1300 | 1130 | 1340 | 1315 |
4 | young | 400 | 410 | 395 | 360 |
5 | young | 450 | 400 | 378 | 375 |
6 | adult | 1125 | 1050 | 1125 | 1115 |
7 | elder | 1090 | 1070 | 1110 | 1190 |
8 | elder | 980 | 990 | 1200 | 1001 |
9 | young | 600 | 540 | 330 | 220 |
10 | elder | 1130 | 1290 | 1170 | 1310 |
I am sure you are already spotting some differences between the age clusters, but let's summarize them by computing the total sales for each cluster:
Age cluster | June | July | August | September |
young | 1650 | 1500 | 1293 | 1150 |
adult | 3475 | 3281 | 3440 | 3525 |
elder | 3200 | 3350 | 3480 | 3501 |
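The totals by age cluster can be obtained with a simple group-and-sum. The sketch below assumes the rows are held as plain tuples; the names `rows`, `totals`, and `change_pct` are hypothetical and chosen only for this example.

```python
from collections import defaultdict

MONTHS = ["June", "July", "August", "September"]

# (customer, age cluster, June, July, August, September), from the table above.
rows = [
    (1, "young", 200, 150, 190, 195),
    (2, "adult", 1050, 1101, 975, 1095),
    (3, "adult", 1300, 1130, 1340, 1315),
    (4, "young", 400, 410, 395, 360),
    (5, "young", 450, 400, 378, 375),
    (6, "adult", 1125, 1050, 1125, 1115),
    (7, "elder", 1090, 1070, 1110, 1190),
    (8, "elder", 980, 990, 1200, 1001),
    (9, "young", 600, 540, 330, 220),
    (10, "elder", 1130, 1290, 1170, 1310),
]

# Sum each month's sales within every age cluster.
totals = defaultdict(lambda: [0] * len(MONTHS))
for _customer, cluster, *monthly in rows:
    for i, amount in enumerate(monthly):
        totals[cluster][i] += amount

# Relative change between the first and the last month, in percent.
change_pct = {
    cluster: 100 * (values[-1] - values[0]) / values[0]
    for cluster, values in totals.items()
}

for cluster, values in totals.items():
    print(cluster, dict(zip(MONTHS, values)), f"{change_pct[cluster]:+.1f}%")
```

The last column printed for each cluster is the change from June to September, which makes the direction of each segment easy to compare.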
Can you see it? Behind a substantially stable average, three completely different stories were hiding: the company is losing ground in the young segment, is remaining stable within the adult cluster, and is growing steadily in the elder segment. This is a simple yet useful example of why the average, while being a synthetic way to describe an entire population, can fail to describe that population properly. If we had computed, for instance, the interquartile range of this population for the same four months, we would have found the following numbers:
Statistic | June | July | August | September |
IQR | 628.75 | 650.75 | 776.5 | 807.5 |
This would have shown us clearly that our population was diverging and becoming less homogeneous: the IQR grows every month even though the average barely moves.
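The interquartile range depends on how the quartiles are interpolated, so it is worth stating the convention. The sketch below assumes linear interpolation between observed values, which is what `method="inclusive"` selects in Python's `statistics.quantiles`; with that assumption it reproduces the IQR figures in the table. The `sales` dictionary and the `interquartile_range` helper are names introduced here for illustration.

```python
from statistics import quantiles

# Monthly sales per customer (customers 1 to 10), copied from the first table.
sales = {
    "June":      [200, 1050, 1300, 400, 450, 1125, 1090, 980, 600, 1130],
    "July":      [150, 1101, 1130, 410, 400, 1050, 1070, 990, 540, 1290],
    "August":    [190, 975, 1340, 395, 378, 1125, 1110, 1200, 330, 1170],
    "September": [195, 1095, 1315, 360, 375, 1115, 1190, 1001, 220, 1310],
}


def interquartile_range(values):
    """Return Q3 - Q1 using linear interpolation between observed values."""
    # n=4 yields the three quartile cut points; the median is not needed here.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


iqr = {month: interquartile_range(values) for month, values in sales.items()}

for month, value in iqr.items():
    print(f"{month}: {value}")
```

Other quartile conventions (for example `method="exclusive"`) give different absolute values on a sample this small, so the trend across months matters more than any single figure.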