12.3. Traffic Bursts

There are two obvious questions that people like to ask about hurricanes: "Is it coming my way?" and "How strong is it?" In electronic business, similar questions arise very often: "Is the site going to receive a burst of hits?" and "How strong will the burst be?" Existing techniques and surveys can be used to answer the first question. But online companies need more information. They need to know how much computing and network capacity needs to be provisioned to support a traffic spike of a given intensity.

Web traffic is quite bursty [4]. Figure 12.2 shows the traffic to a real online retail store over a typical week. The upper part of the figure shows the hourly volume of hits. Two observations stand out: the number of hits to the store is low during the weekend (i.e., days 1 and 7), and surges in traffic occur around 2:00 P.M. A conjecture that could explain these observations is that people spend more time surfing the Web at work than at home because companies have faster connections to the Internet. The graph in the lower part of the figure depicts the number of hits measured in each five-minute interval for the same week. Moving to this finer time scale does not smooth out the burstiness of the traffic to the e-tailer store; if anything, the traffic looks even burstier.

Figure 12.2. Traffic Volume to an E-Tailer Site.
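The effect of the time scale on burstiness can be illustrated with a small sketch. The hit counts below are hypothetical, not the data behind Fig. 12.2, and the helper names `aggregate` and `peak_to_mean` are introduced here only for illustration. The sketch sums five-minute counts into hourly counts and compares the peak-to-mean ratio at both scales.

```python
from statistics import mean


def aggregate(counts, factor):
    """Sum consecutive groups of `factor` samples (12 five-minute bins -> 1 hour)."""
    return [sum(counts[i:i + factor]) for i in range(0, len(counts), factor)]


def peak_to_mean(counts):
    """Ratio of the largest sample to the average sample."""
    return max(counts) / mean(counts)


# Hypothetical five-minute hit counts covering two hours (24 bins).
five_min = [10, 12, 9, 11, 80, 13, 10, 12, 11, 9, 10, 13,
            11, 10, 12, 9, 10, 11, 95, 12, 10, 11, 9, 10]

hourly = aggregate(five_min, 12)  # [200, 210]

print("five-minute peak/mean:", round(peak_to_mean(five_min), 2))
print("hourly peak/mean:     ", round(peak_to_mean(hourly), 2))
```

When the series divides evenly into groups, the coarse-scale ratio can never exceed the fine-scale ratio, because an hourly total is at most twelve times the largest five-minute count. Short spikes that hourly totals average away remain visible at the five-minute scale.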


Better forecasting and site preparation can significantly reduce the amount of burst-induced damage to site performance. Although it is difficult to predict erratic usage demands, sites must be prepared for them. An e-business site architecture should be flexible and scalable to support demand spikes. In many cases, it is possible to identify the roots of sudden changes in customer demand and it is also possible to categorize the phenomena that drive bursts of traffic to a site. Once the origins of traffic surges have been identified, it is possible to devise better strategies to handle these surges.

Unpredictable News Events. Web users follow breaking news stories and so does traffic. In the quest for the latest information, Web users may overwhelm news sites. Every time a big event happens (e.g., political scandals, accidents, stock market crashes, and wars), traffic peaks at news and TV sites. For instance, during a stock market crisis when the Dow Jones Industrial Average dove to a very low value, a record volume of shares was traded. Online trading businesses were flooded by visitors seeking quotes and placing trades, creating peaks seven to ten times the normal trading volume. These types of events are hard to predict in advance.

Predictable News Events. Surges at specific sites occur because of predictable news events. Natural disasters, such as hurricanes, earthquakes, and storms, attract huge numbers of people seeking information from specialized sites. For instance, during the days that followed the announcement of the arrival of a hurricane, the weather sites were clogged with users trying to obtain updates. "We got in just one hour what we typically get in a seven-day period," said the spokesperson for an online weather service. Existing models can predict some natural disasters, such as hurricanes and storms, a few days before they occur. Thus, weather-related sites can take advantage of these predictions and prepare for the bursts.

Product or Service Announcement. "We received far more accesses than we had imagined on the Web," said the VP of a software company to explain the performance problems faced by customers who tried to purchase and download the company's popular software. The servers became saturated right after the company started taking orders for the software. An e-business can set the date of a product announcement but cannot know in advance how successful the product will be. For example, the site of an online tax preparation service went down in its first week of operation. The company expected 500,000 customers over a period of twelve weeks. However, the site received 220,000 customers in the first week alone, and the surge overwhelmed the system. Because a site can be extremely popular during the launch period and nearly idle a month later, the infrastructure should be planned with enough flexibility that resources can be added to or removed from the site.
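The size of the gap in the tax preparation example becomes clearer with a little arithmetic. The sketch below uses only the figures quoted above. It assumes, purely for illustration, that the company's plan spread the expected customers evenly over the twelve weeks, which the text does not state.

```python
expected_total = 500_000   # customers expected over the whole period
weeks = 12                 # length of the period, in weeks
first_week = 220_000       # customers actually received in week one

# Assumption: a uniform plan, i.e., the same number of customers every week.
planned_per_week = expected_total / weeks

# How many times the uniform weekly plan the first week represented.
overload_factor = first_week / planned_per_week

# Fraction of the twelve-week expectation that arrived in week one.
share_of_total = first_week / expected_total

print(f"planned per week: {planned_per_week:,.0f}")
print(f"overload factor:  {overload_factor:.2f}x")
print(f"share of total:   {share_of_total:.0%}")
```

Under that uniform assumption, the first week brought roughly 5.3 times the planned weekly load, or 44 percent of the customers expected for the entire period.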

Special Events. Christmas, Valentine's Day, and Thanksgiving always boost e-commerce traffic. This increase in the number of online shoppers is clearly predictable several months in advance. The question is whether the intensity of the bursts can be forecast. Accurate forecasts can be quite valuable when e-business sites are faced with sudden growth in demand. For example, on a Super Bowl Sunday, a dotcom company got 500,000 unique visitors in the ten hours after its commercial aired. Although the Super Bowl date, the TV ad, and the expected number of visitors are known in advance, sites are commonly unprepared for post-Super Bowl traffic. In other words, some e-businesses fail to prepare for the traffic generated by their own advertising campaigns.

12.3.1. High Variability

We saw that e-business traffic is bursty. Bursts refer to the random arrival of requests, with peak rates exceeding the average rates by factors of eight to ten [8], as can be seen in Fig. 12.2. A practical consequence of burstiness is that site managers find it difficult to plan the capacity needed to support the demand created by load spikes. Spikes can be characterized by the peak traffic ratio, defined as the ratio of peak site traffic to average site traffic. In e-business sites, the peak traffic ratio varies according to the nature of the business and can easily reach values as high as twenty.
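The peak traffic ratio defined above is simple to compute from measured request rates. The following is a minimal sketch. The function names `peak_traffic_ratio` and `required_capacity`, the sample rates, and the average of 50 requests per second are all hypothetical. The ratios of 8, 10, and 20 are the ones mentioned in the text.

```python
def peak_traffic_ratio(rates):
    """Peak site traffic divided by average site traffic."""
    if not rates:
        raise ValueError("need at least one measurement")
    average = sum(rates) / len(rates)
    if average == 0:
        raise ValueError("average traffic is zero; ratio is undefined")
    return max(rates) / average


def required_capacity(average_rate, ratio):
    """Capacity needed to serve a peak that is `ratio` times the average."""
    return average_rate * ratio


# Hypothetical request rates (requests/second) for five measurement intervals.
rates = [10, 10, 10, 10, 160]
print(peak_traffic_ratio(rates))  # average is 40, peak is 160

# Capacity implied by a hypothetical average of 50 requests/second.
for ratio in (8, 10, 20):
    print(ratio, required_capacity(50, ratio))
```

A site sized only for its average load would, under these assumptions, need eight to twenty times that capacity to ride out a spike. This is why the ratio matters more for capacity planning than the average alone.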

The heavy-tailed distributions and high variability that exist on the Web can also be found in e-business. In a study of actual e-commerce logs covering 628,573 requests, the algorithm used to identify customer sessions found a significant number of very short sessions and a very small number of long sessions [7]. This is additional evidence of heavy-tailed distributions. The long sessions found in [7] were generated by robot accesses. The graphs in Fig. 12.2 confirm the high variability of the hit rates at both the hourly and the five-minute time scales.
