Performance budgets

Another way to improve the UX of a service is to define a performance budget. We talked about performance budgets very briefly in Chapter 5, Testing and Releasing. The concept is that for every code change, you measure its effect on system performance. If a change pushes performance outside the agreed bounds, you stop working on features until you have modified the application to bring performance back within them.

In Chapter 6, Capacity Planning, we said that tying goals to performance is dangerous, because it is hard to make a change and know in advance what its performance impact will be. Instead, by measuring performance constantly and seeing how feature changes affect system-wide performance, your team can make a cost-benefit analysis to determine how much effort is needed to meet your performance goals.

Performance budgets were first introduced to me by Lara Hogan. Lara is a distinguished engineering manager, the author of Designing for Performance, and a prolific public speaker. The idea is that you define a maximum allowed page load time under certain conditions, such as WebPagetest using the Dulles location in Chrome on 3G. This defines both how speed is measured (WebPagetest is an online service for evaluating page loads) and under what conditions the test is run. If you're thinking that this sounds like an SLO, you are right! It is a metric we can evaluate and use to inform our decisions. Lara even suggests that you add the SLO check to your tests, so that if you break the requirement, you cannot deploy.
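A budget check of this kind can be quite small. The following is a minimal sketch in Python, not a definitive implementation: the PerformanceBudget class, the check_budget function, the 3,000 ms limit, and the sample measurements are all hypothetical, and the choice of the median as the summary statistic is an assumption. In practice, the measurements would come from whatever tool you use to run the page load test.

```python
from __future__ import annotations

from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class PerformanceBudget:
    """A maximum allowed page load time under fixed test conditions."""
    name: str
    conditions: str          # how and where the measurement is taken
    max_load_time_ms: float  # the budget itself (hypothetical value below)


def check_budget(budget: PerformanceBudget, samples_ms: list[float]) -> bool:
    """Return True if the median measured load time is within the budget."""
    if not samples_ms:
        raise ValueError("at least one measurement is required")
    return median(samples_ms) <= budget.max_load_time_ms


# Hypothetical budget and measurements, for illustration only.
homepage = PerformanceBudget(
    name="homepage",
    conditions="WebPagetest, Dulles location, Chrome, 3G",
    max_load_time_ms=3000,
)
measurements_ms = [2800.0, 2950.0, 3100.0]

if __name__ == "__main__":
    # A non-zero exit status fails the test stage, which blocks the deploy.
    if not check_budget(homepage, measurements_ms):
        raise SystemExit(f"{homepage.name} exceeds its performance budget")
    print(f"{homepage.name} is within budget")
```

Run as a step in the test suite, a script like this exits with an error when the budget is broken, which is one way to make a failed budget stop a deploy.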

Page speed matters in web UX for several reasons. First, many search engines, including Google, use page speed as one metric when evaluating a page's ranking. The slower the page is, the lower the ranking. Second, users hate slow pages (https://perspectives.mvdirona.com/2009/10/the-cost-of-latency/). In 2006, Marissa Mayer, the former Google and Yahoo executive, said when talking about Google, "Half a second delay caused a 20% drop in traffic." (https://glinden.blogspot.com/2006/11/marissa-mayer-at-web-20.html)

Users will also often abandon shopping carts that take too long to load or checkout screens that process their credit cards too slowly. Amazon found that 100ms of extra delay caused a 1% drop in sales (http://highscalability.com/blog/2009/7/25/latency-is-everywhere-and-it-costs-you-sales-how-to-crush-it.html). Google redid its test in 2009, and when it added 400ms of extra time to page results, it saw upwards of 0.5% fewer searches (https://ai.googleblog.com/2009/06/speed-matters.html).

You do not have to be running a website for performance to matter, though. For a car with an electric paddle shifter, you could define a performance budget that requires the delay between the paddle press and the gear shift to be less than 800ms.

In the US, there is the Transportation Security Administration (TSA), which controls access to the nation's airports. The TSA will occasionally measure the time it takes for random passengers to get through airport security lines, to measure the performance of its current processes and system load. If the time gets too high, it can open additional lines, so that people do not miss their flights.

As software has grown as an industry, another important factor has become part of the UX of a service, which relates closely to the SRE world: security.
