Foreword

by Vance Morrison

Kids these days have no idea how good they have it! At the risk of being branded an old curmudgeon, I must admit there is more than a kernel of truth in that statement, at least with respect to performance analysis. The most obvious example is that “back in my day” there weren’t books like this that capture both the important “guiding principles” of performance analysis and the practical complexities you encounter in real-world examples. This book is a gold mine and is worth not just reading, but re-reading as you do performance work.

For over 10 years now, I have been the performance architect for the .NET Runtime. Simply put, my job is to make sure people who use C# and the .NET runtime are happy with the performance of their code. Part of this job is to find places inside the .NET Runtime or its libraries that are inefficient and get them fixed, but that is not the hard part. The hard part is that 90% of the time the performance of applications is not limited by things under the runtime’s control (e.g., quality of the code generation, just-in-time compilation, garbage collection, or class library functionality), but by things under the control of the application developer (e.g., application architecture, data structure selection, algorithm selection, and just plain old bugs). Thus, my job is much more about teaching than programming.

So a good portion of my job involves giving talks and writing articles, but mostly acting as a consultant for other teams who want advice about how to make their programs faster. It is in the latter context that I first encountered Ben Watson over 6 years ago. He was “that guy on the Bing team” who always asked the non-trivial questions (and found bugs in our code, not his). Ben was clearly a “performance guy.” It is hard to express just how truly rare that is. Probably 80% of all programmers will go through most of their careers with only the vaguest understanding of the performance of the code they write. Maybe 10% care enough about performance that they have learned how to use a performance tool like a profiler at all. The fact that you are reading this book (and this Foreword!) puts you well into the elite 1% who really care about performance and really want to improve it in a systematic way. Ben takes this a number of steps further: He is not only curious about anything having to do with performance, he also cares about it deeply enough that he took the time to lay it out clearly and write this book. He is part of the .0001%. You are learning from the best.

This book is important. I have seen a lot of performance problems in my day, and (as mentioned) 90% of the time the problem is in the application. This means the problem is in your hands to solve. As a preface to some of my talks on performance, I often give this analogy: Imagine you have just written 10,000 lines of new code for some application, and you have just gotten it to compile, but you have not run it yet. What would you say is the probability that the code is bug free? Most of my audience quite rightly says zero. Anyone who has programmed knows that there is always a non-trivial amount of time spent running the application and fixing problems before you can have any confidence that the program works properly. Programming is hard, and we only get it right through successive refinement. Okay, now imagine that you spent some time debugging your 10,000-line program and now it (seemingly) works properly. But you also have some rather non-trivial performance goals for your application. What would you say the probability is that it has no performance issues? Programmers are smart, so my audience quickly understands that the likelihood is also close to zero. In the same way that there are plenty of runtime issues that the compiler can’t catch, there are plenty of performance issues that normal functional testing can’t catch. Thus, everyone needs some amount of “performance training,” and that is what this book provides.

Another sad reality about performance is that the hardest problems to fix are the ones that were “baked into” the application early in its design. That is because that is when the basic representation of the data being manipulated was chosen, and that representation places strong constraints on performance. I have lost count of the number of times people I consult with chose a poor representation (e.g., XML, or JSON, or a database) for data that is critical to the performance of their application. They come to me for help very late in their product cycle hoping for a miracle to fix their performance problem. Of course I help them measure and we usually can find something to fix, but we can’t make major gains because that would require changing the basic representation, and that is too expensive and risky to do late in the product cycle. The result is the product is never as fast as it could have been with just a small amount of performance awareness at the right time.

So how do we prevent this from happening to our applications? I have two simple rules for writing high-performance applications (which are, not coincidentally, a restatement of Ben’s rules):

  1. Have a Performance Plan
  2. Measure, Measure, Measure

The “Have a Performance Plan” step really boils down to “care about perf.” This means identifying what metric you care about (typically it is some elapsed time that human beings will notice, but occasionally it is something else), and identifying the major operations that might consume too much of that metric (typically the “high volume” data operation that will become the “hot path”). Very early in the project (before you have committed to any large design decision) you should have thought about your performance goals, and measured something (e.g., similar apps in the past, or prototypes of your design) that either gives you confidence that you can reach your goals or makes you realize that hitting your perf goals may not be easy and that more detailed prototypes and experimentation will be necessary to find a better design. There is no rocket science here. Indeed some performance plans take literally minutes to complete. The key is that you do this early in the design so performance has a chance to influence early decisions like data representation.

The “Measure, Measure, Measure” step is really just emphasizing that this is what you will spend most of your time doing (as well as interpreting the results). As “Mad-Eye” Moody would say, we need “constant vigilance.” You can lose performance at pretty much any part of the product cycle from design to maintenance, and you can only prevent this by measuring again and again to make sure things stay on track. Again, there is no rocket science needed—just the will to do it on an ongoing basis (preferably by automating it).

Easy, right? Well, here is the rub. In general, programs can be complex and run on complex pieces of hardware with many abstractions (e.g., memory caches, operating systems, runtimes, garbage collectors, etc.), so it really is not that surprising that the performance of such complex things can also be complex. There can be a lot of important details. There is the issue of errors, and what to do when you get conflicting or (more often) highly variable measurements. Parallelism, a great way to improve the performance of many applications, also makes the analysis of that performance more complex and subject to details like CPU scheduling that previously never mattered. The subject of performance is a many-layered onion that grows ever more complex as you peel back the layers.

Taming that complexity is the value of this book. Performance can be overwhelming. There are so many things that can be measured as well as tools to measure them, and it is often not clear what measurements are valuable, and what the proper relationship among them is. This book starts you off with the basics (set goals that you care about), and points you in the right direction with a small set of tools and metrics that have proven their worth time and time again. With that firm foundation, it starts “peeling back the onion” to go into details on topics that become important performance considerations for some applications. Topics include things like memory management (garbage collection), “just in time” (JIT) compilation, and asynchronous programming. Thus it gives you the detail you need (runtimes are complex, and sometimes that complexity shows through and is important for performance), but in an overarching framework that allows you to connect these details with something you really care about (the goals of your application).

With that, I will leave the rest in Ben’s capable hands. The goal of my words here is not to enlighten but simply to motivate you. Performance investigation is a complex area of the already complex field of computer science. It will take some time and determination to become proficient in it. I am not here to sugar-coat it, but I am here to tell you that it is worth it. Performance does matter. I can almost guarantee you that if your application is widely used, then its performance will matter. Given this importance, it is almost a crime that so few people have the skills to systematically create high-performance applications. You are reading this now to become a member of this elite group. This book will make it so much easier.

Kids these days—they have no idea how good they have it!

Vance Morrison

Performance Architect for the .NET Runtime

Microsoft Corporation
