Higher-Level Performance

This book is primarily concerned with helping you understand the basics of performance from a nuts-and-bolts perspective. It is critical to understand the cost of your building blocks before putting together larger applications. Everything covered so far in this book applies to most .NET application types, including those discussed in this appendix.

This appendix will take a step higher and give you a few brief tips for popular types of applications. I will not cover these topics in nearly as much detail as in the rest of this book, so think of this as a general overview to inspire further research. For the most part, these tips are not related to .NET, per se, but are architecture-, domain-, or library-specific.

ASP.NET

  • Disable unneeded HTTP modules.
  • Remove unused View Engines.
  • Do not compile as debug in production (check for the presence of <compilation debug="true"/> in web.config).
  • Reduce roundtrips between browser and server.
  • Ensure page buffering is enabled (it is enabled by default).
  • Understand and use caches aggressively:
    • OutputCache: Caches page output.
    • Cache: Caches arbitrary objects however you desire.
  • Understand how large your pages are (from a client perspective).
  • Remove unnecessary characters and whitespace from pages.
  • Use HTTP compression.
  • Use client-side validation to save on round-trips, and server-side validation to verify.
  • Disable ViewState, or limit it to small objects. If you must use it, compress it.
  • Turn off session state if it is not needed.
  • Do not use the Page.DataBind method.
  • Pool connections to backend servers, such as databases.
  • Precompile the website.
  • Check the Page.IsPostBack property to run code that only needs to happen once per page, such as initialization (see the sketch after this list).
  • Prefer Server.Transfer over Response.Redirect.
  • Avoid long-running tasks on request threads.
  • Avoid lock contention or blocking threads for any reason.
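
To illustrate the Page.IsPostBack tip above, here is a minimal Web Forms code-behind sketch. The page, the ProductsList control, and the LoadProducts helper are hypothetical; only the IsPostBack pattern itself is the point.

using System;
using System.Web.UI;
using System.Web.UI.WebControls;

// Code-behind for a hypothetical Products.aspx page.
public partial class ProductsPage : Page
{
  // Normally declared in the designer file; shown here so the sketch is self-contained.
  protected ListBox ProductsList;

  protected void Page_Load(object sender, EventArgs e)
  {
    // Bind data only on the initial GET request; postbacks reuse the
    // state the page already has instead of re-querying the backend.
    if (!IsPostBack)
    {
      ProductsList.DataSource = LoadProducts();
      ProductsList.DataBind();
    }
  }

  private static string[] LoadProducts()
  {
    // Placeholder for a real (ideally cached) data-access call.
    return new[] { "Widget", "Gadget" };
  }
}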

ADO.NET

  • Store connection, command, parameter, and other database-related objects in reused fields rather than re-instantiating them on each call of a frequently invoked method.
  • Pool network connections.
  • Ensure the database is properly structured and indexed.
  • Reduce round-trip queries to the backend database.
  • Cache whatever data you can locally, in memory.
  • Use stored procedures wherever possible.
  • Use paging for large data sets (i.e., do not return the entire data set at once).
  • Batch requests if possible.
  • Use DataView objects on top of DataSet objects rather than re-querying the same information.
  • Use a DataReader if you can live with a short-lived, forward-only iterator view of the data (see the sketch after this list).
  • Profile query performance using SQL Query Analyzer.
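
As a sketch of the DataReader tip above, the following method streams rows forward-only with System.Data.SqlClient. The connection string, table, and column names are hypothetical.

using System;
using System.Data.SqlClient;

static class OrderReader
{
  public static void PrintOrderTotals(string connectionString)
  {
    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand(
      "SELECT OrderId, Total FROM Orders WHERE Total > @minTotal", connection))
    {
      // Parameterize rather than concatenating values into the SQL text.
      command.Parameters.AddWithValue("@minTotal", 100m);
      connection.Open();

      // SqlDataReader streams rows forward-only, so only one row
      // needs to be materialized at a time.
      using (SqlDataReader reader = command.ExecuteReader())
      {
        while (reader.Read())
        {
          Console.WriteLine("{0}: {1:C}",
            reader.GetInt32(0), reader.GetDecimal(1));
        }
      }
    }
  }
}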

WPF

  • Use the latest version of .NET—significant performance improvements have happened through the years.
  • Never do significant processing on the UI thread.
  • Ensure there are no binding errors.
  • Have only the visuals you absolutely need. Too many transformations and layers will slow down rendering.
  • Reduce the size and depth of the visual tree.
  • Use the smallest animation frame rate you can get away with.
  • Use virtual views and lists to render only visible objects.
  • Consider deferred scrolling for long lists, if necessary.
  • StreamGeometry is faster than PathGeometry, but supports fewer features.
  • Drawing objects are faster than Shape objects, but support fewer features.
  • Update render transformations instead of replacing them, where possible.
  • Explicitly force WPF to load images to the size you want, if they will be displayed smaller than full-size.
  • Remove event handlers from objects to ensure they get garbage collected.
  • Override DependencyProperty metadata to customize when changing values will cause a re-render.
  • Freeze objects when you want to avoid change notification overhead (see the sketch after this list).
  • Prefer static resources over dynamic resources.
  • Bind to CLR objects with few properties, or create wrapper objects to expose a minimal set of properties.
  • Disable hit testing for large 3-D objects if not needed.
  • Recompile for the Universal Windows Platform, targeting Windows 10, to achieve significant performance improvements for free.
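
As a sketch of the freezing tip above (assuming a WPF project with the usual PresentationCore references), a brush created once and frozen avoids change notification overhead and can be shared freely. The BrushFactory class and the color are made up for illustration.

using System.Windows.Media;

public static class BrushFactory
{
  public static Brush CreateAccentBrush()
  {
    var brush = new SolidColorBrush(Color.FromRgb(0x33, 0x66, 0x99));

    // Freezing makes the brush immutable, so WPF can skip change
    // notification bookkeeping and share it safely across threads.
    if (brush.CanFreeze)
    {
      brush.Freeze();
    }

    return brush;
  }
}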

Big O Notation

At a layer above direct performance profiling lies algorithmic analysis. This is usually done in terms of abstract operations, relative to the size of the problem. Computer science has a standard way of referring to the cost of algorithms, called “Big O” analysis.

Big O

Big O notation, also known as asymptotic notation, is a way of summarizing the performance of algorithms based on problem size. The problem size is usually designated n. The “Big O”-ness of an algorithm is often referred to as its complexity. The term asymptotic is used as it describes the behavior of a function as its input size approaches infinity.

As an example, consider an unsorted array that contains a value we need to search for. Because it is unsorted, we will have to search every element until we find the value we are looking for. If the array is of size n, we will need to search, worst case, n elements. We say, therefore, that this linear search algorithm has a complexity of O(n).
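
A sketch of that linear search follows. In the worst case (the value is absent or sits in the last slot), the loop examines all n elements.

private static int LinearSearch(int[] array, int value)
{
  // Examine each element in turn until the value is found.
  for (int i = 0; i < array.Length; i++)
  {
    if (array[i] == value)
    {
      return i;  // Found after i + 1 comparisons.
    }
  }
  return -1;     // Not found: all n elements were compared.
}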

That is the worst case. On average, however, the algorithm will need to look at n/2 elements. We could be more accurate and say the algorithm is, on average, O(n/2), but this is not actually a significant change as far as the growth factor (n) is concerned. Constants are dropped, leaving us with the same O(n) complexity.

Big O notation is expressed in terms of functions of n, where n is the input size, which is determined by the algorithm and the data structure it operates on. For a collection, it could be the number of items in the collection; for a string search algorithm, it is the length of the respective strings.

Big O notation is concerned with how the time required to perform an algorithm grows with ever-larger input sizes. With our array example, we expect that if the array were to double in length, then the time required to search the array would also double. This implies that the algorithm has linear performance characteristics.

An algorithm with a complexity of O(n^2) would exhibit worse than linear performance. If the input doubles, the time is quadrupled. If the problem size increases by a factor of 8, then the time increases by a factor of 64: the time grows as the square of the input growth. This type of algorithm exhibits quadratic complexity. A good example of this is the bubble sort algorithm. (In fact, most naive sorting algorithms have O(n^2) complexity.)

private static void BubbleSort(int[] array)
{
  bool swapped;
  do
  {
    swapped = false;
    for (int i = 1; i < array.Length; i++)
    {
      if (array[i - 1] > array[i])
      {
        int temp = array[i - 1];
        array[i - 1] = array[i];
        array[i] = temp;
        swapped = true;
      }
    }
  } while (swapped);
}

Any time you see nested loops, it is quite likely the algorithm is going to be quadratic or polynomial (if not worse). In bubble sort’s case, the outer loop can run up to n times while the inner loop examines up to n elements on each iteration, so the complexity is O(n^2).

When analyzing your own algorithms, you may come up with a formula that contains multiple factors, as in O(8n^2 + n + C) (a quadratic portion multiplied by 8, a linear portion, and a constant-time portion). For the purposes of Big O notation, only the most significant factor is kept and multiplicative constants are ignored, so this algorithm would be regarded as O(n^2). Remember, too, that Big O notation is concerned with the growth of the time as the problem size approaches infinity. Even though 8n^2 is 8 times larger than n^2, the constant multiplier is not very relevant compared to the growth of the n^2 factor itself, which far outstrips every other factor for large values of n. Conversely, if n is small, the difference between O(n log n), O(n^2), or O(2^n) is trivial and uninteresting. Note that you can have complex significant factors such as O(n^2 · 2^n), where neither component involving n is removed (unless it really is trivial).

Many algorithms have multiple inputs and their complexity can be denoted with multiple variables, e.g., O(mn) or O(m + n). Many graph algorithms, for example, depend on the number of edges and the number of vertices.

The most common types of complexity are:

  • O(1) (Constant): The time required does not depend on the size of the input. Many hash table operations have O(1) complexity.
  • O(log n) (Logarithmic): Time increases with the logarithm of the input size. Any algorithm that cuts its problem space in half on each iteration exhibits logarithmic complexity (see the binary search sketch after this list). Note that there is no specific base for this log.
  • O(n) (Linear): Time increases in proportion to the input size.
  • O(n log n) (Loglinear): Time increases quasilinearly, that is, linearly multiplied by the logarithm of the input size.
  • O(n^2) (Quadratic): Time increases with the square of the input size.
  • O(n^C) (Polynomial): C is greater than or equal to 2.
  • O(C^n) (Exponential): C is greater than 1.
  • O(n!) (Factorial): Try every permutation.
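
As an example of logarithmic complexity, here is a sketch of binary search over a sorted array: each comparison discards half of the remaining range, so roughly log2(n) iterations are needed.

private static int BinarySearch(int[] sortedArray, int value)
{
  int low = 0;
  int high = sortedArray.Length - 1;
  while (low <= high)
  {
    int mid = low + (high - low) / 2;  // Avoids overflow of (low + high).
    if (sortedArray[mid] == value)
    {
      return mid;
    }
    if (sortedArray[mid] < value)
    {
      low = mid + 1;   // Discard the lower half.
    }
    else
    {
      high = mid - 1;  // Discard the upper half.
    }
  }
  return -1;           // Not found after about log2(n) iterations.
}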

Algorithmic complexity is usually described in terms of its average and worst-case performance. Best-case performance is not very interesting because, for many algorithms, luck can be involved (e.g., the best-case performance of linear search is O(1), but only because the algorithm happened to find the value in the first element it examined).

The following graph shows how fast time can grow based on problem size. Note that the difference between O(1) and O(log n) is almost indistinguishable even out to relatively large problem sizes. An algorithm with O(n!) complexity is almost unusable with anything but the smallest problem sizes.

Problem size vs. growth rate for various types of algorithms.

Though time is the most common dimension of complexity, space (memory usage) can also be analyzed with the same methodology. For example, most sorting algorithms are O(n log n) in time, but may require up to O(n) additional space. Few data structures use more space, complexity-wise, than the number of elements in the data structure.

Common Algorithms and Their Complexity

Sorting

  • Quicksort: O(n log n), O(n^2) worst case
  • Merge sort: O(n log n)
  • Heap sort: O(n log n)
  • Bubble sort: O(n^2)
  • Insertion sort: O(n^2)
  • Selection sort: O(n^2)

Graphs

  • Depth-first search: O(E + V) (E = Edges, V = Vertices)
  • Breadth-first search: O(E + V)
  • Shortest-path (using Min-heap): O((E + V) log V)

Searching

  • Unsorted array: O(n)
  • Sorted array with binary search: O(log n)
  • Binary search tree: O(log n)
  • Hash table: O(1)

Special Case

  • Computing every permutation of a string: O(n!)
  • Traveling salesman: O(n!) (Worst-case. There is actually a way to solve this in O(n^2 · 2^n) using dynamic programming techniques.)

Often, O(n!) is really just shorthand for “brute force, try every possibility.”
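
As a sketch of that brute-force approach, the following method prints every permutation of a string of distinct characters: n choices for the first position, n - 1 for the second, and so on, for n! permutations in total.

private static void PrintPermutations(string prefix, string remaining)
{
  if (remaining.Length == 0)
  {
    System.Console.WriteLine(prefix);
    return;
  }
  for (int i = 0; i < remaining.Length; i++)
  {
    // Fix one character in place and recursively permute the rest.
    PrintPermutations(prefix + remaining[i], remaining.Remove(i, 1));
  }
}

// Example: PrintPermutations("", "abcd") prints all 4! = 24 orderings.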
