Memory usage

Sometimes our use of memory (on either the heap or the stack) is naive and can exceed the available limits in certain scenarios, leading to Out Of Memory (OOM) crashes. For the heap, the most common source of errors in runtimes without garbage collection is memory leaks; that is, chunks of memory that were allocated but can no longer be referenced, and so are never freed. For the stack, an overflow can occur when, for example, recursion is used naively and the size of the data grows drastically.
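To make the stack case concrete, here is a minimal sketch that sums a linked list both recursively and iteratively. The `node` type and the function names are hypothetical and exist only for this illustration. The recursive version needs one stack frame per element, so its stack usage grows with the data, while the loop uses constant stack space.

```go
package main

import "fmt"

// node is a hypothetical singly linked list element used only for illustration.
type node struct {
	val  int
	next *node
}

// sumRecursive uses one stack frame per element, so stack usage grows
// linearly with the length of the list.
func sumRecursive(n *node) int {
	if n == nil {
		return 0
	}
	return n.val + sumRecursive(n.next)
}

// sumIterative walks the same list in a loop and uses constant stack space.
func sumIterative(n *node) int {
	total := 0
	for ; n != nil; n = n.next {
		total += n.val
	}
	return total
}

// buildList returns a list holding the values 1..length.
func buildList(length int) *node {
	var head *node
	for i := length; i >= 1; i-- {
		head = &node{val: i, next: head}
	}
	return head
}

func main() {
	list := buildList(1000)
	fmt.Println(sumRecursive(list), sumIterative(list)) // 500500 500500
}
```

Both functions return the same result for small inputs; the difference only shows up when the list becomes long enough for the recursive version to exhaust the stack limit of the runtime.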

In languages where memory allocation is managed by the runtime, Garbage Collector (GC) pauses are one of the key system aspects that can cause bottlenecks. Let's start with an overview of what typically happens during garbage collection:

  • There are typically a few GC roots. These are the starting points from which the program reaches its objects, such as the variables on the stack of the main driver program or static (global) objects.
  • Code running from these roots allocates memory, and the allocated objects reference one another. The GC runtime treats these allocations as a graph, with each graph component rooted at one of the GC roots.
  • At periodic intervals, the GC runs to reclaim memory and works in two phases:
    1. Phase 1 (mark): Starts from each of the roots and marks every node it can reach as used.
    2. Phase 2 (sweep): Reclaims the space of every chunk that was not marked.
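The two phases above can be sketched with a toy heap. This is an illustrative model under simplifying assumptions, not how any real runtime lays out memory; the `object` and `heap` types and their methods are hypothetical names introduced for the example.

```go
package main

import "fmt"

// object is a toy heap cell; refs holds the objects it points to.
type object struct {
	name   string
	refs   []*object
	marked bool
}

// heap is a hypothetical model: every allocated object plus the GC roots.
type heap struct {
	objects []*object
	roots   []*object
}

// mark is phase 1: flag everything reachable from the roots.
func (h *heap) mark() {
	stack := append([]*object(nil), h.roots...)
	for len(stack) > 0 {
		o := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if o.marked {
			continue // already visited; this also handles cycles
		}
		o.marked = true
		stack = append(stack, o.refs...)
	}
}

// sweep is phase 2: drop unmarked objects and reset marks for the next cycle.
// It returns the names of the reclaimed objects.
func (h *heap) sweep() []string {
	var freed []string
	live := h.objects[:0]
	for _, o := range h.objects {
		if o.marked {
			o.marked = false
			live = append(live, o)
		} else {
			freed = append(freed, o.name)
		}
	}
	h.objects = live
	return freed
}

func main() {
	a := &object{name: "a"}
	b := &object{name: "b"}
	c := &object{name: "c"} // never referenced from a root
	a.refs = []*object{b}
	h := &heap{objects: []*object{a, b, c}, roots: []*object{a}}
	h.mark()
	fmt.Println(h.sweep()) // [c]
}
```

In this sketch the application cannot run between `mark` and `sweep`, which mirrors the simple, pause-inducing form of the algorithm.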

In most early GC algorithms, phase 1 is a stop-the-world activity: the application is effectively locked out while marking runs. For low-latency applications, this is a problem, because the application is unresponsive for as long as the stop-the-world phase lasts.
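One way to see how much time a Go program has spent in such pauses is to read the runtime's memory statistics. The following is a small sketch using `runtime.ReadMemStats`; the helper name `gcPauseStats` and the package-level `sink` variable are assumptions made for the example, and the numbers printed will vary between machines and runs.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// sink lets the allocations below escape, so they are made on the heap.
var sink []byte

// gcPauseStats forces one collection and returns how many GC cycles have
// completed and the cumulative stop-the-world pause time so far.
func gcPauseStats() (cycles uint32, totalPause time.Duration) {
	runtime.GC() // blocks until a full collection has finished
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.NumGC, time.Duration(m.PauseTotalNs)
}

func main() {
	// Create some garbage so the collector has work to do.
	for i := 0; i < 1000; i++ {
		sink = make([]byte, 1<<16)
	}
	cycles, pause := gcPauseStats()
	fmt.Printf("GC cycles: %d, total pause: %v\n", cycles, pause)
}
```

Sampling these counters before and after a load test gives a rough idea of whether GC pauses are a meaningful part of the observed latency.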

There have been many improvements in GC algorithms and efforts to reduce this pause. For example, in Go 1.5, a new garbage collector (a concurrent, tri-color, mark-sweep collector) was built, based upon an idea first proposed by Dijkstra in 1978 (http://dl.acm.org/citation.cfm?id=359655). In the algorithm, every object is either white, grey, or black, and the heap is modeled as a graph of objects reachable from the various roots. At the start of a GC cycle, all objects are white, and the objects referenced directly by the roots are turned grey. The GC then repeatedly chooses a grey object, blackens it, and scans it for pointers to other objects. If an object it points to is white, the GC turns that object grey. This process (the GC cycle) continues until there are no more grey objects. At this point, white objects are known to be unreachable and are reclaimed.

The key difference is that the mark phase does not need to stop the world; it happens concurrently with the application running. This is achieved by the runtime maintaining the invariant that no black object points to a white object, which means that an object still in use is never left white and reclaimed by mistake, so there are no dangling pointers. To preserve the invariant, whenever a pointer on the heap is modified, the destination object is colored grey.
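The coloring scheme can be sketched in a few lines. This is a single-threaded toy model that only illustrates the colors and the invariant; it is not the Go runtime's implementation, and the `collector` type with its `shade`, `step`, and `writePointer` methods are hypothetical names. The interleaving of application and collector is simulated by calling `writePointer` between marking steps.

```go
package main

import "fmt"

type color int

const (
	white color = iota // not yet visited; candidate for reclamation
	grey               // reachable, but its pointers are not scanned yet
	black              // reachable and fully scanned
)

type object struct {
	name  string
	refs  []*object
	color color
}

// collector is a hypothetical toy model that tracks the grey worklist.
type collector struct {
	greys []*object
}

// shade turns a white object grey and queues it for scanning.
func (c *collector) shade(o *object) {
	if o.color == white {
		o.color = grey
		c.greys = append(c.greys, o)
	}
}

// step blackens one grey object and shades the white objects it points to.
// It reports false when no grey objects remain.
func (c *collector) step() bool {
	if len(c.greys) == 0 {
		return false
	}
	o := c.greys[len(c.greys)-1]
	c.greys = c.greys[:len(c.greys)-1]
	o.color = black
	for _, r := range o.refs {
		c.shade(r)
	}
	return true
}

// writePointer models a pointer store by the application: the destination
// is shaded so that a black object never points to a white one.
func (c *collector) writePointer(src, dst *object) {
	src.refs = append(src.refs, dst)
	c.shade(dst)
}

func main() {
	root := &object{name: "root"}
	late := &object{name: "late"}
	garbage := &object{name: "garbage"}

	c := &collector{}
	c.shade(root)              // objects referenced by roots start grey
	c.step()                   // root is now black
	c.writePointer(root, late) // the application runs mid-cycle
	for c.step() {
	}
	for _, o := range []*object{root, late, garbage} {
		fmt.Println(o.name, "reclaimable:", o.color == white)
	}
}
```

Without the `shade` call in `writePointer`, `late` would stay white even though the already-black `root` points to it, and it would be reclaimed while still in use.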

The result has been shown to be a pause reduction of as much as 85% (see Alan Shreve's production server graphs at https://twitter.com/inconshreveable/status/620650786662555648).
