Addressing run-time problems

The mantra "if it compiles, it works" helps followers score amazingly well in time-to-market ratings for enterprise software development.

Take Jet.com as an example of a greenfield e-commerce platform implementation: it condensed the path from zero to a minimum viable product (MVP) into less than a year. The platform was released to production a bit more than a year after inception.

Does this mean following a functional-first approach is a software development silver bullet? Surely not on an absolute scale, although on a relative scale, the improvements are just great.

Why is the success not absolute? The thing is that practice requires a transition from lofty ideas to mundane implementation issues. No matter how accurate our implementations are, there are always dark corners where unexpected problems may lurk.

Let me demonstrate this with a sample taken from F# enterprise development practice at Jet.com. Jet is an innovative e-commerce platform that brings together many business areas, such as Internet ordering, retail selling, warehousing, finance, accounting, transportation, you name it. Each of these areas usually carries its own unique metadata classifications, so in order to run them side by side, one of the most common operations within the implementation is mapping. The generally accepted practice for unique, non-clashing identification is to use GUIDs, or Globally Unique Identifiers (https://en.wikipedia.org/wiki/Globally_unique_identifier).

Given that enterprise software quite frequently deals with dictionaries and caches that use GUIDs as access keys, let's look at how well the System.Guid implementation from the core .NET library serves this purpose.

Here is a quite simplistic exploratory implementation using a dictionary that has instances of the System.Guid type as keys. I create a dictionary of type IDictionary<Guid,int> with the standard F# core library and populate it with size pairs of (Guid,int), just for the sake of simplicity. Then I imitate a large number (trials) of random accesses to the dictionary, using the keys array as a level of indirection, and measure the performance. The following snippet shows the composition of the required code pieces (Ch13_4.fsx):

open System
open System.Collections.Generic

let size = 1000
// Keys live in an array so that a random index can pick one for each access
let keys = Array.zeroCreate<Guid> size
let mutable dictionary : IDictionary<Guid,int> =
    Unchecked.defaultof<IDictionary<Guid,int>>

// Fill the keys array with fresh GUIDs and build the dictionary from them
let generate () =
    for i in 0..(size-1) do
        keys.[i] <- Guid.NewGuid()
    dictionary <- seq { for i in 0..(size-1) -> (keys.[i],i) } |> dict
generate()

let trials = 10000000
let rg = Random()
let mutable result = 0
// Random.Next(n) excludes n, so Next(size) covers every index 0..size-1
for i in 0..trials-1 do
    result <- dictionary.Item(keys.[rg.Next(size)])

Running this snippet through FSI with timing turned on yields the performance indicators shown in the following screenshot (only the relevant output is shown for brevity):


Using native System.Guid to access a dictionary

Ten million accesses in 6.445 seconds translates into slightly more than 1.5 million accesses per second. Not too fast. Let's take it as the baseline. Another worrying sign is the number of garbage collections that took place: 287 over the 10,000,000 accesses is not light.
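The two indicators discussed above, elapsed time and garbage collection count, can also be captured in code rather than read off the FSI output. The following is only a minimal sketch: measure and accessesPerSecond are hypothetical helper names that do not appear in Ch13_4.fsx, and counting only generation 0 collections is an assumption made for brevity.

```fsharp
open System
open System.Diagnostics

// Hypothetical helper (not part of Ch13_4.fsx): runs `action` once and returns
// the elapsed seconds together with the number of gen-0 collections it caused.
let measure (action: unit -> unit) =
    let gen0Before = GC.CollectionCount(0)
    let sw = Stopwatch.StartNew()
    action ()
    sw.Stop()
    sw.Elapsed.TotalSeconds, GC.CollectionCount(0) - gen0Before

// Accesses per second for a given number of trials and elapsed time.
let accessesPerSecond (trials: int) (seconds: float) =
    float trials / seconds

// The baseline figure quoted above: 10,000,000 accesses in 6.445 seconds.
let baselineRate = accessesPerSecond 10000000 6.445
printfn "%.2f million accesses per second" (baselineRate / 1e6)
```

Wrapping the access loop in a function and passing it to measure would give figures comparable to those reported by FSI, although exact numbers will differ between machines and runtime versions.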

Without digging deeper into the causes of the observed code behavior here, let me just show the results of the investigation performed for Jet.com in an attempt to improve on this mark. I will introduce a simple change: instead of using the genuine System.Guid type, which is a quite complicated Windows system data structure, as a dictionary key, I will use the representation of the GUID value as the hexadecimal string that is left over when the canonical presentation is stripped of dashes. For example, the f4d1734c-1e9e-4a25-b8d9-b7d96f48e0f GUID will be represented as the f4d1734c1e9e4a25b8d9b7d96f48e0f string. This requires minimal changes to the previous snippet (Ch13_4.fsx):

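// String counterparts of the keys array and dictionary from the previous snippet
let keys' = Array.zeroCreate<string> size
let mutable dictionary' : IDictionary<string,int> =
    Unchecked.defaultof<IDictionary<string,int>>

// The "N" format renders each GUID as hexadecimal digits without dashes
let generate' () =
    for i in 0..(size-1) do
        keys'.[i] <- keys.[i].ToString("N")
    dictionary' <- seq { for i in 0..(size-1) -> (keys'.[i],i) } |> dict
generate'()

// Same access loop as before, now keyed by string; Next(size) covers all indices
for i in 0..trials-1 do
    result <- dictionary'.Item(keys'.[rg.Next(size)])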

Here, I just create a new keys' indirection layer from the corresponding elements of keys via a simple data conversion. Running the change in FSI brings a big surprise, reflected in the following screenshot:


Switching from System.Guid to string in order to access a dictionary

Compared to the baseline, the new access rate is 16.4 million accesses per second, almost 11 times better! Garbage collection also shows a five-fold improvement.

Now remember that mappings based on System.Guid are ubiquitous across the platform, and you can imagine the impact of this simple change on overall platform performance.
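The string representation used as the key can be checked in isolation. The sketch below is an illustration rather than part of Ch13_4.fsx, and the names guid, canonical, compact, and roundTripped are introduced only for it. It shows that the "N" format is the canonical form with the dashes removed, and that the original value can be recovered from it, so no information is lost by switching key types.

```fsharp
open System

// A fresh GUID in its canonical dashed form ("D") and its compact form ("N").
let guid = Guid.NewGuid()
let canonical = guid.ToString("D")   // 36 characters: 32 hex digits and 4 dashes
let compact = guid.ToString("N")     // 32 hex digits, no dashes

// Parsing the compact form back yields the original GUID value.
let roundTripped = Guid.ParseExact(compact, "N")

printfn "%s -> %s (round trip equal: %b)" canonical compact (roundTripped = guid)
```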
