Exploring concurrent computations

To a great extent, the reinvigorated industrial attention to functional programming, after many years of mostly academic interest, stems from the current state of computing hardware. On the one hand, the capabilities of contemporary computers make computer science findings that carried a pure-science flavor thirty years ago quite practical today, owing to an enormous increase in the speed and capacity of calculations. On the other hand, at the silicon level, the industry has hit the physical limit for further speeding up a single processor core. So the practical computation speed-up now happens along the lines of splitting a given amount of calculation between a group of processor cores working in close concert.

The feature review

The thing is that the brave new world of cheap multicore processors cannot afford expensive, error-prone, mentally onerous methods of programming. It demands concurrency-taming abstractions of a much higher level than the programming primitives that computer science developed in the era of the extensive growth of computing power.

These primitives have played their role in exposing the major problem behind concurrent calculations: such calculations are much less deterministic than the sequential calculations we are accustomed to. While indeterminism in sequential calculations is usually associated with imperfections of the physical environment executing them, the lack of determinism in concurrent calculations is intrinsic. This means that error-prone manipulation of the synchronization primitives coordinating multiple co-executing calculations offers plenty of ways to shoot yourself in the foot.
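As a minimal illustration (my sketch, not part of the book's code), unsynchronized mutation of shared state from several concurrently running asynchronous workflows produces a final value that may differ from run to run:

```fsharp
// A shared counter mutated without any synchronization.
let mutable counter = 0

// Four workflows each perform 100000 unsynchronized increments; lost updates
// make the final value nondeterministic, typically below the "expected" 400000.
let raceDemo () =
    counter <- 0
    [ for _ in 1..4 ->
        async { for _ in 1..100000 do counter <- counter + 1 } ]
    |> Async.Parallel
    |> Async.RunSynchronously
    |> ignore
    counter
```

Run `raceDemo ()` a few times in FSI: the counts you get back will almost certainly vary, which is exactly the intrinsic indeterminism discussed above.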

The most prominent example of self-imposed indeterminism is deadlock (https://en.wikipedia.org/wiki/Deadlock), where concurrent program parts lacking proper synchronization over shared resources may, under some conditions, lock each other up mutually and forever.
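To make the hazard concrete, here is a minimal sketch (mine, not the book's) of the classic recipe: two workers taking the same pair of locks in opposite order can block each other forever, while imposing a single global lock order removes the deadlock:

```fsharp
open System.Threading

let lockA, lockB = obj (), obj ()

// Deadlock-prone: worker1 takes A then B, worker2 takes B then A. Run
// concurrently, each may grab its first lock and then block forever on
// the second one, which the other worker already holds.
let worker1 () =
    lock lockA (fun () ->
        Thread.Sleep 100   // widen the window so worker2 can take lockB
        lock lockB (fun () -> "worker1 done"))

let worker2 () =
    lock lockB (fun () ->
        Thread.Sleep 100   // widen the window so worker1 can take lockA
        lock lockA (fun () -> "worker2 done"))

// Fix: every code path acquires the locks in the same global order (A before B).
let safeWorker name =
    lock lockA (fun () -> lock lockB (fun () -> sprintf "%s done" name))
```

Either worker is harmless when run alone; it is only their concurrent interleaving that can trigger the deadlock, which is precisely why such defects slip through testing.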

Much trickier (and potentially much more dangerous) are cases where the concurrent code misbehaves only under extremely rare conditions. This is really dangerous because such conditions may not show themselves during quality assurance and user acceptance testing. The defective code, carrying what is basically a time bomb, then gets released into production and, in full accordance with Murphy's Law, blows up at the most inappropriate moment.

The functional programming promise of better-quality concurrent programs is so dear to the industry today that many mainstream programming languages are being supplied with add-on functional features.

Before we look deeper into what F# offers to tame concurrency indeterminism, let's take a look at the distinctive facets under the common concurrency umbrella that are important to recognize:

  • Synchronous and asynchronous: The former, given a few expressions to evaluate, does not start evaluating the next expression before the previous one has completed. The latter allows execution to move between partially evaluated expressions.
  • Concurrent and parallel: Parallelism assumes simultaneous evaluation of more than one expression using multiple processing units, while concurrency may be asynchronous partial evaluation of a few expressions by a single processing unit.
  • Interactive and reactive: The former drives the external environment, while the latter responds to the demands of the external environment.

F# offers usage patterns that tame concurrency using a uniform mechanism of asynchronous expressions, or workflows (https://docs.microsoft.com/en-us/dotnet/articles/fsharp/language-reference/asynchronous-workflows). Concisely, an asynchronous expression, which is a specific form of the computation expression mentioned earlier, is written as follows:

async { expression } 

It has the generic type Async<'T>. In turn, the Async type exposes a bunch of functions that trigger the actual asynchronous evaluation of the wrapped expression following a few scenarios.
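Before composing anything fancy, it is worth seeing the separation between building an Async<'T> value and running it; this sketch (mine) exercises three of the standard triggers:

```fsharp
// Building an async expression evaluates nothing yet; it is just a recipe.
let answer : Async<int> = async { return 6 * 7 }

// Members of the Async type then trigger the actual evaluation:
let blocking = Async.RunSynchronously answer   // block the caller until done
let asTask   = Async.StartAsTask answer        // bridge into the .NET Task world
Async.Start (async { do! Async.Sleep 10 })     // fire-and-forget (Async<unit> only)

printfn "blocking = %d, task = %d" blocking asTask.Result
```

Note that `answer` can be re-run by any of these triggers as many times as you like; the async value itself is immutable and inert.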

This is a very elegant and straightforward mechanism indeed. It allows you to conceal the fact that the evaluation is going to be concurrent behind the familiar forms of function composition. For example, consider this innocuous code snippet:

[ for i in 91..100 -> async { return i * i }] // Async<int> list 
|> Async.Parallel // Async<int []> 
|> Async.RunSynchronously // int []  

It performs a rather loaded function composition, with the intermediary types shown as line comments. The first line, using a list comprehension, yields a list of Async<int> values. The Async.Parallel combinator then fans these out into a single Async<int []> computation that performs them in parallel. Finally, the Async.RunSynchronously combinator runs the whole composition and joins the asynchronously calculated results into the int [] array, yielding 10 numbers:

val it : int [] = 
  [|8281; 8464; 8649; 8836; 9025; 9216; 9409; 9604; 9801;
    10000|] 

I will not attempt to prove to you that the preceding snippet demonstrates performance gains from calculation parallelization. The evaluation is so cheap that the parallel snippet must in fact be slower than its sequential analog:

[for i in 91..100 -> i * i] 

This is because the parallel asynchronous arrangement introduces scheduling and coordination overhead that a straightforward sequential list comprehension does not incur.
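A quick way to convince yourself is to time both variants; this sketch (my code, with timings that will vary by machine) reuses the Stopwatch class:

```fsharp
open System.Diagnostics

// Run a thunk, returning its result together with the elapsed milliseconds.
let time f =
    let sw = Stopwatch.StartNew()
    let r = f ()
    sw.Stop()
    r, sw.ElapsedMilliseconds

// The same ten squares, computed sequentially and via parallel asyncs.
let seqResult, seqMs = time (fun () -> [ for i in 91..100 -> i * i ])
let parResult, parMs =
    time (fun () ->
        [ for i in 91..100 -> async { return i * i } ]
        |> Async.Parallel
        |> Async.RunSynchronously
        |> Array.toList)

printfn "sequential: %d ms, parallel: %d ms" seqMs parMs
// Expect the parallel run to be no faster here: scheduling the asyncs costs
// more than squaring ten integers.
```

Both variants produce the same list, of course; only the wall-clock cost differs.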

However, it all changes when we step into a territory that is dear to enterprise development, namely parallel I/O. The performance gains from I/O parallelization are the subject of the following demo problem, which illustrates the design pattern enabled by F# asynchronous calculations.

The demo problem

Let me build an I/O-bound application that demonstrates a really overwhelming speedup when the F# parallel async I/O pattern is applied. A good use case for this is SQL Server: its scaling capabilities allow you to reach persuasive improvements, in contrast to the multiple demos of concurrent web requests that F# authors and bloggers usually provide.

As an asynchronous concurrency vehicle, I'll be using the FSharp.Data.SqlClient type provider's SqlCommandProvider (https://github.com/fsprojects/FSharp.Data.SqlClient/blob/master/src/SqlClient/SqlCommandProvider.fs), which allows asynchronous querying with the help of the AsyncExecute() method.

I will create synchronous and asynchronous implementations of the same task of extracting data from SQL Server and then carry out a performance comparison to detect and measure the gains secured by applying the F# asynchronous I/O usage pattern.

The demo solution

For the sake of conciseness, the SQL-related part is going to be extremely simple. Execute the following T-SQL script against the (localdb)\ProjectsV12 database engine instance accompanying the installation of Visual Studio 2013, or any other Microsoft SQL Server installation available to you, given that it fulfills the type provider's system requirements (http://fsprojects.github.io/FSharp.Data.SqlClient/). It will create the necessary database components from scratch (Ch11_1.sql):

CREATE DATABASE demo --(1) 
GO 
 
Use demo  
GO  
 
SET ANSI_NULLS ON 
GO 
 
SET QUOTED_IDENTIFIER ON 
GO 
 
CREATE PROCEDURE dbo.MockQuery --(2) 
AS 
BEGIN 
  SET NOCOUNT ON; 
  WAITFOR DELAY '00:00:01' 
  SELECT 1 
END 
GO 

Here, the part marked (1) creates and prepares for use the instance of the demo database, and the part marked (2) puts the dbo.MockQuery stored procedure into this database. This stored procedure, which lacks input arguments, implements an extremely simple query: first, it introduces a time delay of 1 second, mocking some data search activity, and then it returns a single data row with the integer 1 as the execution result.

Now, I turn to commenting the F# script for the demo solution (Ch11_1.fsx):

#I __SOURCE_DIRECTORY__ 
#r @"../packages/FSharp.Data.SqlClient.1.8.1/lib/net40/FSharp.Data.SqlClient.dll" 
open FSharp.Data 
open System.Diagnostics 
 
[<Literal>] 
let connStr = @"Data Source=(localdb)\ProjectsV12;Initial Catalog=demo;Integrated Security=True" 
 
type Mock = SqlCommandProvider<"exec MockQuery", connStr> 
 
let querySync nReq = 
    use cmd = new Mock() 
    seq { 
        for i in 1..nReq do 
            yield (cmd.Execute() |> Seq.head) 
        } |> Seq.sum 
 
let query _ = 
    async { 
        // create the command inside the workflow so it is disposed only after
        // AsyncExecute completes, not as soon as query returns the Async value
        use cmd = new Mock() 
        let! resp = cmd.AsyncExecute() 
        return (resp |> Seq.head) 
    } 
 
let queryAsync nReq = 
    [| for i in 1..nReq -> i |] 
    |> Array.map query 
    |> Async.Parallel 
    |> Async.RunSynchronously 
    |> Array.sum 
 
let timing header f args = 
    let watch = Stopwatch.StartNew() 
    f args |> printfn "%s %s %d" header "result =" 
    watch.Stop() 
    let elapsed = watch.ElapsedMilliseconds 
    printfn "%s: %d %s %d %s" header elapsed "ms. for" args "requests" 

Tip

Consider that the preceding F# code taken literally will not compile because of a few line wraps introduced by typesetting. Instead, use the code part accompanying the book as the source of working F# code.

After referencing the type provider package and opening the required namespaces, the connStr value, decorated with the [<Literal>] attribute, provides both the design-time and run-time SQL Server connection string. This line might require modifications if you are using some other version of the database engine.

The next line delivers the type provider magic by introducing the SqlCommandProvider provided type Mock, ensuring statically typed access to the results of the wrapped query, which is represented by the stored procedure call, exec MockQuery, over our connStr connection string.

The querySync function that follows ensures sequential execution of the cmd command, represented by an instance of the provided Mock type: given the number of requests nReq, it yields a sequence of query results (each just the 1 from the single row of the result set) and then aggregates this sequence with Seq.sum. If we evaluate the querySync 10 expression, we may expect a bit more than a 10-second delay before getting back a single number, 10.

So far, so good. The query function that follows takes any argument and returns an asynchronous computation of type Async<int>. I put this function within the combined expression wrapped into the queryAsync function, which effectively represents the concurrent variant of querySync. Specifically, the array of nReq numbers is mapped into an Async<int> array of the same size; these computations are all fanned out by Async.Parallel, joined back after completion by Async.RunSynchronously, and eventually aggregated by Array.sum into a single number.

The last piece is the instrumentation higher-order function timing, which measures the duration of evaluating the f args computation and prints it in milliseconds.
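Since timing is generic in f, you can sanity-check it against a mock before pointing it at SQL Server; the mockQuery function below is my hypothetical stand-in, not part of the book's script, and the two commented calls show the real measurements:

```fsharp
open System.Diagnostics

// The timing helper from the script, repeated so this sketch is self-contained.
let timing header f args =
    let watch = Stopwatch.StartNew()
    f args |> printfn "%s %s %d" header "result ="
    watch.Stop()
    let elapsed = watch.ElapsedMilliseconds
    printfn "%s: %d %s %d %s" header elapsed "ms. for" args "requests"

// A hypothetical stand-in that just sleeps a little and echoes its argument.
let mockQuery nReq =
    System.Threading.Thread.Sleep 50
    nReq

timing "Mock" mockQuery 100
// Against the database from the script above:
// timing "Sync " querySync  100
// timing "Async" queryAsync 100
```

The mock run should report roughly 50 ms regardless of the argument, confirming the instrumentation before the much longer SQL runs.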

Alright; now, it is time to take our script for a spin. I load the code into FSI and measure the duration of executing querySync and queryAsync for 100 requests each. You can see the measurement results in the following screenshot:


Measuring synchronous versus asynchronous SQL querying

Are you as impressed as I am? The results show that I/O parallelization in the case of SQL queries improved performance approximately 100-fold!

This demo is quite persuasive, and I strongly recommend that you master and use this and other idiomatic F# concurrency patterns in your practical work.
