To a great extent, the industry's renewed attention to functional programming, after many years of largely academic interest, stems from the state of computing hardware. On the one hand, the capabilities of contemporary computers have made findings that seemed like pure science thirty years ago quite practical today, owing to an enormous increase in the speed and capacity of computation. On the other hand, at the silicon level, engineering has hit the physical limits of speeding up a single processor core. Practical speed-ups now come from splitting a given amount of computation across a group of processors working in close concert.
The thing is that the brave new world of cheap multicore processors cannot afford expensive, error-prone, mentally onerous methods of programming. It demands concurrency-taming abstractions of a much higher level than the low-level programming primitives developed by computer science in the era of the extensive growth of computing power.
These primitives have played their role in exposing the major problem behind concurrent computation: it is far less deterministic than the sequential computation we are accustomed to. Whereas nondeterminism in sequential computation is usually attributed to imperfections in the physical environment executing it, the lack of determinism in concurrent computation is intrinsic. This means that error-prone manipulation of low-level synchronization primitives across multiple co-executing computations offers plenty of ways to shoot yourself in the foot.
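To make this intrinsic nondeterminism concrete, here is a minimal F# sketch of my own (not from the chapter): four threads increment a shared counter, once without any synchronization and once atomically via Interlocked.Increment. The unsynchronized result varies from run to run; the atomic one is always exact.

```fsharp
open System.Threading
open System.Threading.Tasks

// counts.[0] is updated without synchronization: the read-modify-write
// is not atomic, so concurrent increments can be lost and the final
// value is nondeterministic. counts.[1] uses Interlocked.Increment,
// which makes every increment atomic and the result deterministic.
let counts = [| 0; 0 |]

let worker () =
    for _ in 1 .. 100000 do
        counts.[0] <- counts.[0] + 1                    // racy
        Interlocked.Increment(&counts.[1]) |> ignore    // atomic

let tasks = [| for _ in 1 .. 4 -> Task.Run(fun () -> worker ()) |]
Task.WaitAll tasks

printfn "racy = %d, atomic = %d" counts.[0] counts.[1]
// counts.[1] is always 400000; counts.[0] is typically smaller
```

The racy counter may even come out correct on a lucky run, which is exactly what makes such defects so hard to catch by testing.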
The most prominent example of self-imposed nondeterminism is deadlock (https://en.wikipedia.org/wiki/Deadlock), where concurrent program parts lacking proper synchronization over shared resources may, under some conditions, mutually lock each other out.
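As a hedged illustration of my own (not the chapter's code), here is a minimal F# sketch of the classic lock-ordering hazard. Two workers that acquire two locks in opposite orders can block each other forever; imposing a single global lock order removes the cycle. Only the safe ordering is actually executed below, since the hazardous one may hang.

```fsharp
open System.Threading.Tasks

let lockA = obj ()
let lockB = obj ()

// DEADLOCK-PRONE pattern (shown as a comment only): if worker 1 runs
//     lock lockA (fun () -> lock lockB (fun () -> ...))
// while worker 2 runs
//     lock lockB (fun () -> lock lockA (fun () -> ...))
// each may grab its first lock and then wait forever for the other's.

// Safe version: every worker takes the locks in the same global order,
// so no cyclic wait can ever form.
let mutable shared = 0

let worker () =
    for _ in 1 .. 1000 do
        lock lockA (fun () ->
            lock lockB (fun () ->
                shared <- shared + 1))

let tasks = [| for _ in 1 .. 2 -> Task.Run(fun () -> worker ()) |]
Task.WaitAll tasks
printfn "shared = %d" shared   // deterministically 2000
```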
Much trickier (and potentially far more dangerous) are cases where concurrent code misbehaves only under extremely rare conditions. This is truly dangerous because such conditions may never manifest themselves during quality assurance and user acceptance testing. The defective code, essentially carrying a "bomb", then gets released into production and, in full accordance with Murphy's Law, blows up at the most inopportune moment.
The functional programming promise for better quality concurrent programs is so dear to the industry today that many mainstream programming languages are getting supplied with add-on functional features.
Before we look deeper into what F# offers to tame concurrency nondeterminism, it is important to recognize the distinctive facets gathered under the common concurrency umbrella.
F# offers usage patterns that tame concurrency with a uniform mechanism of asynchronous expressions, also known as asynchronous workflows (https://docs.microsoft.com/en-us/dotnet/articles/fsharp/language-reference/asynchronous-workflows). Concisely, an asynchronous expression, which is a specific form of the computation expression mentioned earlier, is written in this form:
async { expression }
It has the generic type Async<'T>. In turn, the Async class has a bunch of functions that trigger the actual asynchronous evaluation of such an expression, following a few scenarios.
This is a very elegant and straightforward mechanism indeed. It allows you to conceal the fact that evaluation is going to be concurrent behind familiar forms of function composition. For example, consider this innocuous code snippet:
[ for i in 91..100 -> async { return i * i }] // Async<int> list
|> Async.Parallel                             // Async<int []>
|> Async.RunSynchronously                     // int []
It performs a rather loaded function composition, with the intermediate types shown as line comments. The first line uses a list comprehension to yield a list of Async<int> values; the Async.Parallel combinator then fans these out into a parallel Async<int []> computation; finally, the Async.RunSynchronously combinator runs it and joins the asynchronously computed values into an int [] array of results, yielding 10 numbers:
val it : int [] = [|8281; 8464; 8649; 8836; 9025; 9216; 9409; 9604; 9801; 10000|]
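The "few scenarios" for triggering evaluation mentioned above can be sketched as follows. This is my own minimal illustration: Async.RunSynchronously blocks the caller for the result, Async.StartAsTask bridges into the TPL as a Task<'T>, and Async.Start fires and forgets on the thread pool.

```fsharp
open System.Threading.Tasks

let answer = async { return 21 * 2 }

// Scenario 1: block the calling thread until the value arrives
let r1 = Async.RunSynchronously answer          // 42

// Scenario 2: hand the computation to the TPL as a Task<int>
let t : Task<int> = Async.StartAsTask answer
let r2 = t.Result                               // 42

// Scenario 3: fire-and-forget on the thread pool (no result returned)
Async.Start (async { return () })

printfn "%d %d" r1 r2
```

Note that the same async value can be triggered more than once; each trigger starts a fresh evaluation of the wrapped expression.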
I will not attempt to convince you that the preceding snippet demonstrates performance gains from parallelizing the calculation. The computation is so trivial that the parallel snippet must, in fact, be slower than its sequential analog:
[for i in 91..100 -> i * i]
This is because the parallel asynchronous arrangement of CPU-bound work introduces an overhead compared with a straightforward sequential list comprehension evaluation.
However, it all changes when we step into territory dear to enterprise development, namely parallel I/O. Performance gains from I/O parallelization are the subject of the following demo problem, which illustrates the design pattern enabled by F# asynchronous computations.
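Before turning to SQL Server, the shape of the gain can be previewed with a self-contained sketch of my own that replaces the database round trip with Async.Sleep. Each mock request "waits on I/O" for 200 ms, standing in for the 1-second WAITFOR DELAY query used in the demo below.

```fsharp
open System.Diagnostics

// A mock I/O-bound request: wait 200 ms without blocking a thread,
// then return 1 (like the single-row result set in the SQL demo).
let mockRequest _ = async {
    do! Async.Sleep 200
    return 1 }

let nReq = 10
let watch = Stopwatch.StartNew()

let total =
    [| for i in 1 .. nReq -> mockRequest i |]
    |> Async.Parallel
    |> Async.RunSynchronously
    |> Array.sum

watch.Stop()

// Sequentially the waits would add up to nReq * 200 = 2000 ms; fanned
// out with Async.Parallel they overlap, so elapsed stays near 200 ms.
printfn "total = %d in %d ms" total watch.ElapsedMilliseconds
```

Because Async.Sleep does not occupy a thread while waiting, all ten waits overlap freely, which is exactly the property that makes the pattern shine for I/O.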
Let me build an I/O-bound application that demonstrates a truly overwhelming speedup when the F# parallel async I/O pattern is applied. A good use case for this is SQL Server: its scaling capabilities allow more persuasive improvements than the demos of concurrent web requests that F# authors and bloggers usually provide.
As the asynchronous concurrency vehicle, I'll use the SqlCommandProvider from the FSharp.Data.SqlClient type provider (https://github.com/fsprojects/FSharp.Data.SqlClient/blob/master/src/SqlClient/SqlCommandProvider.fs), which supports asynchronous querying through its AsyncExecute() method.
I will create synchronous and asynchronous implementations of the same task of extracting data from SQL Server, and then carry out a performance comparison to measure the gains secured by applying the F# asynchronous I/O pattern.
For the sake of conciseness, the SQL-related part is going to be extremely simple. Executing the following T-SQL script against the (localdb)\ProjectsV12 database engine instance accompanying the installation of Visual Studio 2013, or any other Microsoft SQL Server installation available to you that fulfills the type provider's system requirements (http://fsprojects.github.io/FSharp.Data.SqlClient/), creates the necessary database components from scratch (Ch11_1.sql):
CREATE DATABASE demo --(1)
GO
USE demo
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE dbo.MockQuery --(2)
AS
BEGIN
  SET NOCOUNT ON;
  WAITFOR DELAY '00:00:01'
  SELECT 1
END
GO
Here, the part marked (1) creates the demo database and prepares it for use, and the part marked (2) puts the dbo.MockQuery stored procedure into this database. This stored procedure, which takes no input arguments, implements an extremely simple query: first, it introduces a delay of 1 second, mocking some data search activity, and then it returns a single data row containing the integer 1 as the execution result.
Now, I turn to commenting on the F# script for the demo solution (Ch11_1.fsx):
#I __SOURCE_DIRECTORY__
#r @"../packages/FSharp.Data.SqlClient.1.8.1/lib/net40/FSharp.Data.SqlClient.dll"

open FSharp.Data
open System.Diagnostics

[<Literal>]
let connStr = @"Data Source=(localdb)\ProjectsV12;Initial Catalog=demo;Integrated Security=True"

type Mock = SqlCommandProvider<"exec MockQuery", connStr>

let querySync nReq =
    use cmd = new Mock()
    seq { for i in 1..nReq do yield (cmd.Execute() |> Seq.head) }
    |> Seq.sum

let query _ =
    async {
        // 'use' belongs inside the async block so the command stays
        // alive until the asynchronous execution actually completes
        use cmd = new Mock()
        let! resp = cmd.AsyncExecute()
        return (resp |> Seq.head)
    }

let queryAsync nReq =
    [| for i in 1..nReq -> i |]
    |> Array.map query
    |> Async.Parallel
    |> Async.RunSynchronously
    |> Array.sum

let timing header f args =
    let watch = Stopwatch.StartNew()
    f args |> printfn "%s %s %d" header "result ="
    let elapsed = watch.ElapsedMilliseconds
    watch.Stop()
    printfn "%s: %d %s %d %s" header elapsed "ms. for" args "requests"
After loading the type provider package and opening the required libraries, the connStr value, decorated with the [<Literal>] attribute, serves as the SQL Server connection string at both design time and execution time. This line might require modification if you are using another version of the database engine.
The next line delivers the type provider magic by introducing the SqlCommandProvider-provided type Mock, ensuring statically typed access to the results of the wrapped query, represented by the stored procedure call exec MockQuery, over our connStr connection string.
The querySync function that follows executes the cmd command, an instance of the provided Mock type, sequentially the given number of times nReq, yielding a sequence of query results (each is just 1, taken from the single row of the result set), and then aggregates this sequence with Seq.sum. If we evaluate the expression querySync 10, we can expect a bit more than a 10-second delay before getting back a single number, 10.
So far, so good. The following query function takes any argument and returns an asynchronous computation of type Async<int>. I compose this function within the queryAsync function, which effectively represents the concurrent variant of querySync. Specifically, the array of nReq numbers is mapped into an Async<int> array of the same size; these computations are all fanned out by Async.Parallel, joined back after completion by Async.RunSynchronously, and eventually aggregated by Array.sum into a single number.
The last piece is the instrumentation: a higher-order timing function that measures and prints the duration, in milliseconds, of evaluating the f args computation.
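As a usage sketch, the timing function can be exercised with a hypothetical stand-in for the SQL queries (mockQuery below is my invention; the real calls need a database). The helper is repeated here so the sketch is self-contained.

```fsharp
open System.Diagnostics

// The timing helper from the script above: it prints the result of
// f args and then the elapsed milliseconds for the given request count.
let timing header f (args: int) =
    let watch = Stopwatch.StartNew()
    f args |> printfn "%s %s %d" header "result ="
    let elapsed = watch.ElapsedMilliseconds
    watch.Stop()
    printfn "%s: %d %s %d %s" header elapsed "ms. for" args "requests"

// Hypothetical stand-in for querySync/queryAsync: each "request" yields 1.
let mockQuery n = Array.sum [| for _ in 1 .. n -> 1 |]

timing "mock" mockQuery 100
// Against the real script one would call:
//   timing "sync " querySync 100
//   timing "async" queryAsync 100
```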
All right; now it is time to take our script for a spin. I load the code into FSI and measure the duration of executing querySync and queryAsync for 100 requests each. You can see the measurement results in the following screenshot:
Are you as impressed as I am? The results show that parallelizing the I/O of the SQL queries improved performance approximately 100-fold!
This demo is quite persuasive, and I strongly recommend that you master and use this and other idiomatic F# concurrency patterns in your practical work.