WHAT'S IN THIS CHAPTER?
Like many other .NET languages, F# provides not only a set of primitive types for describing the basic atoms of data (strings, integers, and so on), but also a set of types for gathering those atoms into larger structures. These composite types are also built into the language, and in some cases directly mirror capabilities found within the .NET Base Class Library or the CLR.
In many cases, F# developers will find that these composite types can serve where normally developers in other languages would have to create a complete class type. For example, as with many functional languages, F# developers can find that a collection of functions plus a tuple type instance or list instance can be sufficient to model the problem domain, without having to create a standalone class to represent the data. Much of the F# library is built in precisely this manner, wrapped into a module (see Chapter 11 for lexical scoping).
The simplest composite type to understand is the option type, which is effectively an either/or type similar in some ways to the Boolean data type, but with an entirely different purpose and use.
Options are similar to Booleans in that there are only two acceptable values for a given option, None
, indicating an absence of value, and Some
, which indicates a value. As can be easily inferred from the description here, options are used as a replacement for the null-checking that object-oriented languages traditionally use to determine the difference between "nothing" and "something":
let nothing : string option = None let something : string option = Some("Ted Neward") System.Console.WriteLine("nothing = {0}", nothing) System.Console.WriteLine("something = {0}", something.Value)
As is demonstrated here, options are tied to another type, the actual type "carried" as part of the Some
value, which in this case, is a string. (The Option
type is a generic type, which should be familiar to .NET developers; generics in F# are discussed in Chapter 10). F# allows us two different ways to specify an option type, one using the T option syntax (as shown here), or to use a more .NET/BCL-like syntax, which may seem more comfortable to the C# crowd:
let nothing : Option<string> = None let something : Option<string> = Some("Ted Neward") System.Console.WriteLine("nothing = {0}", nothing) System.Console.WriteLine("something = {0}", something.Value)
It's important to note that the two syntaxes are entirely equivalent and produce exactly the same IL; the former is preferred among functional-language programmers, but the latter may be more comfortable for object-oriented developers until the functional style becomes more intuitive. (We'll see similar kinds of syntax when looking at list declarations later in this chapter.)
The option type serves as a simple type designed to differentiate between (as its name implies) "some" data and "no" data. Pragmatically, Option
acts as a measure to avoid developers having to worry about null-object values and the inevitable NullReferenceException
that gets thrown when forgetting to test for null before dereferencing the possibly-null value. Because None
is an object instance just as any other object is,[5] None
is perfectly acceptable as a target for comparisons and method calls:
let possibleValue = if (System.DateTime.Now.Millisecond % 2) = 0 then None else Some("Have a happy day!") if possibleValue.IsSome then System.Console.WriteLine("Ah, we got a good value. Good!") System.Console.WriteLine(possibleValue.Value)
This makes the use of Option
vastly superior to the use of null and goes a long way toward removing a significant source of exceptions thrown at runtime. (Consider this a prescriptive piece of advice: Prefer Option
to null as a return type indicating a lack of data or response.)
As is shown in the preceding code, testing an option value for Some
-ness can be done several different ways. The Option
type has several properties defined on it directly, such as the IsSome
property already shown, which returns true if the underlying Option
is a Some
of some value, just as IsNone
returns true if the underlying Option
instance is set to None
. Accessing the value of a Some
is done through the Value
property, but it is important to note that for None, Value
will return null, so only access it when IsSome
is true.
Any of the following code, when invoked, will throw an exception, so careless use of None
can still create NullReferenceExceptions
:
let nothing : string option = None if nothing.Equals(None) then System.Console.WriteLine("None.Equals(None)") System.Console.WriteLine(nothing.GetHashCode()) System.Console.WriteLine(nothing.ToString())
F# does this for efficiency and performance reasons, so although the compiler does attempt to make None appear as if it is a legitimate value, be careful when trying to peek too far under the hood. More often than not, using IsSome
(or relying on the functions described below) is the preferred approach to test for Some
-ness or None
-ness.
F# provides a suite of functions in the Option
module that also provide access to the members of the Option
type, and some enhanced capabilities over using the properties directly. For example, Option.iter
is a method that executes a passed-in function against an Option
instance, but only if that Option
instance is a Some
rather than a None
(in which case it does nothing):
let possibleValue = if (System.DateTime.Now.Millisecond % 2) = 0 then None else Some("Have a happy day!") Option.iter (fun o -> System.Console.WriteLine(o.ToString())) possibleValue
Following are some of the Option
module methods:
NAME | EFFECT |
---|---|
| Executes the function |
| Returns |
| Executes the function |
| Executes the function |
|
|
| Executes the function |
| Returns true if o is |
| Returns false if o is |
| Converts the |
| Converts the |
Observant readers will notice that many of these methods make more sense on collection classes, and much of the Option
API is designed around the concept that an Option
is a single-slot collection. This also creates a consistent API when working with other composite types (such as Lists
or Arrays
).
Tuples group two or more unnamed values into an ordered collection of values described as a single value:
let myName = ("Ted", "Neward") let myDescription = ("Ted", "Neward", 38, 98053)
Tuples are described by comma-separated values in between parentheses, so in the preceding snippet, the two tuples listed are of string and string
type, and of string and string and int and int
or more accurately, (string * string)
and (string * string * int * int)
in the F# syntax. These are two separate and unique types, even if they have no formal name assigned to them so that any attempt to use a (string * string)
where a (string * string * int * int)
is expected will fail.
Any tuple whose list of types — in both number and order — is equivalent to another is thus considered to be of the same type. Thus, in the following:
let myName = ("Ted", "Neward") let herName = ("Sarah", "Michelle", "Gellar")
the two values, myName
and herName
, are of entirely separate types and therefore incompatible, whereas in:
let myName = ("Ted", "Neward") let cityState = ("Phoenix", "AZ")
the values myName
and cityState
, despite representing obviously different intended values, are both (string * string)
, and therefore are type-equivalent. Formally, this kind of equivalence is known as structural equivalence, meaning that the two values are equal in terms of their contents, even if they may or may not be of the same type.[6] Thus, when the following is run:
System.Console.WriteLine("myName = herName? {0}", myName.GetType().Equals(herName.GetType())) System.Console.WriteLine("myName = cityState? {0}", myName.GetType().Equals(cityState.GetType()))
the resulting printed values will be False and True.
Accessing the individual values inside the tuple is typically done through one of three methods. First, the F# library provides functions (fst
and snd
) that return individual elements from inside the tuple. Second, pattern-matching can be used to extract elements from within the tuple (and is described in Chapter 6), or third, a new set of individual values can be bound based on values from inside the tuple.
The first approach uses F# library functions fst
and snd
("first" and "second") to extract either the first or second element from inside the tuple value:
let me = ("Ted", "Neward") let firstName = fst me let lastName = snd me System.Console.WriteLine("Hello, {0} {1}", firstName, lastName)
Each returns only that particular element, and restrictions on the fst
and snd
functions require that the tuple be a pair (2-element tuple). For this reason, any triple (3-element tuple) or tuple with elements beyond that will not be accessible via this approach.
The third approach uses new values to pull elements out of the tuple, as in the following:
let me = ("Ted", "Neward", 38, "Redmond", "WA") let (firstName, lastName, age, city, state) = me System.Console.WriteLine("Hello, {0} {1}", firstName, lastName)
This is actually a short form of pattern-matching (discussed in Chapter 6) so the use of the wildcard (_
) to ignore certain elements of the tuple also works:
let me = ("Ted", "Neward", 38, "Redmond", "WA") let (firstName, _, _, city, _) = me System.Console.WriteLine("Hello, {0}, how's {1}", firstName, city)
Note that any sort of name-binding can be used, so if we have an array (see the next section for more on arrays) of tuples that we need to extract values from individually, we can do so using for
again:
let people = [| ("Ted", "Neward", 38, "Redmond", "WA") ("Katie", "Ellison", 30, "Seattle", "WA") ("Mark", "Richards", 45, "Boston", "MA") ("Rachel", "Reese", 27, "Phoenix", "AZ") ("Ken", "Sipe", 43, "St Louis", "MO") ("Naomi", "Wilson", 35, "Seattle", "WA") |] for (firstName, lastName, _, _, _) in people do System.Console.WriteLine("{0} {1}", firstName, lastName)
Where other .NET languages tend to use custom value types (those types that inherit from System.ValueType
, also known as struct
s in C# or as Structure
s in Visual Basic), F# encourages the use of tuples instead. Note that it is still possible to create value types in F# (see Chapter 7 for details), but that should be reserved for the specific case where a new kind of "primitive type" needs to be created, rather than simply creating a "bundle of values," which is the province of the tuple type.
Arrays, as most .NET developers know, are homogenous collections that are laid out sequentially in memory, providing fast random-access capabilities. Because arrays of objects are just arrays of mutable references to objects, however, arrays are discouraged as constructs for use in functional programming. As a fully vested member of the CLR platform, the F# language provides a complete set of functionality to arrays that is similar to that for lists and other collection types, but aside from interoperability with the underlying CLR platform and assemblies written in other languages, F# code "in the wild" rarely uses it. For new F# programmers, the general recommendation is to reach for a list (seen next) rather than an array.
The simplest array to understand is the empty array, denoted by an empty pair of array brackets:
let emptyArray = [| |]
Arrays can also be initialized with contents by semicolon-separating the values, as in:
let arrayOfIntegers = [| 1; 2; 3; 4; |]
or by separating the contents onto their own line:
let arrayOfStrings = [| "Fred" "Wilma" "Barney" "Betty" |]
Either style works equally well, leaving the choice to be based primarily on which one reads more clearly to the developers involved.
If the array is to contain all the same value, the create
function from the Array
module can be used to construct an array:
let arrayOfZeroes = Array.create 10 0 let arrayOfTeds = Array.create 10 "Ted"
The Array.create
function takes an initial size and an initial value as parameters and constructs the array accordingly.
Arrays can also be initialized by range expressions, in which a starting and ending value are provided to the language, and it infers the rest, initializing the list with starting and ending values and the contents in between:
let arrayOfIntegers = [| 1 .. 10 |] // [ 1; 2; 3; ... 10; ]
The range expression can also have a "step" to it, an increment value that will be added to the starting value repeatedly so long as the value produced is smaller or equal to the ending value:
let arrayOfEvenIntegers = [| 0 .. 2 .. 10 |] // [ 0; 2; 4; ... 10; ]
The step can be any numeric step, and the range expression can also support floating-point values:
let arrayOfFloatsToTen = [ 0.0 .. 0.5 .. 10.0 ]
Note that as of this release, F# will continue to support floating-point range expressions, but a warning will be generated because the language designers reserve the right to remove that functionality in a later release.
For those cases where the "step" form of a range expression isn't quite powerful enough to create the array desired, such as creating an array of the squares of 1 through 10 (1, 4, 9, 16, and so on), you could construct an array using the Array.create
function and then fill the values with the desired squares using a looping construct:
let mutableArray = Array.create 10 0 for i = 0 to 9 do mutableArray.[i] <- i*i
Note that in F#, array indexing begins at 0, as God intended.
Although this is a "traditional" way to accomplish this, F# provides a better way, by using the looping construct directly inside of the array initializer:
let arrayBuiltList = [ for i in 1 .. 10 -> i * i ]
In this particular case, we use a second form of the "for" construct, a sequence expression, described later in this chapter in the section on sequences. The net result is that the loop is effectively expanded "inside" the initializer, each result creating a new value initialized into the array. In this case, the result is an array of int
, 10 cells in size, containing 1, 4, 9, 16, and so on.
The F# syntax for describing the type of the array is, as with Option
, to append "array" to the end of the type descriptor so that an array of integers in F# is described as int array
. This can be important in certain scenarios, where the array type must be explicitly described, such as when constructing an array of objects:
let (arrayOfObjects : obj array) = [| (1 :> obj) ("two" :> obj) (3.0 :> obj) |]
In this case, not only does the array itself need to be explicitly described as an obj array
(where obj
is the synonym for System
.Object
in F#), but the individual elements must also be explicitly upcast as obj
types.
Accessing the members of the array, as shown earlier in the discussion of looping through the list to initialize its members, is done through the [] method on the array:
let people = [| ("Ted", "Neward", 38, "Redmond", "WA") ("Mark", "Richards", 45, "Boston", "MA") ("Rachel", "Appel", 27, "Pittsburgh", "PA") ("Neal", "Ford", 43, "Atlanta", "GA") ("Naomi", "Wilson", 35, "Seattle", "WA") |] let thirdPerson = people.[2]
Note that the "dot" in the usage here is required — the F# language wants to be consistent, and so treats the [] as a method invocation, to keep it in line with the other methods, such as those inherited from System.Array
, that are accessible. (Fortunately, the F# compiler will "do the right thing" with this expression, turning it into a single-opcode expression when emitting the bytecode, so no performance is lost by doing so.)
Modifying the contents of the array is as easy as putting the access expression on the left side of the assignment operator:
// Happy Birthday, Mark! people.[2] <- ("Mark", "Richards", 46, "Boston", "MA")
As with any use of arrays on the .NET platform, this modifies the array by storing a new reference to an object and has no effect on the object originally referenced.
The Array
module provides a large number of methods that can operate on arrays and provides a much richer set of functionality for arrays than that provided by the BCL. Using them typically requires nothing more than to provide the array instance on which to operate and any parameters needed by the operation (such as the function to apply to each of the array elements). For example, to iterate over an array and display its contents, the C# or Visual Basic programmer may be tempted to write the iteration loop manually:
let array = Array.create 10 0 for i = 0 to array.Length - 1 do System.Console.WriteLine(array.[i])
Doing this requires the developer to extract the element out of the array using the integer index and can be awkward in certain situations. Like C# and Visual Basic, F# permits a form of for
to handle the details of iteration internally and simply provide the element to the body of the loop for processing:
for p in array do System.Console.WriteLine(p)
However, functional programmers find this more easily done by passing a (typically anonymous) function containing the "operation" code (the body of the loop, in this case) directly to the Array.iter
function, which iterates over the array, extracts the current element, and passes it in to the "operation" function for processing:
Array.iter (fun it -> System.Console.WriteLine(it.ToString())) array
It may seem odd to the traditional O-O developer to do this at first, but doing so actually creates more opportunities for reusability and easier extension. To understand why, imagine that the array is a large one, consisting of thousands of elements. To iterate over each in a serial fashion is a huge performance hit, particularly when each element is being processed independently and therefore could be processed on its own thread (and depending on the underlying hardware, CPU core). But to write the code that spins up threads (or borrows them from the system thread pool) can be awkward to write and hard to debug, and certainly won't be something we want to write twice.
Fortunately, if the parallel-iteration code is written once and placed into a function inside a module (calling it ParallelArray
, perhaps), we can modify the previous example to take advantage of it, yielding a (fictitious) example of:
ParallelArray.iter (fun it -> System.Console.WriteLine(it.ToString())) array
This is a powerful form of reusability and is supported in .NET 4 using Array
.Parallel
.
Some of the functions offered by the Array module are given next, grouped loosely by category.
These functions (in a general sort of way) operate on the array itself, rather than individual elements within the array, to do things such as create a new array, concatenate two arrays together, and so on. These are documented in the F# documentation, so look there for more details.
These functions operate on the contents of the array, typically by taking a function as a parameter and using it against each of the elements in the array to produce a result, either a single value or another array.
| Applies |
| Returns true if any of the elements in |
| Returns an array consisting of the elements in |
| Returns the first element in |
| Returns the index of the first element in |
| Applies |
| Applies |
| Returns true if all of the elements in |
| Applies |
| Applies |
| Applies |
| Applies |
| Applies |
| Applies |
| Returns the first element in |
| Returns the index of the first element in |
| Returns the index of the first element in |
There are some additional functions in the Array
module, but many of these are just additionally varied versions of the these functions (iter2
, for example, is just a form of iter
that can operate on two arrays simultaneously), or "bake in" the function to apply to each of the elements (such as the max and min functions, which do as their names imply).
Lists are an ordered collection of a single type and, like tuples, form a major backbone of functional programming. Internally, lists are singly linked lists, meaning that list construction and extraction is extremely fast and space-efficient. However, random access to elements in the list will be far slower than arrays; fortunately, most functional programming prefers recursion over iteration, and recursion works well to extract elements one at a time out of the list for processing.
The simplest list to understand is the empty list, denoted by an empty pair of brackets:
let emptyList = []
Lists that have something inside of them are vastly more interesting, however, and can be initialized by including values (again, all of the same type) in between the brackets, separated either by semicolons or carriage-returns:
let listOfIntegers = [ 1; 2; 3; 4; ] let listOfStrings = [ "Fred" "Wilma" "Barney" "Betty" ]
As with arrays, either style works equally well — although the former is preferred for short or easily initialized lists, the latter is useful when each element in the list takes up more space, either because it needs to be accompanied by a comment describing its contents, or for scenarios where each element's initialization takes up more than a few characters:
let listOfStrings = [ "Fred" // Flintstone "Wilma" // Flintstone "Barney" // Rubble "Betty" // Rubble ] let listOfPeopleTuples = [ ("Ted", "Neward", 38, "Redmond", "WA") ("Katie", "Ellison", 30, "Seattle", "WA") ("Mark", "Richards", 45, "Boston", "MA") ("Rachel", "Reese", 27, "Phoenix", "AZ") ("Ken", "Sipe", 43, "St Louis", "MO") ("Naomi", "Wilson", 35, "Seattle", "WA") ]
Like arrays, lists can also be initialized by range expressions:
let listOfIntegersToTen = [ 1 .. 10 ] // [ 1; 2; 3; ... 10; ] let listOfEvenIntsToTen = [ 0 .. 2 .. 10 ] // [ 0; 2; 4; ... 10; ] let listOfFloatsToTen = [ 0.0 .. 0.5 .. 10.0 ]
It is also possible to take an existing list and prepend new elements to it using the :: (pronounced "cons") operator, which takes an element on the left side and a list on the right side and produces a new list out of the results:
let consedList = 1 :: 2 :: 3 :: 4 :: []
Two things are important to note about consedList
: First, notice that the right-most element is the empty list. This is because the cons
operator must have a list as its right-most argument, or the operation will fail. This also leads to the second point, which is that the cons
operator is right-associative, meaning that the preceding code, written in a more explicit (and tedious) form, looks like
let consedList = 1 :: (2 :: (3 :: (4 :: [])))
which illustrates that in each case, the right item to the operator will be a list. Note that this also implies that for each operation, a new list object will be created and handed back, until the final list object created will be bound to consedList
. In this respect, lists behave similar to how .NET System
.String
objects behave, for many of the same reasons.
Developers with an eye on performance-sensitive code may find that last paragraph disconcerting. The implication that each cons
operation requires the contents of the list to be copied over into a new list, only to in turn be copied again into the new list after that, and so on, smacks of a horrible waste of effort.
Fear not. Two things make this situation more palatable. First, the list is a singly linked list, meaning that each value is stored in its own node. Second, because lists are immutable, they can share nodes across lists, thus reducing the total cost of "copying" the list. (It's worth noting that this "sharing" of nodes could never be possible in a mutable list, reinforcing the intrinsic usefulness of immutable data structures, making this use of immutable lists often much faster than working with mutable lists.)
For those cases where the "step" form of a range expression isn't quite powerful enough to create the list desired, such as creating a list of the squares of 1 through 10 (1, 4, 9, 16, and so on), you could use a looping construct (from Chapter 4) to build up a list, something along the lines of the following:
let mutable forBuiltList = [] for i = 1 to 10 do forBuiltList <- (i * i) :: forBuiltList
but this feels horribly awkward, for a variety of reasons. First, because lists are immutable, concatenating against forBuiltList
will produce only a new list, not modify the original, and thus needs to be captured into a mutable local variable (using the mutable keyword, described in Chapter 8) across each step in the loop. Second, because forBuiltList is
now a mutable reference, any developer can come along and modify the contents after its initial construction, creating a potential logic hole.
Fortunately, as with arrays, the F# language permits the use of a sequence expression inside of the list initializer:
let forBuiltList = [ for i in 1 .. 10 -> i * i ]
As with the array case, the result is a list of int
containing the squares of the values 1 through 10.
If you are diligently typing in the examples in as you read this book, you might notice that the two lists aren't exactly the same. Although they're both lists of int
containing the squares of 1 through 10, the first list is a list of squares counting down from 10, and the second is a list counting up from 1. This is because in the second case the elements are automatically appended to the list, whereas in the first case they're manually prepended. This just reinforces the idea that the second syntax is the preferential one to use, since it will more likely be what the developer really intended.
For those situations where a developer wants to smash two lists together, F# provides the @ (concatenation) operator, which takes two lists and produces a new list out of the combined contents:
let concattedList = listOfIntegers @ consedList // [ 1; 2; 3; 4; 1; 2; 3; 4; ]
In each of these cases, the resulting list is a list of a single type, whether int, string
or the tuple type (string * string * int * string * string)
.
Because lists are so common in F# code, the language provides a particular type syntax for describing lists, that of appending the list element type with the suffix "list." Thus, the respective types of the lists viewed so far would be described by F# as:
listOfIntegers : int list listOfStrings : string list listOfPeopleTuples : (string*string*int*string*string) list listOfFloatsToTen : float list consedList : int list forBuiltList : int list concattedList : int list
In certain scenarios, it will be helpful (or necessary) to create a list that contains more than one type, but trying the naïve approach fails miserably:
let notWorkingList = [ 1; "2"; 3.0; ]
This is because the compiler looks at the first element of the list, an int
, and assumes that the rest of the list should also be int
values, which the successive two values clearly are not.
For those situations demanding a "list of everything," the F# language allows for an "object list," though the syntax will often require a type descriptor to force the compiler to see it as such:
let (objectList : obj list) = [ (1 :> obj) ("2" :> obj) (3.0 :> obj) ]
The "upcast" operators are required here for the F# compiler to see them in their obj
form (instead of assuming them, as it would naturally, to be literals of their declared type — int, string
, and float
, respectively), and the parentheses around the upcast is necessary to see it all as one expression. (The awkwardness here is arguably a deliberate decision, because F# would much rather programmers figure out more strongly typed ways of interacting with lists.)
When initialized, elements of a list can be accessed using a variety of approaches.
The first approach is to use the Head and Tail properties on the list instance itself, which returns the first object in the list and the remainder of the list (as a list), respectively:
let people = [ ("Ted", "Neward", 38) ("Mark", "Richards", 45)
("Naomi", "Wilson", 38) ("Ken", "Sipe", 43) ] let peopleHead = people.Head System.Console.WriteLine(peopleHead)
In the case of a single-element list, Head will return that element, and Tail will return an empty list. Unlike the array, however, the list provides no efficient random-access operation — lists can only be accessed as head and tail elements. However, should the .NET developer desire access to the nth item of a list, it can be obtained via the "Item" indexer property defined on the List<>
class. However, doing so has two drawbacks. One, because the list is a singly linked list, accessing elements further from the head of the list will be increasingly slower. Two, doing so will potentially create confusion in those who read the code, because functional languages have not traditionally had arbitrary access to elements of the list. As a result, it's best to consider this method and its performance implications carefully before using.
The Head
and Tail
properties provide effectively the same result as the List
module methods [7] head
and tail
, respectively:
let people = [ ("Ted", "Neward", 38) ("Mark", "Richards", 45) ("Naomi", "Wilson", 38) ("Ken", "Sipe", 43) ] let firstPerson = List.head people System.Console.WriteLine(firstPerson)
Which of these two styles F# programmers should use depends somewhat on programmer aesthetics and comfort. Having said that, however, the programmer seeking to understand and master functional style will prefer the use of the List
methods because those can be partially applied; see Chapter 13 for more details.
Like arrays, lists can also be extracted through the use of variable bindings, but this is less common and potentially dangerous; for example, the following code will compile and run, but generates a warning that not all possible matches are accounted for. (This is discussed further in Chapter 6.)
let people = [ ("Ted", "Neward", 38) ("Mark", "Richards", 45) ("Naomi", "Wilson", 38) ("Ken", "Sipe", 43) ] let (personOne :: rest) = people System.Console.WriteLine(personOne)
As a result, this form is less used, in favor of recursively using Head/List.head
and Tail/List.tail
instead.
Lists can also provide access to the n-th item in the list using the .[]
method (the Item
property at the IL level) or the List
.nth
function, but this is rarely done in F# code, because each such access will require traversing the linked list from the beginning until the n-th item is found (O(n) performance). Given how trivially easy it is to convert between arrays and lists, using the toArray
or toList
functions found on each, in general it will be preferable to convert the list to an array to do random access.
The List
module (see Chapter 11 for details on modules and namespaces) provides a large number of functions
for use with lists, even more than those provided for Array
. As with the Array
module functions, the List
module functions frequently take a function as a parameter to apply to the various elements within the List
, and many of the same functions appear in both modules under the same names and declarations so as to maintain consistency.
Some of the List
functions appear next, loosely grouped by category.
These functions, like those listed for the Array
module, manipulate lists rather than the contents within them. In general, these functions create or copy lists, or do something to the list as a whole, in much the same way that the functions described in the Array
section do. There is a great deal of duplication here, so that the F# programmer familiar with how an operation works on an array already knows how an operation will work on a list, and vice versa.
These functions operate on the contents of the list, typically by taking a function as a parameter and invoking it once for each of the elements, passing that element in as a parameter (also known as "applying" the function to each of the elements), to produce a result, either a single value or with all the results grouped into another list.
| Applies |
| Returns true if any of the elements in |
| Returns a list consisting of the elements in |
| Returns the first element in |
| Returns the index of the first element in |
| Applies |
| Applies |
| Applies |
| Returns true if all of the elements in |
| Applies |
| Applies |
| Applies |
| Applies |
| Applies |
| Applies |
| Applies |
| Returns the first element in |
| Returns the index of the first element in |
There are some additional functions in the List
module, but many of these are just additionally varied versions of these functions (iter2, for example, is just a form of iter
that can operate on two lists simultaneously), or "bake in" the function to apply to each of the elements (such as the average function, which calculates the average of an array whose elements support addition and division operations).
Because using the functions described in the List
and Array
modules can be confusing at first to the traditional object-oriented programmer, a few examples may help clear up their use.
In C# and Visual Basic, when looking through an array (or list) of Person
objects for a Person
whose last name is "Neward", most programmers write a for-each
loop, looking through at the LastName
property of the Person
object to see if it matches the criteria in question. To do this in F#, assuming the existence of a class type called Person
was already defined, you would write instead:
let people = [| new Person("Ted", "Neward", 38) new Person("Mark", "Richards", 45) new Person("Ken", "Sipe", 43) new Person("Naomi", "Wilson", 38) new Person("Michael", "Neward", 16) new Person("Matthew", "Neward", 9) |] let newardsFound = Array.find (fun (it : Person) -> it.LastName = "Neward") people System.Console.WriteLine(newardsFound)
If, instead, the goal was to know all the people who were of drinking age, however, the find operation would want all the Person
s whose age was greater or equal than 21. Because this could be any number of Person
s, and all of them need to be returned, this is better returned as another array:
let drinkers = Array.filter (fun (it : Person) -> it.Age > 21) people
Of course, everybody who's over 21 deserves a beer, and that's something that needs to be applied to each element in the array:
Array.iter (fun (it : Person) -> System.Console.WriteLine("Have a beer {0}!", it.FirstName)) drinkers
The real power of this functional style becomes apparent when used with the pipeline operator (discussed in Chapter 13) to string each of these operations together:
people |< Array.filter (fun (it : Person) -> it.Age > 21) |< Array.iter (fun (it : Person) -> System.Console.WriteLine("Have a beer, {0}!", it.FirstName))
When the filtered and iterated functions are named, the code becomes almost a natural language:
people |> Array.filter isADrinker |> Array.iter haveABeer
Or the functions involved in the filter
and iter
operations can be named to make it look even more readable:
let isADrinker (ar : Person array) = Array.filter (fun (p : Person) -> p.Age > 21) ar let haveABeer (ar : Person array) = Array.iter (fun (p : Person) -> System.Console.WriteLine("Have a beer, {0}!", p.FirstName) ) ar people |> isADrinker |> haveABeer
Of course, all these operations are equally applicable to either arrays or lists, simply by either converting the type descriptors to use Person
list instead of Person
array, or in some cases by omitting the type descriptor entirely and allowing F# to infer the types.
F#, like most functional languages, is without peer when working with collections this way. Most C# and VB developers, after having used LINQ for a while, discover that many of the exact same concepts built into LINQ are here in F# (and other functional languages). Even more deeply, if the budding F# programmer wants to spend a few mind-blowing moments, they should go back to their old SQL code, replace the columns with tuples and the tables with lists of tuples, and see how quickly list access and manipulation in F# using functions feels almost identical to the canonical SQL operations (SELECT, INSERT, UPDATE, DELETE
).
More on using F# against a relational database is given in Chapter 19.
A sequence, according to the formal definition of the F# language, is simply a synonym for the IEnumerable
type defined in the .NET platform. As such, anytime an F# function produces an iterator across a collection, it is typically a sequence type. However, because functional languages have a rich history of using generators (functions that produce values, including functions that appear to be infinitely large or lazily computed) as the source of values to other functions, F# has a much wider suite of functions for sequence types (seq
) than the .NET developer might expect.
By default, the sequence type is just a producer of values; in other words, when a sequence is obtained, the sequence itself has no sense of what values it holds and will hand back — instead, it has code internally that knows how to produce the next-expected value. For this reason, sequences are frequently thought of as lazy, meaning they do not initialize to contain all the values they will eventually produce.[8]
By far, the easiest way to obtain a sequence is to create one, using the seq
keyword and a block of code that yield
s a result each time the sequence is asked to produce one, as in:
let x = seq { for i = 1 to 10 do yield i }
It's important to note that the sequence will not have yet produced any values — x
is simply a generator of values, and each time it is asked to generate the value, it will execute the next branch of the for
construct. This means that the loop inside of the sequence isn't really a loop at all, but a sequence of executable statements — contrary to what might seem obvious, there isn't a collection of 10 integers waiting to be handed back.
This becomes more obvious when we put some obvious side effect into the sequence loop, such as:
let y = seq { for i = 1 to 10 do System.Console.WriteLine("Generating {0}", i) yield i }
When run, no such "Generating" lines appear on the screen, because as of this point, the sequence has only been created, not asked to produce a value. The Console.WriteLine
action won't take place until the sequence is asked to produce a value, such as when we convert the sequence into an IEnumerable
and ask it for its current value:
let y = seq { for i = 1 to 10 do System.Console.WriteLine("Generating {0}", i) yield i } let yEnum = y.GetEnumerator() if yEnum.MoveNext() then System.Console.WriteLine(yEnum.Current)
This will produce "Generating 1" to the screen as soon as MoveNext()
is called, because it is at that point that the sequence is asked to produce its first value. No other value is generated, because only that particular value is needed — any future values will be waiting to be generated on demand.
Note that sequences can be "reset" at any time by obtaining a new IEnumerator
from the GetEnumerator()
method on the sequence, or by simply using the sequence in a different Seq
module call. It is important to realize, however, that it is the IEnumerator
that holds the "current state" of the sequence and not the sequence itself.
This means that F# can permit what other languages would consider to be "impossible" sequences, such as a sequence that never terminates. Imagine for a moment that we have a program that needs to simulate rolling three six-sided dice several times.[9] This can be approximated by an infinite sequence generating random numbers between 3 and 18, inclusive. We can generalize the function that creates it to generate random numbers between any minimum and maximum values:
let randomNumberGenerator minVal maxVal = let randomer = new System.Random() seq { while (true) do yield (randomer.Next(maxVal - minVal) + minVal) } let diceRolls = (randomNumberGenerator 3 18) |> Seq.take 6 Seq.iter (fun (roll : int) -> System.Console.WriteLine("You rolled a " + roll.ToString())) diceRolls
(See Chapter 13 for details on writing functions.) This code uses the Seq.take
function to obtain the first six values from the sequence, which should range between the values 3 and 18 and then takes that resulting sequence of six "die rolls" and hands that into the Seq.iter
function for printing each roll to the console.
In Chapter 4, we saw that for loops could be initialized with range expressions that made it fairly easy to iterate through a collection of numbers in a sequential stepped format. It turns out that the range expression is a kind of sequence expression so that the expression seq {0 .. 2};
produces a sequence consisting of the values 0, 1, and 2. However — the difference between this expression and the list or array examples we saw earlier is that, as discussed already, the sequence has not yet produced those values but will do so on demand.
As a side note, this behavior of the range expression is actually open to any user-defined type, not just numeric values. Any type that defines member methods (..)
and (.. ..)
, as discussed in the section on defining operator methods in Chapter 8, can be used in a range expression.
Combining sequences with a "for comprehension" is a trivial and natural thing to do, as in:
let squares = seq { for i in 0 .. 10 -> (i, i*i) }
This produces a sequence of int * int
tuples consisting of the numbers 0 through 10 and their squares; but again, the lazy nature of the sequence means that the body of the for
loop has not yet been fired but is waiting to be asked for the first value.
Sequences have broad application beyond just the computation of numbers, particularly anywhere a stream of data instances from outside the program itself is the principal data item. For example, developers frequently iterate through a directory or series of directories looking for files of a particular type for processing. If we consider the file itself to be the data item in question, then the directory or set of directories becomes the scaffolding for the sequence:
let dir d = let di = new System.IO.DirectoryInfo(d) seq { for fi in di.GetFileSystemInfos() -> fi }
This makes it easy to get all the files in the root directory on the system:
let rootFiles = dir "C:\"
Now, of course, given the sequence, it becomes trivial to walk through those files and display their names:
let printFileInfo (fi : System.IO.FileSystemInfo) = System.Console.WriteLine("{0}", fi.FullName) for fi in rootFiles do printFileInfo fi
But the real power of the sequence again comes in its laziness; normally, trying to build a list of all the files on the system requires the program to do a harsh bit of I/O gathering up all the data required, but if a sequence is generated instead of an ordinary list, no disk I/O is performed until the developer starts to look through the sequence:
let rec recursiveDir d = let di = new System.IO.DirectoryInfo(d) seq { for f in di.GetFiles() do yield f for sd in di.GetDirectories() do yield! recursiveDir sd.FullName } let allFiles = recursiveDir @"C:" for fi in allFiles do printFileInfo fi
A couple things are happening here simultaneously: One, the sequence is being built from two sources, rather than the single source expression that's been demonstrated so far, and two, rather than using the yield
keyword to return a single instance back to the sequence, we use the yield
! keyword to yield back a sequence into the sequence, essentially appending the results of that (recursively obtained) sequence to the one being generated. This second sequence, the one generating an additional sequence, must be the last expression in the sequence block. Any number of item-generating expressions can be used prior to that, however.
Like the array and list types, the sequence has a large number of functions defined in the Seq module and can be thought of in a few loosely grouped categories which are described next.
These functions, like those listed for the Array
module, manipulate lists rather than the contents within them. In general, these functions create or copy sequences, or do something to the sequence as a whole, in much the same way that the functions described in the Array
section do. There is a great deal of duplication here so that the F# programmer familiar with how an operation works on an array already knows how an operation will work on a sequence, and vice versa.
| Creates a new sequence containing the elements of |
| Creates a cached sequence with the same values in |
| Creates a sequence out of another sequence of the same type |
| Creates a single sequence consisting of all the elements in the sequence of sequences |
| Returns a sequence that is built by |
| Creates a sequence consisting of all the unique elements of the given sequence |
| Returns an empty list of the given type |
| Returns the head (first element) of |
| Creates a sequence that uses |
| Returns true if |
| Returns the length of |
| Returns the |
| Converts the array |
| Converts the list |
| Creates a new sequence that delegates to the passed sequence |
| Creates a sequence consisting of the single object |
| Creates a sequence out of the items of sequence |
| Creates a sequence out of the items of sequence |
| Creates a sequence out of the first |
| Creates a sequence out of the items of the sequence seq only if |
| Creates an array from the sequence |
| Creates a list from the sequence |
| Returns a sequence consisting of no more than n items from the sequence seq |
| Returns a sequence of "sliding windows" of the sequence |
These functions operate on the contents of the sequence, typically by taking a function as a parameter and invoking it once for each of the elements, passing that element in as a parameter (also known as "applying" the function to each of the elements), to produce a result, either a single value or with all the results grouped into another sequence.
| Applies |
| Compares each element of |
| Returns true if any of the elements in |
| Returns a sequence consisting of the elements in seq that, when passed in to |
| Returns the first element in seq that returns true when passed to |
| Returns the index of the first element in seq that returns true when passed to |
| Applies |
| Returns true if all of the elements in |
| Applies |
| Applies |
| Applies |
| Applies |
| Applies |
| Returns the first element in |
| Returns the index of the first element in seq for which |
There are some additional functions in the Seq
module, but many of these are just additionally varied versions of the preceding functions (iter2
, for example, is just a form of iter
that can operate on two sequences simultaneously), or "bake in" the function to apply to each of the elements (such as the average function, which calculates the average of a sequence whose elements support addition and division operations).
Maps, or dictionaries as the .NET FCL tends to call them, are collections of object-to-object pairings, most often used as name-value or key-value pairs, where the key is typically a string and the value is any particular type. Although not officially supported as an F# language construct, maps are commonly enough used that we can think of them as such, and have a number of interesting library supporting functions that make them almost trivial to use.
Creating a new map is relatively easy. If we visualize a map as a list of pairs, then the Map
.ofList
function makes it easy to transform a list of two-element tuples into a Map
:
let nicknames = Map.ofList [ "Ted",new Person("Ted", "Neward", 38); "Katie",new Person("Katie", "Ellison", 30); "Mike",new Person("Michael", "Neward", 16) ]
Officially, the type returned by this function is a Map<string, string>
, and the returned object is fully compatible with the rest of the .NET FCL — it implements the IDictionary<>
interface and the ICollection<>
interface yet is still "F#"-ish, in that this is also a sequence of System.Collections.Generic.KeyValuePair<string,string>
items.
Maps can also be constructed from arrays (using Map.ofArray
) or sequences (using Map.ofSeq
); the syntax is almost identical to the preceding code, using sequence or array notation as appropriate.
Like arrays and lists, maps are type-safe, type-parameterized constructs that refuse to accept something (as either key or value) that is not type-compatible, so a Map<string, string>
will not accept anything other than a string for the key or value. Also, like lists, when constructed, maps are immutable, so any "modification" operation (Add
or Remove
) on the map doesn't modify the contents of the map but returns a new map with the modification in it:
let moreNicknames = nicknames.Add("Mark", new Person("William","Richards",45))
An IDictionary<>
instance can also be constructed explicitly from a sequence of two-element tuples using the dict function:
let numberMappings = dict [ (1, "One"); (2, "Two"); (3, "Three") ]
However, despite the surface similarities, dict
doesn't return an F# map, per se — the returned object doesn't quite implement all the same APIs as the returned object from Map.ofList
, and as a result, the dict-returned object won't be acceptable as a parameter to the various Map
functions (listed next). This also has other implications; for example, calling Add()
on the dict
-returned object modifies the internal collection rather than returning a new one.
For the most part, prefer to use the Map
functions to create a map when writing F# code, and use dict
to create .NET FCL-compatible Dictionary<>
objects for easier interoperability with the rest of the .NET ecosystem. (See Chapter 18 for details on F#/.NET interoperability.)
Accessing the values in a map can be done in one of two ways, and just as with the list and array types, one of them is more .NET-like, and the other is more F#-like. Both rely conceptually on using the "key" of the pair to find the "value" of the pair.
Accessing values of the map in the .NET style involves using the built-in Item
property in the traditional manner, as if the collection were an array taking the key type as a parameter:
let ted = nicknames.["Ted"] System.Console.WriteLine(ted)
The other approach, preferred by functional programmers, involves the find
function defined on the Map
module:
let ted = Map.find "Ted" nicknames System.Console.WriteLine(ted)
The reason for the two different approaches will become more clear in Part 3, "Functional Programming," when functions and function composition is introduced and explained; for now, just recognize that either style yields exactly the same result, the Person
object whose value matches that of the string key passed in, or an exception if the key isn't found in the map:
try let noone = nicknames.["Katie"] System.Console.WriteLine(noone) with | ex -> System.Console.WriteLine("Katie not found") try let noone = Map.find "Katie" nicknames System.Console.WriteLine(noone) with | ex -> System.Console.WriteLine("Katie not found")
Exception-handling is discussed in Chapter 4.
After having read the section on the Option
type, it may seem that F# should support some kind of lookup operation that returns an Option
instance, either Some
(value) if found or None
if the key isn't present, rather than throw an exception in the case of failure. Said readers would be correct; again, two different forms of lookup-without-exception are available, one a more FCL-ish style, the other a more F#-ish style:
let notfound = nicknames.TryFind("Katie") System.Console.WriteLine( if notfound = None then "Not found" else notfound.Value.ToString()
) let notfound = Map.tryFind "Katie" nicknames System.Console.WriteLine( if notfound = None then "Not found" else notfound.Value.ToString() )
The more idiomatic way to write this would be to use a pattern-match on the returned Option, as discussed in Chapter 6.
Additionally, if you want to test to see if the key is in the map rather than return the value corresponding to the key, you can use either the ContainsKey()
instance method or the Map.tryFindKey
function.
The Map
module has a number of functions that can operate on maps, as shown here:
| Adds the pair ( |
| Returns true if |
| Returns true if there is at least one key/value pair in |
| Returns a new map containing the bindings for which |
| Finds the element |
| Finds the key for which |
| Executes |
| Similar to |
| Returns true if |
| Returns true if |
| Executes |
| Creates a new map by executing |
| Returns a new map made up of the elements in |
| Returns a new map made up of the elements in |
| Returns a new map made up of the elements in |
| Builds two maps (returned as a tuple), the first consisting of those key/value bindings which return true from |
| Returns the first element in |
| Returns a map without the pair given by the key |
| Converts |
| Converts |
| Converts |
| Like find, but without the exception, returning an option ( |
| Like |
| Like pick, but without the exception, returning an option ( |
Most of these are self-explanatory, particularly when compared to the similarly named methods of List
and Array
and after higher-order functions are introduced in Part 3.
Sets are another example of a type in F# that isn't implemented as a built-in type, yet has syntactic support enough to make it "feel" as it if were built-in, like maps. Fundamentally, a set is a strongly typed collection of objects that enforces a "no duplicates" principle:
let setOfPeople = Set.ofList [ new Person("Ted", "Neward", 38); new Person("Ted", "Neward", 38); new Person("Ted", "Neward", 38); ] for p in setOfPeople do System.Console.WriteLine(p)
When run, only one object will display, the other two having been determined to be duplicates and therefore discarded.
The object that is returned is a Set<>
, in this particular case, a Set<Person>
, and like the preceding map, implements many of the .NET FCL types that .NET developers would expect for a collection type: IComparable, ICollection
, and IEnumerable
to be precise. (See Chapter 9 for details on interface inheritance in F#.) It also implements the interface corresponding to sequences, so any place a sequence is expected, a correspondingly strongly typed set can be passed in its place.
Like the map, sets can also be constructed using the set function that takes a list and converts it to a set:
let setOfNicknames = set [ "Ted"; "Theo"; "Tio"; "Ted"; "Ted"; "Teddy" ] for p in setOfNicknames do System.Console.WriteLine(p)
In the case of set, however, the returned object is the exact same type as that returned from Set.ofList
(or Set.ofArray
or Set.ofSeq
, used to construct sets from arrays or sequences, respectively), so there is no practical difference between the two.
One thing that differentiates sets from other collections is that types that are placed into a set must be comparable somehow so that the set can determine if the object instance is already present inside the set. In F#, this means the type in question must implement the IComparable
interface (as described in Chapter 9), and as a result any type which doesn't support IComparable
, such as the base type System.Object
, won't go into a set; attempts to do so will generate an error, either at compile-time or at runtime.
Given that objects in a set are comparable, however, the set can provide some interesting additional functionality, such as finding the maximum and minimum value of all the items inside the set. These are available either as instance methods on the Set<> object, or as library methods from the Set module.
Again like its cousin the map, the set is an immutable collection, meaning that any attempt to modify the set will result in a new set being created and returned. More interestingly, however, beyond the simple add and remove from the set, the set also supports various "set theory" operations, such as intersection, union, difference, and testing for superset and subset comparison operations, all of which are possible because of the "comparison requirement" that sets impose on their contents.
In general, sets are good for domain objects, because domain objects frequently want or need to be unique within the domain, and the set operations can reduce the code developers need write.
F# provides a number of powerful composite types for the F# programmer, above and beyond those provided by the underlying .NET BCL. Lists are first-class citizens and the preferred way to collect a homogenous group of values together, but arrays are fully supported as well. Option types provide a safer means of dealing with the "no value" possibility, and tuple types provide a stronger method of working with a tightly grouped set of values without going to the trouble of creating a named, distinct type. Sequences are generic streams of objects and can either be a concrete finite set of data or lazily generated values produced on demand. Maps and sets are collection types that have some useful library support that allow them to look like language-supported built-in types and stand as an example of what additional functionality can be layered into F# without having to change the language itself, in addition to being useful entities in their own right.
[5] Technically, this isn't true — the None value is, in truth, null under the covers (for optimization reasons), but many of the F# APIs are written specifically to handle the null-ness silently. This lends the idea that None
is a "real" object value.
[6] In truth, this isn't a precise definition of the term, but it suffices for the discussion here.
[7] Note that in previous releases of F#, these were known as hd
and tl
, and lots of F# code and samples still use them as such.
[8] This is not quite what functional programmers think of when they use the term lazy, but it is a related concept and will feel pretty lazy to programmers not used to the functional lingo.
[9] Yes, I'm looking at all of you who used to (or still do) play Dungeons & Dragons.