CHAPTER 6

Organizing, Annotating, and Quoting Code

An important part of any programming language is the ability to organize code into logical chunks. It's also important to be able to annotate code with notes about what it does, for future maintainers and even yourself.

It has also become common to use attributes and data structures to annotate assemblies and the types and values within them. Other libraries or the CLR can then interpret these attributes. I cover this technique of marking functions and values with attributes in the section "Custom Attributes." The technique of compiling code into data structures is known as quoting, and I cover it in the section "Quoted Code" toward the end of the chapter.

Modules

F# code is organized into modules, which are basically a way of grouping values and types under a common name. This organization has an effect on the scope of identifiers. Inside a module, identifiers can reference each other freely. To reference identifiers outside a module, the identifier must be qualified with the module name, unless the module is explicitly opened with the open directive (see "Opening Namespaces and Modules" later in this chapter).

By default, each module is contained in a single source file. The entire contents of the source file make up the module, and if the module isn't explicitly named, it gets the name of its source file, with the first letter capitalized. (F# is case sensitive, so it's important to remember this.) It's recommended that such anonymous modules be used only for very simple programs.

To explicitly name a module, use the keyword module. The keyword has two modes of operation. One gives the same name to the whole of the source file. The other mode gives a name to a section of a source file; this way several modules can appear in a source file.

To include the entire contents of a source file in the same explicitly named module, you must place the module keyword at the top of the source file. A module name can contain dots, and these separate the name into parts. For example:

module Strangelights.Foundations.ModuleDemo

You can define nested modules within the same source file. Nested module names cannot contain dots. After the nested module's name comes an equals sign followed by the indented module definition. You can also use the keywords begin and end to wrap the module definition. Submodules can themselves be nested. The following code defines three submodules: FirstModule, SecondModule, and ThirdModule. ThirdModule is nested within SecondModule.

#light
module FirstModule =
    let n = 1

module SecondModule =
    let n = 2
    module ThirdModule =
        let n = 3

Note that different submodules can contain the same identifiers without any problems. Modules affect the scope of an identifier. To access an identifier outside of its module, you need to qualify it with the module name so there is no ambiguity between identifiers in different modules. In the previous example, the identifier n is defined in all three modules. The following example shows how to access the identifier n specific to each of the modules:

let x = FirstModule.n
let y = SecondModule.n
let z = SecondModule.ThirdModule.n
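
As a small illustration of the begin and end keywords mentioned earlier, the following sketch wraps a module body explicitly instead of relying on indentation alone. The module name FourthModule and its value are hypothetical, chosen only for this example, and the sketch assumes a compiler that accepts begin and end in this position.

```fsharp
// FourthModule is a hypothetical name used only for this sketch.
module FourthModule =
    begin
        let n = 4
    end

// The value is still reached through its qualified name.
let w = FourthModule.n
```

The qualified name works exactly as it does for the indentation-only form, so the choice between the two is a matter of style.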

A module will be compiled into a .NET class, with the values becoming methods and fields within that class. You can find more details about what an F# module looks like to other .NET programming languages in "Calling F# Libraries from C#" in Chapter 13.

Namespaces

Namespaces help organize your code hierarchically. To help keep module names unique across assemblies, the module name is qualified with a namespace name, which is just a character string with parts separated by dots. For example, F# provides a module named List, and the .NET BCL provides a class named List. There is no name conflict, because the F# module is in namespace Microsoft.FSharp, and the BCL class is in namespace System.Collections.Generic.

It's important that namespace names be unique. The most popular convention is to start namespace names with the name of a company or organization, followed by a specific name that indicates functionality. Although it's not obligatory to do this, the convention is so widely followed that if you intend to distribute your code, especially in the form of a class library, then you should adopt it too.


Note Interestingly, at the IL level, there is no real concept of namespaces. The name of a class or module is just a long identifier that might or might not contain dots. Namespaces are implemented at the compiler level. When you use an open directive, you tell the compiler to do some extra work: to qualify your identifiers with the given name where it needs to and to see whether this results in a match with a value or type.


In the simplest case, you can place a module in a namespace simply by using a module name with dots in it. The module and namespace names will be the same. You can also explicitly define a namespace for a module with the namespace directive. For example, you could replace the following:

module Strangelights.Foundations.ModuleDemo

with this to get the same result:

namespace Strangelights.Foundations
module ModuleDemo

This might not be too useful for modules, but as noted in the previous section, submodule names cannot contain dots, so you use the namespace directive to place submodules within a namespace. For example:

#light
namespace Strangelights.Foundations
module FirstModule =
    let n = 1

module SecondModule =
    let n = 2
    module ThirdModule =
        let n = 3

This means that, once compiled, the first instance of n will be accessible to the outside world using the identifier Strangelights.Foundations.FirstModule.n instead of just FirstModule.n.

It's possible to define a namespace without also using a module directive, but then the source file can contain only type definitions. For example:

#light
namespace Strangelights.Foundations

type MyRecord = { field : string }

The following example will not compile, because you can't place a value definition directly into a namespace without explicitly defining a module or submodule within the namespace.

#light
namespace Strangelights.Foundations

let value = "val"
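
One way to make the previous example compile is to wrap the value in a submodule of the namespace. This is only a sketch: the module name Values is a hypothetical name introduced here, not one used elsewhere in the chapter.

```fsharp
namespace Strangelights.Foundations

// Values is a hypothetical module name introduced for this sketch.
module Values =
    let value = "val"
```

Code outside the namespace would then refer to the value as Strangelights.Foundations.Values.value.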

In fact, the namespace directive has some interesting and subtle effects on what your code looks like to other languages; I cover this in "Calling F# Libraries from C#" in Chapter 13.

Opening Namespaces and Modules

As you have seen in the previous two sections, to specify a value or type that is not defined in the current module, you must use its qualified name. This can quickly become tedious because some qualified names can be quite long. Fortunately, F# provides the open directive so you can use simple names for types and values.

The open keyword is followed by the name of the namespace or module you want to open. For example, you could replace the following:

#light
System.Console.WriteLine("Hello world")

with this:

#light
open System

Console.WriteLine("Hello world")

Note that you don't need to specify the whole namespace name. You can specify the front part of it and use the remaining parts to qualify simple names. For example, you can open System.Collections rather than the full namespace, System.Collections.Generic, and then use Generic.Dictionary to create an instance of the generic Dictionary class, as follows:

#light
open System.Collections

let l = new Generic.Dictionary<string, int>()



Caution The technique of using partially qualified names, such as Generic.Dictionary, can make programs difficult to maintain. Either use the name and the full namespace or use just the name.


You can open F# modules, but you cannot open classes from non-F# libraries. If you open a module, it means you can reference values and types within it by just their simple names. For example, the following finds the length of an array and binds this to the identifier len:

#light
open Microsoft.FSharp.MLLib.Array

let len = length [| 1 |]

Some argue that this ability to open modules directly should be used sparingly because it can make it difficult to find where identifiers originated. In fact, modules can typically be divided into two categories: those that are designed to be accessed using qualified names and those that are designed to be opened directly. Most modules are designed to be accessed with qualified names; a few, such as Microsoft.FSharp.Core.Operators, are designed to be directly opened. The next example shows the using function from Operators being used directly:

#light
open System.IO
open Microsoft.FSharp.Core.Operators

using (File.AppendText("text.txt") ) (fun stream ->
    stream.WriteLine("Hello World"))

This example is slightly bogus because the Microsoft.FSharp.Core.Operators module is opened by the compiler by default. I used it because modules designed to be opened directly are a rarity, since it's almost always useful for anyone reading the code to know where a function originated.

If you open two namespaces that contain modules or classes of the same name, it won't cause a compile error. You can even use values from the modules or classes with the same name, as long as the names of the values are not the same. In Figure 6-1, the namespace System is opened. It contains the class Array, and a module, Array, is also available in F#'s libraries. In the figure you can see both static methods from BCL's Array class, all starting with a capital letter, and values from F#'s Array module, which start with a small letter.

Figure 6-1. Visual Studio's IntelliSense
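
The same situation can be sketched in code. The following example assumes that, with System opened, the name Array resolves both to F#'s Array module and to the BCL's System.Array class, as the figure describes; the identifiers numbers and len are chosen only for this illustration.

```fsharp
open System

let numbers = [| 3; 1; 2 |]

// Lowercase name: the length function from F#'s Array module.
let len = Array.length numbers

// Capitalized name: the static Sort method of the BCL's System.Array class,
// which sorts the array in place.
Array.Sort(numbers)

printfn "%d items: %A" len numbers
```

Because the two sets of names differ in case, the compiler can tell which Array is meant in each call.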

Giving Namespaces and Modules Aliases

It can sometimes be useful to give an alias to a namespace or module to avoid naming clashes. This is useful when two modules share the same name or contain values with the same name. It can also be a convenient way of switching some of your code between two different implementations of similar modules. The need arises more often when using libraries not written in F#.

The syntax for this is the module keyword followed by an identifier, then an equals sign, and then the name of the namespace or module to be aliased. The following example defines GColl as the alias for the namespace System.Collections.Generic:

#light
module GColl = System.Collections.Generic

let l = new GColl.List<int>()
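
The same syntax applies to modules. The following is a minimal sketch: the alias Lst is a hypothetical name, and the full module path Microsoft.FSharp.Collections.List is an assumption about the library layout that may differ between F# releases.

```fsharp
// Lst is a hypothetical alias; the module path on the right is an
// assumption about where the List module lives in the F# library.
module Lst = Microsoft.FSharp.Collections.List

let total = Lst.length [1; 2; 3]
```

Switching to a different implementation with a compatible set of functions would then mean changing only the line that defines the alias.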

Signature Files

Signature files are a way of making function and value definitions private to a module. You've already seen the syntax for the definition of a signature file in Chapter 2. It is the source that the compiler generates when using the -i switch. Any definitions that appear in a signature file are public and can be accessed by anyone using the module. Any that are not in the signature file are private and can be used only inside the module itself. The typical way to create a signature file is to generate it from the module source and then go through and erase any values and functions that you want to be private.

The signature file name must be the same as the name of the module with which it is paired. It must have the extension .fsi or .mli. You must specify the signature file to the compiler. On the command line, you must give it directly before the source file for its module. In Visual Studio, the signature file must appear before the source file in Solution Explorer.

For example, if you have the following in the file Lib.fs:

#light
let funkyFunction x =
    x + ": keep it funky!"

let notSoFunkyFunction x = x + 1

and you want to create a library that exposes funkyFunction but not notSoFunkyFunction, you would use the signature code like this:

val funkyFunction : string -> string

and would use the command line like this:

fsc -a Lib.fsi Lib.fs

which results in an assembly named Lib.dll with one class named Lib, one public function named funkyFunction, and one private function named notSoFunkyFunction.

Module Scope

The order in which modules are passed to the compiler is important, because it affects the scope of identifiers within the modules and the order in which the modules are executed. I cover scope in this section and execution order in the next.

Values and types within a module cannot be seen from another module unless the module they're in appears on the command line before the module that refers to them. This is probably easier to understand with an example. Suppose you have a source file, ModuleOne.fs, containing the following:

#light
module ModuleOne

let text = "some text"

and another module, ModuleTwo.fs, containing the following:

#light
module ModuleTwo

print_endline ModuleOne.text

These two modules can be compiled successfully with the following:

fsc ModuleOne.fs ModuleTwo.fs -o ModuleScope.exe

But the following command:

fsc ModuleTwo.fs ModuleOne.fs -o ModuleScope.exe

would result in this error message:


ModuleTwo.fs(3,17): error: FS0039: The namespace or module 'ModuleOne' is not defined.

This is because ModuleOne is used in the definition of ModuleTwo, so ModuleOne must appear before ModuleTwo in the command line, or else ModuleOne will not be in scope for ModuleTwo.

Visual Studio users should note that the order in which files appear in Solution Explorer is the order that they are passed to the compiler. This means it is sometimes necessary to spend a few moments rearranging the order of the files when adding a new file to a project.

Module Execution

Roughly speaking, execution in F# starts at the top of a module and works its way down to the bottom. Any values that are not functions are calculated, and any top-level statements are executed. So, the following:

module ModuleOne

print_endline "This is the first line"

print_endline "This is the second"

let file =
    let temp = new System.IO.FileInfo("test.txt") in
    Printf.printf "File exists: %b " temp.Exists;
    temp

will give the following result:


This is the first line
This is the second
File exists: false

This is all pretty much as you might expect. Things are different when a source file is compiled into an assembly: none of the code in it will execute until a value from it is used by a currently executing function. Then, when the first value in the file is touched, all the let expressions and top-level statements in the module will execute in their lexical order. When a program is split over more than one module, the last module passed to the compiler is special. All the items in this module will execute, and the items in the other modules will behave as if they were in an assembly. Items in other modules will execute only when a value from that module is used by the module currently executing. Suppose you create a program with two modules.

This is ModuleOne.fs:

#light
module ModuleOne

print_endline "This is the third and final"

This is ModuleTwo.fs:

#light
module ModuleTwo

print_endline "This is the first line"

print_endline "This is the second"

If this is compiled with the following:

fsc ModuleOne.fs ModuleTwo.fs -o ModuleExecution.exe

this will give the following result:


This is the first line
This is the second

This might not be what you expected, but it is important to remember that since ModuleOne was not the last module passed to the compiler, nothing in it will execute until a value from it is used by a function currently executing. In this case, no value from ModuleOne is ever used, so it never executes. Taking this into account, you can fix your program so it behaves more as you expect.

Here is ModuleOne.fs:

module ModuleOne

print_endline "This is the third and final"

let n = 1

Here is ModuleTwo.fs:

module ModuleTwo

print_endline "This is the first line"

print_endline "This is the second"

let funct() =
    Printf.printf "%i" ModuleOne.n

funct()

If this is compiled with the following:

fsc ModuleOne.fs ModuleTwo.fs -o ModuleExecution.exe

it will give the following result:


This is the first line
This is the second
This is the third and final
1

However, using this sort of trick to get the results you want is not recommended. It is generally best to use top-level statements only in the last module passed to the compiler. In fact, the typical form of an F# program is to have a single top-level statement at the bottom of the last module that is passed to the compiler.
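
The typical program shape just described might look like the following sketch. The module name Program and the functions greeting and main are hypothetical names chosen for illustration; the file is assumed to be the last one passed to the compiler.

```fsharp
module Program

// Definitions come first; nothing here has a side effect when evaluated.
let greeting name = "Hello, " + name

let main () =
    printfn "%s" (greeting "world")

// The single top-level statement, at the bottom of the last file.
main ()
```

Keeping the only side-effecting statement in one place makes the order of execution easy to follow, whatever order the other modules are touched in.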

Optional Compilation

Optional compilation is a technique where the compiler can ignore various bits of text from a source file. Most programming languages support some kind of optional compilation. It can be handy, for example, if you want to build a library that supports both .NET 1.1 and 2.0 and want to include extra values and types that take advantage of the new features of version 2.0. However, you should use the technique sparingly and with great caution, because it can quickly make code difficult to understand and maintain.

In F#, optional compilation is supported by the compiler switch --define FLAG and the directive #if FLAG in a source file.

The following example shows how to define two different versions of a function, one for when the code is compiled for .NET 2.0 and the other for all other versions of .NET:

#light
open Microsoft.FSharp.Compatibility

#if FRAMEWORK_AT_LEAST_2_0
let getArray() = [|1; 2; 3|]
#else
let getArray() = CompatArray.of_array [|1; 2; 3|]
#endif

This example assumes that the compiler switch --define FRAMEWORK_AT_LEAST_2_0 is defined when the code is being compiled for .NET 2.0. Chapter 13 gives more on the differences between .NET 1.0, 1.1, and 2.0 and covers compiler options.
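
The same mechanism works for any flag name. The following sketch uses a hypothetical flag, VERBOSE, and a hypothetical function, describeBuild; neither comes from the F# libraries. Compiling with --define VERBOSE selects the first definition, and compiling without it selects the second.

```fsharp
// VERBOSE is a hypothetical flag name chosen for this sketch;
// pass --define VERBOSE to the compiler to select the first branch.
#if VERBOSE
let describeBuild () = "verbose build"
#else
let describeBuild () = "quiet build"
#endif

printfn "%s" (describeBuild ())
```

Only one of the two definitions ever reaches the compiler, so both can use the same name without clashing.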

Comments

F# provides two kinds of comments. Multiline comments start with a left parenthesis and an asterisk and end with an asterisk and a right parenthesis. For example:

(* this is a comment *)

or

(* this
   is a
   comment
*)

Multiline comments can be nested, and you will get a compile error if a comment is left unclosed. Single-line comments start with two slashes and extend to the end of a line. For example:

// this is a single-line comment

Doc Comments

Doc comments allow comments to be extracted from the source file in the form of XML or HTML. This is useful because it allows programmers to browse code comments without having to open the source. It is particularly convenient for the vendors of APIs, because it allows them to provide documentation about the code without having to provide the source itself. In addition, the documentation is stored alongside the source, where it has more chance of being updated when the code changes.

Doc comments start with three slashes instead of two. They can be associated only with top-level values or type definitions and are associated with the value or type they appear immediately before. The following code associates the comment this is an explanation with the value myString:

#light
/// this is an explanation
let myString = "this is a string"

To extract doc comments into an XML file, you use the -doc compiler switch. If this example were saved in a source file, Prog.fs, the following command:

fsc -doc doc.xml Prog.fs

would produce the following XML:

<?xml version="1.0" encoding="utf-8"?>
<doc>
<assembly><name>Prog</name></assembly>
<members>
<member name="F:Prog.myString">
<summary>
 this is an explanation
</summary>

</member>
<member name="T:Prog">

</member>
</members>
</doc>

You can then process the XML using various tools, such as NDoc (ndoc.sourceforge.net), to transform it into a number of more readable formats. The compiler also supports the direct generation of HTML from doc comments. Although this is less flexible than XML, it can produce usable documentation with less effort. It can also produce better results, under some circumstances, because notations such as generics and union types are not always well supported by documentation generation tools. I cover the compiler switches that generate HTML in "Useful Command-Line Switches" in Chapter 12.

In F# there is no need to explicitly add any XML tags; for example, the <summary> and </summary> tags were added automatically. I find this useful because it saves a lot of typing and avoids wasted space in the source file; however, you can take control and write out the XML tags themselves if you want. The following is a doc comment where the tags have been explicitly written out:

#light
/// <summary>
/// divides the given parameter by 10
/// </summary>
/// <param name="x">the thing to be divided by 10</param>
let divTen x = x / 10

This will produce the following XML:

<?xml version="1.0" encoding="utf-8"?>
<doc>
<assembly><name>AnotherProg</name></assembly>
<members>
<member name="M:AnotherProg.divTen(System.Int32)">
<summary>
divides the given parameter by 10
</summary>
<param name="x">the thing to be divided by 10</param>
</member>
<member name="T:AnotherProg">

</member>
</members>
</doc>

If no signature file exists for the module file, then the doc comments are taken directly from the module file itself. However, if a signature file exists, then doc comments come from the signature file. This means that even if doc comments exist in the module file, they will not be included in the resulting XML or HTML if the compiler is given a signature file for it.

Custom Attributes

Custom attributes add information to your code that will be compiled into an assembly and stored alongside your values and types. This information can then be read programmatically via reflection or by the runtime itself.

Attributes can be associated with types, members of types, and top-level values. They can also be associated with do statements. An attribute is specified in square brackets, with the attribute name enclosed in angle brackets inside them. For example:

[<Obsolete>]

Attribute names, by convention, end with the string Attribute, so the actual name of the Obsolete attribute is ObsoleteAttribute.

An attribute must immediately precede what it modifies. The following code marks the function, functionOne, as obsolete:

#light
open System

[<Obsolete>]
let functionOne () = ()

An attribute is essentially just a class, and when you use an attribute, you are really just making a call to its constructor. In the previous example, Obsolete has a parameterless constructor, and it can be called with or without parentheses. In this case, we called it without parentheses. If you want to pass arguments to an attribute's constructor, then you must use parentheses and separate arguments with commas. For example:

#light
open System

[<Obsolete("it is a pointless function anyway!")>]
let functionTwo () = ()

Sometimes an attribute's constructor does not expose all the properties of the attribute. If you want to set them, you need to specify the property and a value for it. You specify the property name, an equals sign, and the value after the other arguments to the constructor. The next example sets the Unrestricted property of the PrintingPermission attribute to true:

#light
open System.Drawing.Printing
open System.Security.Permissions

[<PrintingPermission(SecurityAction.Demand, Unrestricted = true)>]
let functionThree () = ()

You can use two or more attributes by separating the attributes with semicolons:

#light
open System
open System.Drawing.Printing
open System.Security.Permissions

[<Obsolete; PrintingPermission(SecurityAction.Demand)>]
let functionFive () = ()

So far, we've used attributes only with values, but using them with type or type members is just as straightforward. The following example marks a type and all its members as obsolete:

#light
open System

[<Obsolete>]
type OOThing = class
    [<Obsolete>]
    val stringThing : string
    [<Obsolete>]
    new() = {stringThing = ""}
    [<Obsolete>]
    member x.GetTheString () = x.stringThing
end

If you intend to use WinForms or Windows Presentation Foundation (WPF) graphics within your program, you must ensure that the program runs in a single-threaded apartment. This is because the libraries that provide the graphical components use unmanaged code (code not compiled by the CLR) under the covers. The easiest way to do this is by using the STAThread attribute. This must modify the first do statement in the last file passed to the compiler, that is, the first statement that will execute when the program runs. For example:

#light
open System
open System.Windows.Forms

let form = new Form()

[<STAThread>]
do Application.Run(form)



Note The do keyword is usually required only when not using the #light mode; however, it is also required when applying an attribute to a group of statements.


Once attributes have been added to types and values, it's possible to use reflection to find which values and types are marked with which attributes. This is usually done with the IsDefined or GetCustomAttributes methods of the System.Reflection.MemberInfo class, meaning they are available on most objects used for reflection including System.Type. The next example shows how to look for all types that are marked with the Obsolete attribute:

#light
let obsolete = System.AppDomain.CurrentDomain.GetAssemblies()
            |> List.of_array
            |> List.map ( fun assm -> assm.GetTypes() )
            |> Array.concat
            |> List.of_array
            |> List.filter
                ( fun m ->
                    m.IsDefined((type System.ObsoleteAttribute), true))

print_any obsolete

The results are as follows:


[System.ContextMarshalException; System.Collections.IHashCodeProvider;
 System.Collections.CaseInsensitiveHashCodeProvider;
 System.Runtime.InteropServices.IDispatchImplType;
 System.Runtime.InteropServices.IDispatchImplAttribute;
 System.Runtime.InteropServices.SetWin32ContextInIDispatchAttribute;
 System.Runtime.InteropServices.BIND_OPTS;
 System.Runtime.InteropServices.UCOMIBindCtx;
 System.Runtime.InteropServices.UCOMIConnectionPointContainer;
...
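
The other method mentioned above, GetCustomAttributes, returns the attribute instances themselves, so their properties can be read. The following is a minimal sketch written against current F# syntax (typeof<...> rather than the type keyword form used in the previous example); the type OldThing and its message are hypothetical and exist only for this illustration.

```fsharp
open System

// OldThing and its message are hypothetical, defined only for this sketch.
[<Obsolete("use NewThing instead")>]
type OldThing() = class end

// GetCustomAttributes returns the attribute instances as an obj array,
// so each one is cast back to ObsoleteAttribute to read its Message.
// Referring to OldThing here may itself produce a deprecation warning.
let messages =
    typeof<OldThing>.GetCustomAttributes(typeof<ObsoleteAttribute>, false)
    |> Array.map (fun attr -> (attr :?> ObsoleteAttribute).Message)

printfn "%A" messages
```

Where IsDefined answers only whether an attribute is present, this approach also recovers the arguments that were passed to the attribute's constructor.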

Now that you've seen how you can use attributes and reflection to examine code, let's look at a similar but more powerful technique for analyzing compiled code, called quotation.

Quoted Code

Quotations are a way of telling the compiler, "Don't generate code for this section of the source file; turn it into a data structure, an expression tree, instead." This expression tree can then be interpreted in a number of ways, transformed or optimized, compiled into another language, or even just ignored.

Quotations come in two types, raw and typed, the difference being that typed quotations carry static type information whereas raw quotations do not. Both carry runtime type annotations. Typed quotations are designed for use in client programs, so usually you will want to use typed quotations. These are the only quotations covered in this section. Raw quotations are designed for implementing libraries based on quotations; such libraries will generally convert typed quotations to raw ones automatically before consuming them.

To quote an expression, place it between guillemets (also called French quotes), « ». To ensure the compiler recognizes these characters, you must save your file as UTF-8. Visual Studio users can do this with File > Advanced Save Options. If you have some objection to using UTF-8, you can use an ASCII alternative: <@ @>. Both ways of quoting an expression are essentially just an operator defined in a module, so you need to open the module Microsoft.FSharp.Quotations.Typed to be able to use them. The next example uses a quotation and prints it:

#light
open Microsoft.FSharp.Quotations.Typed

let quotedInt = « 1 »

printf "%A " quotedInt

The result is as follows:


<@ Int32 1 @>

If you were to use the ASCII alternative, it would look like this:

#light
open Microsoft.FSharp.Quotations.Typed

let asciiQuotedInt = <@ 1 @>

printf "%A " asciiQuotedInt

The result is as follows:


<@ Int32 1 @>

As you can see, the code doesn't look very different and the results are the same, so from now I'll use guillemets. The following example defines an identifier and uses it in a quotation:

#light
open Microsoft.FSharp.Quotations.Typed

let n = 1
let quotedId = « n »

printf "%A " quotedId

The result is as follows:


<@ Prog.n @>

Next we'll quote a function applied to a value. Notice that since we are quoting two items, the result of this quotation is split into two parts. The first part represents the function, and the second represents the value to which it is applied.

#light
open Microsoft.FSharp.Quotations.Typed

let inc x = x + 1
let quotedFun = « inc 1 »

printf "%A " quotedFun

The result is as follows:


<@ Prog.inc (Int32 1) @>

The next example shows an operator applied to two values. Because the expression has three items, the result is split into three parts, one to represent each part of the expression.

#light
open Microsoft.FSharp.Quotations.Typed

let quotedOp = « 1 + 1 »

printf "%A " quotedOp

The result is as follows:


<@ Microsoft.FSharp.Operators.op_Addition (Int32 1) (Int32 1) @>

The next example quotes an anonymous function:

#light
open Microsoft.FSharp.Quotations.Typed

let quotedAnonFun = « fun x -> x + 1 »

printf "%A " quotedAnonFun

The result is as follows:


<@
fun  x#6142.1 ->
  Microsoft.FSharp.Operators.op_Addition x#6142.1 (Int32 1) @>

To interpret expressions, you must first convert them into raw expressions and then query the expression to see whether it is of a certain type. Querying the type returns an option type that will contain the value Some if it is of that type or None if it isn't. The next example defines a function, interpretInt, that queries the expression passed to it to see whether it is an integer. If it is, it prints the value of that integer; otherwise, it prints the string "not an int".

#light
open Microsoft.FSharp.Quotations
open Microsoft.FSharp.Quotations.Typed

let interpretInt exp =
    let uexp = to_raw exp in
    match uexp with
    | Raw.Int32 x -> printfn "%d" x
    | _ -> printfn "not an int"

interpretInt « 1 »
interpretInt « 1 + 1 »

The results are as follows:


1
not an int

We printed two expressions with interpretInt. The first was an integer value, so it printed out the value of that integer. The second was not an integer, although it contained integers.
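
The same querying approach can be taken one step further so that the second expression is evaluated rather than rejected. The sketch below is written against the quotation API found in current F# releases (the DerivedPatterns module and the Raw property), which is an assumption on my part and differs from the Raw module and to_raw function used above. The function evalInt is a hypothetical helper, not part of any library.

```fsharp
open Microsoft.FSharp.Quotations
open Microsoft.FSharp.Quotations.DerivedPatterns

// evalInt is a hypothetical helper: it handles only integer constants
// and the + operator, and fails on anything else.
let rec evalInt (expr: Expr) : int =
    match expr with
    | Int32 n -> n
    | SpecificCall <@@ (+) @@> (_, _, [lhs; rhs]) -> evalInt lhs + evalInt rhs
    | _ -> failwith "unsupported expression"

let typed = <@ 1 + 1 @>
// Convert the typed quotation to its raw form before interpreting it.
let result = evalInt typed.Raw

printfn "%d" result
```

Each case of the match queries the expression for one shape, and the recursive calls walk the two operands of the addition in the same way.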

Quotations are a very big topic, and we can't cover them completely in this section or even in this book. You will, however, return to them in "Meta-programming with Quotations" in Chapter 11.

Summary

In this chapter, you saw how to organize code in F#. You also saw how to comment, annotate, and quote code, but you just scratched the surface of both annotation and quoting.

This concludes the tour of the F# core language. The rest of the book will focus on how to use F#, from working with relational databases to creating user interfaces, after you look at the F# core libraries in the next chapter.
