Chapter 3. Primitive Types

WHAT'S IN THIS CHAPTER?

  • Understanding primitive types

  • Declaring primitive type instances

  • Applying operators

Like all languages that run on top of the CLR, the F# language provides a core set of primitive types that offer basic integer and floating-point arithmetic capabilities, character string support, Boolean types, and so on. In general, these map to the corresponding CLS types (System.Int16, System.Int32, and so on), as described next, but a few types are new to F# and come from the F# libraries. These types are fully accessible to other languages, such as C# and Visual Basic, but have no native language support there and must be used as any other .NET type is (that is, via fully qualified type names).

BOOLEAN

Probably the simplest primitive type in F# is the bool type, which corresponds to the CLR's underlying System.Boolean type, and has two possible values, true and false.

Booleans support the usual range of logical operations, including && (logical AND) and || (logical OR), and otherwise behave just as Boolean values do in any other .NET language.
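As a minimal sketch of these operators (the value names are invented for illustration, and the not function, which the text above does not mention, is the standard F# logical negation):

```fsharp
// Hypothetical values, chosen only to exercise the operators.
let isWeekend = true
let isRaining = false

let stayIndoors = isWeekend && isRaining      // logical AND: false
let goOutside = isWeekend && not isRaining    // AND combined with negation: true
let eitherOne = isWeekend || isRaining        // logical OR: true

printfn "%b %b %b" stayIndoors goOutside eitherOne
```

Both && and || short-circuit, so the right-hand operand is evaluated only when it can still affect the result.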

NUMERIC TYPES

F# supports a wide range of integer types, from 8 bits to 64 bits in size, in both signed and unsigned versions, along with machine-sized and arbitrarily large integers, as shown here:

TYPE          DESCRIPTION                      .NET NAME                    LITERALS
byte          8-bit unsigned integer           System.Byte                  3uy, 0xFuy
sbyte         8-bit signed integer             System.SByte                 3y, 0xFy
int16         16-bit signed integer            System.Int16                 3s, 0xFs
uint16        16-bit unsigned integer          System.UInt16                3us, 0xFus
int, int32    32-bit signed integer            System.Int32                 3, 0xF
uint32        32-bit unsigned integer          System.UInt32                3u, 0xFu
int64         64-bit signed integer            System.Int64                 3L, 0xFL
uint64        64-bit unsigned integer          System.UInt64                3UL, 0xFUL
nativeint     Machine-sized integer            System.IntPtr                3n, 0xB8000n
unativeint    Machine-sized unsigned integer   System.UIntPtr               3un, 0xB8000un
bigint        Arbitrarily large integer        System.Numerics.BigInteger   3I

Each of these types can be initialized to decimal, hexadecimal, octal, or binary constants. Decimal constants are represented simply with the numeric value itself, whereas the other three must be prefixed with a flag indicating whether the constant is hexadecimal (0x or 0X), octal (0o or 0O), or binary (0b or 0B). This means that the following literals, 0XF, 0o17, 15, and 0b1111, are all the same value (15).
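A short sketch makes the prefixes concrete. The binding names are illustrative only; the octal spelling of 15 is 0o17.

```fsharp
// Four spellings of the same int value.
let fromDecimal = 15
let fromHex = 0xF
let fromOctal = 0o17
let fromBinary = 0b1111

// True when every spelling denotes 15.
let allFifteen =
    [ fromDecimal; fromHex; fromOctal; fromBinary ]
    |> List.forall (fun v -> v = 15)

// Prefixes combine with the type suffixes from the table.
let asByte = 0xFuy       // byte
let asInt64 = 0b1111L    // int64

printfn "%b" allFifteen
```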

For each of these types, the traditional algebraic operators are supported for addition, subtraction, multiplication, division, and modulo. With the exception of the bigint type, which grows as needed, these operations are unchecked (that is, they wrap around when the value exceeds the available representation size). Operations that throw an exception of type System.OverflowException when they overflow are defined in the Microsoft.FSharp.Core.Operators.Checked module; opening modules and using operations defined therein is discussed in more detail in Chapter 11. Any sort of integer division by zero raises a standard System.DivideByZeroException.
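The difference between wraparound and checked arithmetic can be sketched as follows. The helper names (addUnchecked, CheckedMath.add, safeDivide) are hypothetical; the checked operators themselves come from the Microsoft.FSharp.Core.Operators.Checked module in the F# core library.

```fsharp
// Default operators: overflow wraps around silently.
let addUnchecked (a : int) (b : int) = a + b
let wrapped = addUnchecked System.Int32.MaxValue 1   // wraps to Int32.MinValue

// Inside this module, + is the checked version and throws on overflow.
module CheckedMath =
    open Microsoft.FSharp.Core.Operators.Checked
    let add (a : int) (b : int) = a + b

let checkedSum =
    try
        Some (CheckedMath.add System.Int32.MaxValue 1)
    with :? System.OverflowException ->
        None   // the overflow was detected

// Integer division by zero raises DivideByZeroException.
let safeDivide (a : int) (b : int) =
    try
        Some (a / b)
    with :? System.DivideByZeroException ->
        None

printfn "%d %A %A" wrapped checkedSum (safeDivide 10 0)
```

Keeping the open inside a small module limits the checked operators to the code that needs them, which is one possible design and not the only one.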

Because of the dangers of overflow, even with the largest-precision types, the bigint type is the preferred type for handling exceedingly large values, such as the total size of the U.S. budget or the royalty checks for programming language book authors. (Technically, the bigint type isn't a primitive type, according to the language specification, but given its syntax and role, it's helpful to think of it as such for all practical purposes.)
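As an illustration of a value that does not fit in any fixed-size integer type, a factorial can be computed with bigint. The factorial function here is a hypothetical helper written for this sketch.

```fsharp
// 25! exceeds the range of uint64, but bigint grows as needed.
let factorial (n : int) : bigint =
    List.fold (fun acc i -> acc * bigint i) 1I [ 1 .. n ]

let big = factorial 25

printfn "%s" (big.ToString())   // prints 15511210043330985984000000
```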

The nativeint and unativeint types are typically used only for interoperability with native code that receives and produces machine word-sized values, that is, pointers. They are rarely, if ever, used for arithmetic purposes.

The Microsoft.FSharp.Core.Operators module also defines a number of mathematical operations, listed here, which behave as their names imply. (abs works on the signed integer types as well; the rest are intended for the floating-point types, described later.)

  • abs

  • cos

  • sin

  • tan

  • cosh

  • sinh

  • tanh

  • acos

  • asin

  • atan

  • ceil

  • floor

  • truncate

  • exp

  • log

  • log10

  • **

This is not an exhaustive list, but a representative sample of the operators found in that module. Each of these behaves as its name implies: for floating-point values, ceil returns the ceiling (rounded up), floor returns the floor (rounded down), and truncate discards the fractional part (rounding toward zero). Exponentiation (power) is done using the ** operator.

The Operators module is opened automatically in every F# file, so these operators can be used without any extra steps; opening other modules is discussed in Chapter 11.
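A few of these operators in action, as a sketch. The pown function is not in the list above; it is the F# core function for raising a value to an integer power and is shown here because ** applies to floating-point operands.

```fsharp
let up = ceil 2.3               // 3.0
let down = floor (-2.3)         // -3.0
let cut = truncate (-2.7)       // -2.0 (toward zero)
let magnitude = abs (-42)       // 42 (abs also works on signed integers)
let power = 2.0 ** 10.0         // 1024.0
let intPower = pown 2 10        // 1024

printfn "%f %f %f %d %f %d" up down cut magnitude power intPower
```

The negative arguments are parenthesized so that the minus sign cannot be read as a subtraction.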

In addition, the following comparison operations are all defined:

  • <

  • <=

  • >

  • >=

  • =

  • <>

  • min

  • max

And again, each behaves as its name implies. (C# and C++ developers, take special note that equality uses one =, not two, and that not-equals uses <> instead of the C-family !=. Assignment is done differently in F#, as discussed in Chapter 13.)
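For readers coming from C#, a brief sketch of the comparison operators (the names a and b are arbitrary):

```fsharp
let a = 3
let b = 7

let isEqual = (a = b)        // equality uses a single =
let isDifferent = a <> b     // not-equals is <>, not !=
let isSmaller = a < b
let smallest = min a b
let largest = max a b

printfn "%b %b %b %d %d" isEqual isDifferent isSmaller smallest largest
```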

BITWISE OPERATIONS

All the previous integer types support bitwise operations — operations that take into account the underlying bitwise representation — such as AND, OR, eXclusive OR, and so on. The operators to carry out these operations are definitely nontraditional, compared to the C family of languages, but aren't difficult to understand or follow. Consider the following:

OPERATOR   DESCRIPTION            EXAMPLE (WITH RESULTS)
&&&        bitwise AND            0b1111 &&& 0b1100 -> 0b1100
|||        bitwise OR             0b1111 ||| 0b1100 -> 0b1111
^^^        bitwise exclusive OR   0b1111 ^^^ 0b1100 -> 0b0011
~~~        bitwise NOT            ~~~ 0b11110000uy -> 0b00001111uy
<<<        bitwise shift left     0b0110 <<< 1 -> 0b1100
>>>        bitwise shift right    0b0110 >>> 1 -> 0b0011

Generally, bitwise operations are seldom necessary in F#. Their principal use in traditional C/C++ code was to pack a variety of "flag"-style information or concise values into a single variable to save space. "Flags" are typically better represented in other ways in F# (see Chapter 5 for details), and the CLR does its own packing of values to save space, so such measures are often counterproductive.
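For the cases where they are needed, the bitwise operators can be tried directly. This sketch uses the same operands as the examples in the table; the binding names are illustrative.

```fsharp
let andResult = 0b1111 &&& 0b1100     // 0b1100 (12)
let orResult = 0b1111 ||| 0b1100      // 0b1111 (15)
let xorResult = 0b1111 ^^^ 0b1100     // 0b0011 (3)
let notResult = ~~~ 0b11110000uy      // 0b00001111uy (15uy); the result stays a byte
let shiftedLeft = 0b0110 <<< 1        // 0b1100 (12)
let shiftedRight = 0b0110 >>> 1       // 0b0011 (3)

printfn "%d %d %d %d %d %d" andResult orResult xorResult notResult shiftedLeft shiftedRight
```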

FLOATING-POINT TYPES

Floating-point types in the F# language hold values that have a fractional part, that is, values that should not or cannot be rounded to whole numbers.

TYPE              DESCRIPTION                     .NET NAME                      LITERALS
single, float32   32-bit IEEE floating-point      System.Single                  3.2f, 1.3e4f
double, float     64-bit IEEE floating-point      System.Double                  3.2, 1.3e4
decimal           High-precision decimal          System.Decimal                 19M, 3.2M
bignum            Arbitrary-precision rationals   Microsoft.FSharp.Math.bignum   19N

As in all CLR-based languages that use the System.Single and System.Double IEEE-based types, floating-point arithmetic is inherently inexact: many decimal fractions have no exact binary representation, so adding the values 0.1 and 0.2 does not produce exactly 0.3, but 0.30000000000000004. For this reason, any operations that require exact decimal results should use the decimal or bignum types instead of single/float32 or double/float.

The bignum type, unlike decimal or single or double, does not store its representation in a decimal format, instead preferring to store it as an actual fraction, tracking both numerator and denominator. This guarantees the highest degree of precision, but at the cost of having to do the fractional mathematics directly when looking to convert it to a floating-point representation. Fortunately, F# supports all the major mathematical operations on bignum types, performing the appropriate fractional math.

Note

With the release of Visual Studio 2010, Microsoft moved the bignum definition to the F# PowerPack, which is a useful and near-mandatory set of supplemental material available for free download at http://fsharppowerpack.codeplex.com.

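Note also that F# understands special floating-point values. Dividing a nonzero floating-point value by zero does not raise an exception; it yields positive or negative infinity (available as the constants infinity and -infinity), and dividing zero by zero yields "not a number" (nan).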
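The following sketch contrasts binary floating-point with decimal, and shows the special values produced by division by zero. The binding names are illustrative only.

```fsharp
// Binary floating-point cannot represent 0.1 or 0.2 exactly.
let floatSum = 0.1 + 0.2
let floatMatches = (floatSum = 0.3)          // false

// decimal stores base-10 digits, so the same sum is exact.
let decimalSum = 0.1M + 0.2M
let decimalMatches = (decimalSum = 0.3M)     // true

// Floating-point division by zero yields special values, not exceptions.
let positive = 1.0 / 0.0      // infinity
let negative = -1.0 / 0.0     // -infinity
let undefined = 0.0 / 0.0     // nan

printfn "%b %b" floatMatches decimalMatches
printfn "%b %b %b"
    (System.Double.IsPositiveInfinity positive)
    (System.Double.IsNegativeInfinity negative)
    (System.Double.IsNaN undefined)
```

Because nan is not equal to anything, including itself, System.Double.IsNaN is the reliable way to test for it.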

ARITHMETIC CONVERSIONS

Unlike the C-family of languages, F# will not do implicit type conversion among numeric types, instead requiring manual conversion. This is different from C# or C++, where conversion from integer to floating-point values is commonly expected; this means that the following C# code:

int x = 12;
float y = 2.0f;     // the f suffix is required for a float literal in C#
var result = x / y; // x is implicitly converted; result is 6.0f

when translated to F#, has to be explicitly converted, like this:

let i = 12    // int constant
let f = 2.0f  // float32 constant (a plain 2.0 would be a 64-bit float)
// let result = i / f will fail to compile; i and f are not the same type
let result = (float32 i) / f  // 6.0f

The conversion is done by applying a function named after the target type to the value, which reads much like a C# cast written without the parentheses around the type name. Because function application binds more tightly than the arithmetic operators, the parentheses around the conversion are there for readability; the compiler applies float32 to i before it sees the arithmetic operator. Parentheses become necessary when the value to convert is itself an expression: float32 (i + 1) converts the sum, whereas float32 i + 1 fails to compile, because the float32 result and the int constant 1 are not the same type.
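A few more conversions, sketched with hypothetical names, show the same pattern applied to other target types:

```fsharp
let count = 12
let divisor = 2.0f

let quotient = float32 count / divisor    // 6.0f; float32 applies to count first
let widened = float count                 // 12.0 as a 64-bit float
let truncated = int 3.99                  // 3; the fractional part is dropped
let next = float32 (count + 1)            // 13.0f; parentheses needed around the sum

printfn "%f %f %d %f" quotient widened truncated next
```

Every conversion is visible in the source, so lossy steps such as float to int cannot happen by accident.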

STRING AND CHARACTER TYPES

F#, like all CLR-based languages, has an intrinsic notion of a string type, a sequence of characters manipulated as a single entity. The string type is a synonym for the System.String type from the Base Class Library and supports all the methods defined there; therefore, to obtain the length of a string, simply use the Length property, just as C# or Visual Basic would do.

TYPE     DESCRIPTION          LITERALS   .NET NAME
string   String               "Katie"    System.String
byte[]   Literal byte array   "ABCD"B    System.Byte[]
char     Character            'c'        System.Char

Note that strings can be either "escaped," meaning that the backslash character (\) is recognized as an "escape" that introduces nonprintable character sequences (such as the linefeed or newline characters), or "verbatim," meaning that the string contents are never escaped. Verbatim strings must be prefixed with the @ character (just as in C#), as in:

// This is an escaped string--double backslashes are
// necessary to represent a single backslash character
"C:\\Prg\\FSharp\\Examples" // escaped string

// Verbatim string--no escaping takes place
@"C:\Prg\FSharp\Examples"

Note that in F#, strings can also span lines without having to close off the string, re-open a new string on the next line, and concatenate the two (as is necessary in C#). This is known as a multi-line string literal.

The literal byte array type is useful when working with binary protocols and file formats, particularly for magic numbers and begin/end sequences that appear in the content stream.

Strings can also be concatenated using the + operator or the .NET Framework Class Library class System.Text.StringBuilder, just as other CLR languages can.

Because the F# string is a System.String at the CLR level,[2] all the members of the System.String class are also accessible to F# code, so the usual litany of operations familiar to C# and Visual Basic programmers, such as the Length property to return the length of the string, are all accessible. More on F# compatibility and interoperability with other CLR languages is given in Chapter 18.
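The string features described in this section can be sketched together. The names are illustrative; the members used (Length, indexing, StringBuilder.Append) are the standard System.String and System.Text.StringBuilder members.

```fsharp
let name = "Katie"
let length = name.Length                  // 5, via the System.String property
let initial = name.[0]                    // 'K', a char
let greeting = "Hello, " + name           // concatenation with +

// Escaped and verbatim spellings of the same path.
let escaped = "C:\\Prg\\FSharp\\Examples"
let verbatim = @"C:\Prg\FSharp\Examples"
let samePath = (escaped = verbatim)       // true

// The B suffix yields a byte array instead of a string.
let magic = "ABCD"B                       // [|65uy; 66uy; 67uy; 68uy|]

// StringBuilder avoids creating intermediate strings.
let builder = System.Text.StringBuilder()
builder.Append("Hello").Append(", ").Append(name) |> ignore
let built = builder.ToString()

printfn "%d %c %s %b %s" length initial greeting samePath built
```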

UNIT

The Unit type in F# is a special type, one that has no direct equivalent in traditional object-oriented or imperative programming languages. In practice, to the C# and Visual Basic developer, Unit is a combination of both null and System.Void. In essence, unit is the type that represents no type (similar to System.Void) and has one value only (given by the literal (), similar to null). It is used in those situations in which the value returned from an expression needs to represent the case where there is no value to be returned.[3]

Developers familiar with C# and Visual Basic may find the general disdain for null and void to be confusing at first; null, in particular, is a staple resource for those languages to indicate a lack of response in a return value. F# provides an alternative way to represent the lack of a response, the option type, which is described in more detail in Chapter 5.
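A small sketch shows unit in practice. The announce function is hypothetical; it plays the role of a method that a C# developer would declare as void.

```fsharp
// The return type is unit: the function is called for its side effect.
let announce (message : string) : unit =
    printfn "%s" message

// The result can still be bound to a name; its only possible value is ().
let result : unit = announce "Nothing to return here"
let isTheUnitValue = (result = ())    // true

printfn "%b" isTheUnitValue
```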

Note

Note that F# does support the keywords null and void, and they are used as one might expect (the first as a value and the second as a type), but the principal use for these two centers around the area of .NET interoperability. More on null and void can be found in Chapter 18.

UNITS OF MEASURE TYPES

In addition to the primitive types provided here, F# provides a feature, colloquially known as units-of-measure, that allows an F# programmer to annotate an instance of a primitive type with some additional information intended to describe the "units" for this value. This is intended to better support the real-world, in which calculations frequently are done with a unit system either explicitly or implicitly applied to the calculation.

For example, consider a function in a physics simulation program that needs to calculate the trajectory of an artillery shell fired from a gun.[4] The shell will have an initial velocity, but this velocity will decrease over time, based on its angle and the pull of gravity.

Without getting into the mathematics too deeply, several different "units" are being expressed here, and if the programmer is not careful, mistakes in the code can appear if the right values are not converted to the appropriate "units" type during the calculations. This is less trivial than it might seem at first: both space programs (NASA and the European Space Agency, to name two) and financial institutions have suffered losses measured in millions of U.S. dollars because of flawed unit-based calculations.

Defining a new unit of measure requires a simple declaration of what name the unit-of-measure will use, annotated by the Measure attribute recognized by the compiler. (Attributes are described in more detail in Chapter 12.) When declared, this unit-of-measure can be used to "annotate" a primitive type value or variable (typically of float, float32, or decimal type, though signed integer types are also acceptable) and provide the additional type checking to ensure that units-of-measure are not combined in illegal ways. So, given the following declaration:

[<Measure>] type usd
[<Measure>] type euro

the compiler recognizes two new unit-of-measure types, one representing (presumably) U.S. dollars, the other, euros.

Thus, the following function defines a usd-to-euro conversion, and the compiler understands the unit-of-measure conversions as part of the function's signature:

let usdRoyaltyCheck = 1500000.00<usd>
let usdToEuro (dollars : float<usd>) =
    dollars * 1.5<euro/usd>

When the function is displayed in the IntelliSense window, it is clearly shown that usdToEuro takes a single parameter of type float<usd> as input and returns a value of type float<euro> as the result. The compiler knows this by virtue of the conversion constant being defined with a unit-of-measure that, as with all conversion constants, is expressed as a ratio of <euro> to <usd>, in this case, 1.5 <euro> to the <usd>.
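Calling the function shows the unit checking at work. This sketch repeats the declarations so that it stands alone; the explicit return annotation, euroRoyaltyCheck, and plainAmount are additions made for illustration.

```fsharp
[<Measure>] type usd
[<Measure>] type euro

let usdToEuro (dollars : float<usd>) : float<euro> =
    dollars * 1.5<euro/usd>

let usdRoyaltyCheck = 1500000.00<usd>
let euroRoyaltyCheck = usdToEuro usdRoyaltyCheck     // 2250000.0<euro>

// Dividing by one unit of the measure yields a plain float.
let plainAmount = euroRoyaltyCheck / 1.0<euro>

// The next line would not compile, because the units differ:
// let mixed = usdRoyaltyCheck + euroRoyaltyCheck

printfn "%.2f" plainAmount
```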

Note that this doesn't mean that the F# compiler has built-in knowledge of physics or accounting or mathematics or any other domain — the units are simply parsed and compared as-is, leaving F# developers free to create their own units and unit systems as necessary or desirable. The units can be called by any legitimate identifier, and no particular relationship is assumed by their names, so that <m> and <km> aren't intrinsically understood — the programmer seeking to convert <m> to <km> must write that function explicitly.
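Such a conversion might look like the following sketch. The names metersPerKilometer, toKilometers, and marathon are hypothetical, and the compiler learns the relationship between the two units only from the conversion constant.

```fsharp
[<Measure>] type m
[<Measure>] type km

// The only place where the m-to-km relationship is stated.
let metersPerKilometer = 1000.0<m/km>

let toKilometers (distance : float<m>) : float<km> =
    distance / metersPerKilometer

let marathon = 42195.0<m>
let marathonInKm = toKilometers marathon    // 42.195<km>

// Adding the two directly would be rejected at compile time:
// let nonsense = marathon + marathonInKm

printfn "%.3f" (marathonInKm / 1.0<km>)
```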

LITERAL VALUES

It's important to note that the F# compiler takes great pains to hide some of the physical characteristics of the mappings to the underlying CLR from the developer.

For example, consider the following F# code:

let s = "Hello world!"

Contrary to what the C# or Visual Basic developer assumes, this does not create a constant value, but a property whose contents are pre-initialized to the value previously defined.

Normally, this is not a problem; this is arguably a good thing — C# and Visual Basic developers spend far too much time thinking about the physical layout characteristics of their code, explicitly declaring fields and properties as separate entities, when 95% of the time the two will map in a one-to-one manner. Even given the presence of automatically generated properties in C# 3.0, the developer must still think explicitly about physical layout — for example, should a name intended to yield a constant value be a property, a field, or a method? Should it yield a singleton object via a static method? And so on. By removing some of these "low-level" issues from the language syntax, F# manages to avoid much of the unnecessary debate around those decisions.

There are a few cases, most notably in C#/F# interop (see Chapter 18) and pattern matching (see Chapter 6) where ensuring that a name/value binding is defined as a constant field value is necessary; to do this, annotate the name with the Literal attribute:

[<Literal>]
let S2 = "Hello world again!"

This now forces the F# compiler to compile S2 as a constant static field in the class created for the F# file.
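The pattern-matching case can be sketched as follows. The Greeting and classify names are hypothetical; without the Literal attribute, the first pattern would be treated as a new variable that matches any string.

```fsharp
[<Literal>]
let Greeting = "Hello world again!"

let classify (text : string) =
    match text with
    | Greeting -> "matched the literal"    // compares against the constant
    | _ -> "something else"

printfn "%s" (classify "Hello world again!")
printfn "%s" (classify "Goodbye")
```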

SUMMARY

F# has a similar set of basic types to that of other CLR-based languages, with some slight differences in syntax and semantics, and some extended types that the traditional CLR languages (C#, Visual Basic) don't have directly. Many of these additional types were originally created to support F#'s original research role as a "math/science" language but turn out to be useful in the general programming space as well; for example, bigint will be useful in accounting applications, and both decimal and bignum will have particular application for monetary calculations and high-precision mathematics.

F# developers can also find the units-of-measure capabilities within the language to be helpful anywhere real-world calculations are done — obviously mathematical calculations, such as those routinely done in physics (either simulators or guidance-control software) can find units-of-measure useful but so will accounting programs, particularly those that deal with a known set of currencies or calculations dealing with time.



[2] More precisely, F#'s string is a type abbreviation declared in the F# core library rather than a separate type; because it abbreviates System.String, all the System.String members are available on an F# string, so practically the statement holds true.

[3] Obviously, this is a confusing statement and isn't necessary for the practicing programmer to spend a lot of time worrying about — simply know that when void or null might have applied in C#, or Nothing in VB, use unit and () instead.

[4] The full source for this program is found on Chris Smith's blog, under the name "Burning Land."
