Chapter 1. Introduction to typing


This chapter covers

Why type systems exist
Benefits of strongly typed code
Types of type systems
Common features of type systems

The Mars Climate Orbiter disintegrated in the planet’s atmosphere because a component developed by Lockheed produced momentum measurements in pound-force seconds (U.S. units), whereas another component developed by NASA expected momentum to be measured in Newton seconds (metric units). Using different types for the two measures would have prevented the catastrophe.
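To make that last claim concrete, here is a minimal sketch of how distinct types could keep the two units apart. The class names, the kind tag, and the functions are hypothetical and chosen only for illustration; they are not taken from the actual flight software.

```typescript
// Hypothetical unit types. The `kind` tag keeps them distinct; without it,
// TypeScript would treat two classes of the same shape as interchangeable.
class PoundForceSeconds {
    readonly kind = "pound-force seconds";
    readonly value: number;
    constructor(value: number) { this.value = value; }
}

class NewtonSeconds {
    readonly kind = "newton seconds";
    readonly value: number;
    constructor(value: number) { this.value = value; }
}

// 1 pound-force second is approximately 4.448222 newton seconds.
function toNewtonSeconds(m: PoundForceSeconds): NewtonSeconds {
    return new NewtonSeconds(m.value * 4.448222);
}

// Hypothetical consumer that accepts only metric momentum.
function applyMomentum(m: NewtonSeconds): number {
    return m.value;
}

const measured = new PoundForceSeconds(10);

// applyMomentum(measured);   // Would not compile: wrong unit type.
console.log(applyMomentum(toNewtonSeconds(measured)));
```

With types like these, the only way to hand a U.S. measurement to the metric consumer is to go through an explicit conversion.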

As we will see throughout this book, type checkers provide powerful ways to eliminate whole classes of errors, provided they are given enough information. As software complexity increases, so does the need to provide better correctness guarantees. Monitoring and testing can show that the software is behaving according to spec at a given point in time, given specific input. Types give us more general proofs that the code will behave according to spec regardless of input.

Programming language research is coming up with ever-more-powerful type systems. (See, for example, languages like Elm and Idris.) Haskell is gaining in popularity. At the same time, there are ongoing efforts to bring compile-time type checking to dynamically typed languages: Python added support for type hints, and TypeScript is a language created for the sole purpose of providing compile-time type checking to JavaScript.

There clearly is value in typing code, and leveraging the features of the type systems that your programming languages provide will help you write better, safer code.

1.1. Whom this book is for

This is a book for practicing programmers. You should be comfortable writing code in a mainstream programming language like Java, C#, C++, or JavaScript/TypeScript. The code examples in this book are in TypeScript, but most of the content is language-agnostic. In fact, the examples don’t always use idiomatic TypeScript. Where possible, code examples are written to be accessible to programmers coming from other languages. See appendix A for how to build the code samples in this book and appendix B for a short TypeScript cheat sheet.

If you are developing object-oriented code at your day job, you might have heard of algebraic data types (ADTs), lambdas, generics, functors, or monads, and would like to better understand what these are and how they are relevant to your work.

This book will teach you how to rely on the type system of your programming language to design code that is less error-prone, better componentized, and easier to understand. We’ll see how errors that could happen at run time and cause an entire system to malfunction can be transformed into compilation errors and caught before they can cause any damage.

A lot of the literature on type systems is formal. This book focuses on practical applications of type systems; thus, math is kept to a minimum. That being said, you should be familiar with basic algebra concepts like functions and sets. We will rely on these to explain some of the relevant concepts.

1.2. Why types exist

At the low level of hardware and machine code, the program logic (the code) and the data it operates on are both represented as bits. At this level, there is no difference between the code and the data, so errors can easily happen when the system mistakes one for the other. These errors range from program crashes to severe security vulnerabilities in which an attacker “tricks” the system into executing their input data as code.

An example of this kind of loose interpretation is the JavaScript eval() function, which evaluates a string as code. It works well when the string provided is valid JavaScript code but causes a run-time error when it isn’t, as shown in the next listing.

Listing 1.1. Trying to interpret data as code

console.log(eval("40+2"));           1

console.log(eval("Hello world!"));   2

1 Prints “42” to the console
2 Raises “SyntaxError: unexpected token: identifier”

1.2.1. 0s and 1s

Beyond distinguishing between code and data, we need to know how to interpret a piece of data. The 16-bit sequence 1100001010100011 can represent the unsigned 16-bit integer 49827, the signed 16-bit integer -15709, the UTF-8 encoded character '£', or something completely different, as we can see in figure 1.1. The hardware our programs run on stores everything as sequences of bits, so we need an extra layer to give meaning to this data.

Figure 1.1. A sequence of bits can be interpreted in multiple ways.

Types give meaning to this data and tell our software how to interpret a given sequence of bits in a given context so that it preserves the intended meaning.
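The three interpretations mentioned above can be checked with a short sketch. It assumes the 16 bits are stored as two bytes in big-endian order, and the variable names are ours.

```typescript
// The two bytes 0xC2 0xA3 form the bit sequence 1100001010100011.
const bytes = new Uint8Array([0xC2, 0xA3]);
const view = new DataView(bytes.buffer);

// Read the same 16 bits as an unsigned and as a signed big-endian integer.
const asUnsigned = view.getUint16(0);
const asSigned = view.getInt16(0);

// Decode the same bytes as a two-byte UTF-8 sequence (110xxxxx 10xxxxxx):
// keep the low 5 bits of the first byte and the low 6 bits of the second.
const codePoint = ((view.getUint8(0) & 0x1F) << 6) | (view.getUint8(1) & 0x3F);
const asCharacter = String.fromCharCode(codePoint);

console.log(asUnsigned, asSigned, asCharacter);   // 49827 -15709 £
```

The bits never change; only the type we read them as does.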

Types also constrain the set of valid values a variable can take. A signed 16-bit integer can represent any integer value from -32768 to 32767 but nothing else. The ability to restrict the range of allowed values helps eliminate whole classes of errors by not allowing invalid values to appear at run time, as shown in figure 1.2. Viewing types as sets of possible values is important to understanding many of the concepts covered in this book.

Figure 1.2. The sequence of bits typed as a signed 16-bit integer. The type information (16-bit signed integer) tells the compiler and/or run time that the sequence of bits represents an integer value between -32768 and 32767, ensuring the correct interpretation as -15709.

As we will see in section 1.3, many other safety properties are enforced by the system when we add properties to our code, such as marking a value as const or a member as private.

1.2.2. What are types and type systems?

Because this book talks about types and type systems, let’s define these terms before moving forward.

Type

A type is a classification of data that defines the operations that can be done on that data, the meaning of the data, and the set of allowed values. Typing is checked by the compiler and/or run time to ensure the integrity of the data, enforce access restrictions, and interpret the data as meant by the developer.

In some cases, we will simplify our discussion and ignore the operations part, so we’ll look at types simply as sets, which represent all the possible values an instance of that type can take.

Type System

A type system is a set of rules that assigns types to the elements of a programming language and enforces them. These elements can be variables, functions, and other higher-level constructs. Type systems assign types through notation you provide in the code or implicitly by deducing the type of a certain element based on context. They allow various conversions between types and disallow others.
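As a small illustration of this definition, the following sketch shows a type given through notation, a type deduced from context, one conversion the type system allows, and one it rejects. The variable names are made up for the example.

```typescript
// Explicit notation: we state the type ourselves.
const explicitCount: number = 42;

// Implicit: the compiler deduces `string` from the initializer.
let inferredName = "TypeScript";

// An allowed conversion: a string is acceptable wherever
// `string | undefined` is expected.
const maybeName: string | undefined = inferredName;

// A disallowed conversion: uncommenting the next line fails to compile,
// because a number is not assignable to a string.
// inferredName = explicitCount;

// An explicit conversion states our intent and is accepted.
inferredName = String(explicitCount);

console.log(inferredName, maybeName);   // 42 TypeScript
```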

Now that we’ve defined types and type systems, let’s see how the rules of a type system are enforced. Figure 1.3 shows, at a high level, how source code gets executed.

Figure 1.3. Source code is transformed by a compiler or interpreter into code that can be executed by a run time. The run time is a physical computer or a virtual machine, such as Java’s JVM, or a browser’s JavaScript engine.

At a very high level, the source code we write gets transformed by a compiler or interpreter into instructions for a machine, or run time. This run time can be a physical computer, in which case the instructions are CPU instructions, or it can be a virtual machine, with its own instruction set and facilities.

Type Checking

The process of type checking ensures that the rules of the type system are respected by the program. This type checking is done by the compiler when converting the code or by the run time while executing the code. The component of the compiler that handles enforcement of the typing rules is called a type checker.

If type checking fails, meaning that the rules of the type system are not respected by the program, we end up with a failure to compile or with a run-time error. We will go over the difference between compile-time type checking and execution-time (or run-time) type checking in more detail in section 1.4.

Type checking and proofs

There is a lot of formal theory behind type systems. The remarkable Curry-Howard correspondence, also known as proofs-as-programs, shows the close connection between logic and type theory. It shows that we can view a type as a logic proposition, and a function from one type to another as a logic implication. A value of a type is equivalent to evidence that the proposition is true.

Take a function that receives as argument a boolean and returns a string.

Boolean to string

function booleanToString(b: boolean): string {
    if (b) {
        return "true";
    } else {
        return "false";
    }
}

This function can also be interpreted as “boolean implies string.” Given evidence of the proposition boolean, this function (implication) can produce evidence of the proposition string. Evidence of boolean is a value of that type, true or false. When we have that, this function (implication) will give us evidence of string as either the string "true" or the string "false".

The close relationship between logic and type theory shows that a program that respects the type system rules is equivalent to a logic proof. In other words, the type system is the language in which we write these proofs. The Curry-Howard correspondence is important because it brings logic rigor to the guarantees that a program will behave correctly.
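One way to get a feel for this reading of functions as implications is function composition: if A implies B and B implies C, then A implies C. The sketch below is only an illustration of that idea; compose and stringLength are hypothetical helpers, not something the correspondence itself prescribes.

```typescript
// "If A implies B and B implies C, then A implies C."
function compose<A, B, C>(f: (a: A) => B, g: (b: B) => C): (a: A) => C {
    return (a: A) => g(f(a));
}

function booleanToString(b: boolean): string {
    return b ? "true" : "false";
}

function stringLength(s: string): number {
    return s.length;
}

// boolean implies string, and string implies number,
// so we can build evidence that boolean implies number.
const booleanToLength = compose(booleanToString, stringLength);

console.log(booleanToLength(false));   // 5
```

The fact that compose type checks for any A, B, and C corresponds to the fact that the logical rule holds for any three propositions.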

1.3. Benefits of type systems

Because ultimately data is all 0s and 1s, properties of the data, such as how to interpret it, whether it is immutable, and its visibility, are type-level properties. We declare a variable as a number, and the type checker ensures that we don’t interpret its data as a string. We declare a variable as private or read-only, and although the data itself in memory is no different from public mutable data, the type checker can make sure we do not refer to a private variable outside its scope or try to change read-only data.

The main benefits of typing are correctness, immutability, encapsulation, composability, and readability. All five are fundamental features of good software design and behavior. Systems evolve over time. These features counterbalance the entropy that inevitably tries to creep into the system.

1.3.1. Correctness

Correct code means code that behaves according to its specification, producing expected results without creating run-time errors or crashes. Types help us add more strictness to the code to ensure that it behaves correctly.

As an example, let’s say we want to find the index of the string "Script" within another string. Without providing enough type information, we can allow a value of any type to be passed as an argument to our function. We are going to hit run-time errors if the argument is not a string, as the next listing shows.

Listing 1.2. Insufficient type information

function scriptAt(s: any): number {       1
    return s.indexOf("Script");
}

console.log(scriptAt("TypeScript"));      2
console.log(scriptAt(42));                3

1 Argument s has type any, which allows a value of any type.
2 This line correctly prints “4” to the console.
3 Passing a number as an argument causes a run-time TypeError.

The program is incorrect, as 42 is not a valid argument to the scriptAt function, but the compiler did not reject it because we hadn’t provided enough type information. Let’s refine the code by constraining the argument to a value of type string in the next listing.

Listing 1.3. Refined type information

function scriptAt(s: string): number {    1
    return s.indexOf("Script");
}

console.log(scriptAt("TypeScript"));
console.log(scriptAt(42));                2

1 Argument s now has type string.
2 Code fails to compile at this line due to type mismatch.

Now the incorrect program is rejected by the compiler with this error message:

Argument of type '42' is not assignable to parameter of type 'string'

Leveraging the type system, we transformed what used to be a run-time issue that could have been hit in production, affecting our customers, into a harmless compile-time issue that we must fix before deploying our code. The type checker makes sure we never try to pass apples as oranges; thus, our code becomes more robust.

Errors occur when a program gets into a bad state, which means that the current combination of all its live variables is invalid for whatever reason. One technique for eliminating some of these bad states is reducing the state space by constraining the number of possible values that variables can take, like in figure 1.4.

Figure 1.4. Declaring a type correctly, we can disallow invalid values. The first type is too loose and allows for values we don’t want. The second, more restrictive type won’t compile if the code tries to assign an unwanted value to a variable.

We can define the state space of a running program as the combination of all possible values of all its live variables, that is, the Cartesian product of the types of its variables. Remember, a type can be viewed as a set of possible values for a variable. The Cartesian product of two sets is the set composed of all ordered pairs whose first element comes from the first set and whose second element comes from the second.
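For two small types, we can enumerate the state space directly. The following sketch assumes a hypothetical program with just two live variables, a traffic light and a flag.

```typescript
// A type with exactly three possible values.
type Light = "red" | "yellow" | "green";

const lightValues: Light[] = ["red", "yellow", "green"];
const flagValues: boolean[] = [false, true];

// Cartesian product: every ordered pair (light, flag).
const stateSpace: [Light, boolean][] = [];
for (const light of lightValues) {
    for (const flag of flagValues) {
        stateSpace.push([light, flag]);
    }
}

console.log(stateSpace.length);   // 6
```

Had we typed the light as string instead of Light, the state space would no longer be six states but every possible string paired with every boolean, most of them invalid.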

Security

An important byproduct of disallowing potential bad states is more secure code. Many attacks rely on executing user-provided data, buffer overruns, and other such techniques, which can often be mitigated with a strong-enough type system and good type definitions.

Code correctness goes beyond eliminating innocent bugs in the code to preventing malicious attacks.

1.3.2. Immutability

Immutability is another property closely related to viewing our running system as moving through its state space. When we are in a known-good state, if we can keep parts of that state from changing, we reduce the possibility of errors.

Let’s take a simple example in which we attempt to prevent division by 0 by checking the value of our divisor and throwing an error if the divisor is 0, as shown in the following listing. If the value can change after we inspect it, the check is not very valuable.

Listing 1.4. Bad mutation

function safeDivide(): number {
    let x: number = 42;

    if (x == 0) throw new Error("x should not be 0");     1

    x = x - 42;                                           2

    return 42 / x;                                        3
}

1 Check if x is valid.
2 Bug: x becomes 0 after the check.
3 Division by 0 results in Infinity.

This happens all the time in real programs, in subtle ways: a variable gets changed concurrently by a different thread or obscurely by another called function. Just as in this example, as soon as a value changes, we lose any guarantees we were hoping to get from the checks we performed. Making x a constant, we get a compilation error when we try to mutate it in the next listing.

Listing 1.5. Immutability

function safeDivide(): number {
    const x: number = 42;                1

    if (x == 0) throw new Error("x should not be 0");

    x = x - 42;                          2

    return 42 / x;
}

1 x is declared using the keyword const instead of the keyword let.
2 This line no longer compiles as x is immutable and cannot be reassigned.

The bug is rejected by the compiler with the following error message:

Cannot assign to 'x' because it is a constant.

In terms of in-memory representation, there is no difference between a mutable and an immutable x. The constness property is meaningful only for the compiler. It is a property enabled by the type system.

Adding the const notation to state that shouldn’t change prevents the kind of mutation that would invalidate guarantees we have already checked for. Immutability is especially useful when concurrency is involved, as data races become impossible if data is immutable.

Optimizing compilers can emit more-efficient code when dealing with immutable variables, as their values can be inlined. Some functional programming languages make all data immutable: a function takes some data as input and returns other data without ever changing its input. In such cases, when we validate a variable and confirm that it is in a good state, we are guaranteed it will be in a good state for its whole lifetime. The trade-off, of course, is that we end up copying data when we could have operated on it in-place, which is not always desirable.

Making everything immutable might not always be feasible. That being said, making as much of the data immutable as you reasonably can will tremendously reduce the opportunity for issues such as preconditions not being met and data races.
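Beyond const variables, TypeScript lets us mark individual members and arrays as read-only. The following is a minimal sketch of the functional style described above, in which a function returns new data instead of changing its input; the Settings interface and the withDivisor function are hypothetical.

```typescript
interface Settings {
    readonly divisor: number;
    readonly samples: ReadonlyArray<number>;
}

// Returns a new object instead of changing its input.
function withDivisor(settings: Settings, divisor: number): Settings {
    if (divisor == 0) throw new Error("divisor should not be 0");
    return { divisor: divisor, samples: settings.samples };
}

const original: Settings = { divisor: 2, samples: [10, 20] };
const updated = withDivisor(original, 5);

// original.divisor = 0;        // Would not compile: divisor is read-only.
// original.samples.push(30);   // Would not compile: ReadonlyArray has no push().

console.log(original.divisor, updated.divisor);   // 2 5
```

Because nothing can modify samples, the two objects can safely share the same array rather than copying it.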

1.3.3. Encapsulation

Encapsulation is the ability to hide some of the internals of our code, be it a function, a class, or a module. As you probably know, encapsulation is desirable, as it helps us deal with complexity: we split the code into smaller components, and each component exposes only what is strictly needed to the outside world, while its implementation details are kept hidden and isolated.

In the next listing, let’s extend our safe division example to a class that tries to ensure that division by 0 never happens.

Listing 1.6. Not enough encapsulation

class SafeDivisor {
    divisor: number = 1;

    setDivisor(value: number) {
        if (value == 0) throw new Error("Value should not be 0");    1

        this.divisor = value;
    }

    divide(x: number): number {
        return x / this.divisor;                                     2
    }
}

function exploit(): number {
    let sd = new SafeDivisor();

    sd.divisor = 0;                                                  3
    return sd.divide(42);                                            4
}

1 Ensure that divisor does not become 0 by checking value before assigning
2 Division by 0 should never happen.
3 Because the divisor member is public, the check can be bypassed.
4 Division by 0 returns Infinity.

In this case we can no longer make the divisor immutable, as we do want to give callers of our API the ability to update it. The problem is that callers can bypass the 0 check and directly set divisor to any value because it is visible to them. The fix in this case is to mark it as private and scope it to the class, as the following listing shows.

Listing 1.7. Encapsulation

class SafeDivisor {
    private divisor: number = 1;             1

    setDivisor(value: number) {
        if (value == 0) throw new Error("Value should not be 0");

        this.divisor = value;
    }

    divide(x: number): number {
        return x / this.divisor;
    }
}

function exploit() {
    let sd = new SafeDivisor();

    sd.divisor = 0;                          2
    sd.divide(42);
}

1 Member is now marked as private.
2 This line fails to compile as divisor can no longer be referenced outside the class.

A public and a private member have the same in-memory representation; the fact that the problematic code no longer compiles in the second example is simply due to the type notations we provided. In fact, public, private, and other visibility kinds are properties of the type in which they appear.
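We can check the first claim with a short sketch that reuses the SafeDivisor class from listing 1.7. The cast to any is there only to switch off type checking for one expression; it is not something we would want in real code.

```typescript
class SafeDivisor {
    private divisor: number = 1;

    setDivisor(value: number) {
        if (value == 0) throw new Error("Value should not be 0");

        this.divisor = value;
    }

    divide(x: number): number {
        return x / this.divisor;
    }
}

const sd = new SafeDivisor();

// With type checking switched off for this expression, the private member
// is reachable again: nothing protects it at run time.
(sd as any).divisor = 0;

console.log(sd.divide(42));   // Infinity
```

The protection we get from private comes entirely from the type checker, which is why bypassing the checker also bypasses the protection.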

Encapsulation, or information hiding, enables us to split logic and data across a public interface and a nonpublic implementation. This is extremely helpful in large systems, as working against interfaces (or abstractions) reduces the mental effort it takes to understand what a particular piece of code does. We need to understand and reason about only the interfaces of components, not all their implementation details. It also helps by scoping nonpublic information within a boundary and guarantees that external code cannot modify it, as it simply does not have access to it.

Encapsulation appears at multiple layers: a service exposes its API as an interface, a module exports its interface and hides implementation details, a class exposes only its public members, and so on. Like nesting dolls, the weaker the relationship between two parts of the code, the less information they share. This strengthens the guarantees a component can make about the data it manages internally, as no outside code can be allowed to modify it without going through the component’s interface.

1.3.4. Composability

Let’s say we want to find the first negative number in an array of numbers and the first one-character string in an array of strings. Without thinking about how we can break down this problem into composable pieces and put them back together into a composable system, we could end up with two functions: findFirstNegativeNumber() and findFirstOneCharacterString(), as shown in the following listing.

Listing 1.8. Noncomposable system

function findFirstNegativeNumber(numbers: number[])
    : number | undefined {
    for (let i of numbers) {
        if (i < 0) return i;
    }
}

function findFirstOneCharacterString(strings: string[])
    : string | undefined {
    for (let str of strings) {
        if (str.length == 1) return str;
    }
}

The two functions search for the first negative number and for the first one-character string, respectively. If no such element is found, the functions return undefined (implicitly, by exiting the function without a return statement).

If a new requirement comes in that we should also log an error whenever we fail to find an element, we need to update both functions, as shown in the next listing.

Listing 1.9. Noncomposable system update

function findFirstNegativeNumber(numbers: number[])
    : number | undefined {
    for (let i of numbers) {
        if (i < 0) return i;
    }
    console.error("No matching value found");
}

function findFirstOneCharacterString(strings: string[])
    : string | undefined {
    for (let str of strings) {
        if (str.length == 1) return str;
    }
    console.error("No matching value found");
}

This is already less than ideal. What if we forget to apply the update everywhere? Such issues compound in large systems. Looking more closely at what each function does, we can tell that the algorithm is the same; but in one case, we operate on numbers with one condition, and in the other, we operate on strings with a different condition. We can provide a generic algorithm parameterized on the type it operates on and the condition it checks for, as shown in the following listing. Such an algorithm does not depend on the other parts of the system, and we can reason about it in isolation.

Listing 1.10. Composable system

function first<T>(range: T[], p: (elem: T) => boolean)
    : T | undefined {
    for (let elem of range) {
        if (p(elem)) return elem;
    }
}

function findFirstNegativeNumber(numbers: number[])
    : number | undefined {
    return first(numbers, n => n < 0);
}

function findFirstOneCharacterString(strings: string[])
    : string | undefined {
    return first(strings, str => str.length == 1);
}

Don’t worry if the syntax of this looks a bit strange; we’ll cover inline functions such as n => n < 0 in chapter 5 and generics in chapters 9 and 10.

If we want to add logging to this implementation, we need only to update the implementation of first. Better still, if we figure out a more efficient algorithm, simply updating the implementation benefits all callers.
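As a sketch of that change, here is listing 1.10 with the logging requirement from listing 1.9 applied. Only first is touched; the explicit return undefined is our addition, to make the fall-through result visible.

```typescript
function first<T>(range: T[], p: (elem: T) => boolean)
    : T | undefined {
    for (let elem of range) {
        if (p(elem)) return elem;
    }
    console.error("No matching value found");   // The only new behavior
    return undefined;
}

// The two callers are unchanged.
function findFirstNegativeNumber(numbers: number[])
    : number | undefined {
    return first(numbers, n => n < 0);
}

function findFirstOneCharacterString(strings: string[])
    : string | undefined {
    return first(strings, str => str.length == 1);
}

console.log(findFirstNegativeNumber([3, -1, -7]));        // -1
console.log(findFirstOneCharacterString(["ab", "cd"]));   // undefined, after the error is logged
```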

As we’ll learn in chapter 10 when we discuss generic algorithms and iterators, we can make this function even more general. Currently, it only operates on an array of some type T. It can be extended to traverse any data structure.

If the code is not composable, we need a different function for each data type, data structure, and condition, even though they all fundamentally implement the same abstraction. Having the ability to abstract and then mix and match components reduces a lot of duplication. Generic types enable us to express these kinds of abstractions.

Having the ability to combine independent components yields a modular system and less code to maintain. Composability becomes important as the size of the code and the number of components increase. In a composable system, the parts are loosely coupled; at the same time, code does not get duplicated in each subsystem. New requirements can usually be incorporated by updating a single component instead of making large changes across the whole system. Understanding such a system also requires less thought, as we can reason about its parts in isolation.

1.3.5. Readability

Code is read many more times than it is written. Typing makes it clear what a function expects from its arguments, what the prerequisites for a generic algorithm are, what interfaces a class implements, and so on. This information is valuable because we can reason about readable code in isolation: just by looking at a definition, we should be able to easily understand how the code is supposed to work without having to navigate the sources to find callers and callees.

Naming and comments are important parts of this, too, but typing adds another layer of information, as it allows us to name constraints. Let’s look at an untyped find() function declaration in the following listing.

Listing 1.11. Untyped find()

declare function find(range: any, pred: any): any;

Just looking at this function, it’s hard to tell what kind of arguments it expects. We need to read the implementation, pass in our best guess and see whether we get a run-time error, or hope that the documentation covers this.

Contrast the following code with the previous declaration.

Listing 1.12. Typed find()

declare function first<T>(range: T[],
    p: (elem: T) => boolean): T | undefined;

Reading this declaration, we see that for any type T, we need to provide an array T[] as the range argument and a function that takes a T and returns a boolean as the p argument. We can also immediately see that the function is going to return a T or undefined.

Instead of having to find the implementation or look up the documentation, just reading this declaration tells us exactly what type of arguments to pass and reduces our cognitive load, as we can treat it as a self-contained, separate entity. Having such type information explicit, available not only to the compiler but also to the developer, makes understanding the code a lot easier.

Most modern languages provide some level of type inference, which means deducing the type of a variable based on context. This is useful, as it saves us redundant typing, but it becomes a problem when the compiler can easily understand code that people struggle to follow. A spelled-out type is much more valuable than a comment, as it is enforced by the compiler.

1.4. Types of type systems

Nowadays, most languages and run times provide some form of typing. We realized long ago that being able to interpret code as data and data as code can lead to catastrophic results. The main distinction between contemporary type systems lies in when types get checked and how strict the checks are.

With static typing, type checking is performed at compile time, so when compilation is done, the run-time values are guaranteed to have correct types. Dynamic typing, on the other hand, defers type checking to the run time, so type mismatches become run-time errors.

Strong typing does few if any implicit type conversions, whereas weaker type systems allow more implicit type conversions.

1.4.1. Dynamic and static typing

JavaScript is dynamically typed, and TypeScript is statically typed. In fact, TypeScript was created to add static type checking to JavaScript. Converting what would otherwise be run-time errors to compilation errors, especially in large applications, makes code more maintainable and resilient. This book focuses on static typing and statically typed languages, but it’s good to understand the alternative.

Dynamic typing does not impose any typing constraints at compile time. The colloquial name duck typing comes from the phrase “If it waddles like a duck and it quacks like a duck, it must be a duck.” Code can attempt to freely use a variable in any way it wants, and typing is applied by the run time. We can simulate dynamic typing in TypeScript by using the any keyword, which allows untyped variables.

We can implement a quacker() function that takes a duck argument of type any and calls quack() on it. As long as we pass it an object that has a quack() method, everything works. If, on the other hand, we pass something that can’t quack(), we get a run-time TypeError, as shown in the following listing.

Listing 1.13. Dynamic typing

function quacker(duck: any) {                                 1
    duck.quack();
}

quacker({ quack: function () { console.log("quack"); } });    2
quacker(42);                                                  3

1 The function takes an argument of type any, so it bypasses compile-time type checking.
2 We pass an object with a quack() method, so the call prints “quack.”
3 This causes a run-time error: TypeError: duck.quack is not a function.

Static typing, on the other hand, performs type checks at compile time, so attempting to pass an argument of the wrong type causes a compilation error. To leverage the static typing features of TypeScript, we can update the code by declaring a Duck interface and properly typing the function’s argument, as shown in listing 1.14. Note that in TypeScript, we do not have to explicitly declare that we are implementing the Duck interface. As long as we provide a quack() function, the compiler considers the interface to be implemented. In other languages, we would have to be explicit by declaring a class as implementing the interface.

Listing 1.14. Static typing

interface Duck {                                           1
    quack(): void;
}

function quacker(duck: Duck) {                             2
    duck.quack();
}

quacker({ quack: function () { console.log("quack"); } });
quacker(42);                                               3

1 Interface declaration for an object we expect has a quack() method
2 Updated function now requires an argument of type Duck.
3 Compile error: Argument of type ‘42’ is not assignable to parameter of type ‘Duck’.

Catching these types of errors at compile time, before they can cause a running program to malfunction, is the key benefit of static typing.
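The note above, that we do not have to explicitly declare that we implement the Duck interface, can be sketched with a class that never mentions Duck. The Robot class is hypothetical; it records its sounds in an array so that the result is easy to inspect.

```typescript
interface Duck {
    quack(): void;
}

// Hypothetical class: no "implements Duck", but a matching quack() method.
class Robot {
    sounds: string[] = [];

    quack(): void {
        this.sounds.push("beep-quack");
    }
}

function quacker(duck: Duck) {
    duck.quack();
}

const robot = new Robot();
quacker(robot);               // Compiles: Robot has the shape Duck requires.
// quacker("not a duck");     // Would not compile: a string has no quack().

console.log(robot.sounds.length);   // 1
```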

1.4.2. Weak and strong typing

We often hear the terms strong typing and weak typing to describe a type system. The strength of a type system describes how strict the system is with regard to enforcing type constraints. A weak type system implicitly tries to convert values from their actual types to the types expected when the value is used.

Consider this question: Does milk equal white? In a strongly typed world, no, milk is a liquid, and it makes no sense to compare it to a color. In a weakly typed world, we can say, “Well, milk’s color is white, so yes, it does equal white.” In the strongly typed world, we can explicitly convert milk to a color by making the question more explicit: Does the color of milk equal white? In the weakly typed world, we don’t need this refinement.

JavaScript is weakly typed. We can see this by using the any type in TypeScript and deferring to JavaScript to handle typing at run time. JavaScript provides two equality operators: ==, which checks whether two values are equal after implicitly converting them to a common type, and ===, which checks that both the values and their types are equal, as shown in the next listing. Because JavaScript is weakly typed, an expression such as "42" == 42 evaluates to true. This is surprising, because "42" is text, whereas 42 is a number.

Listing 1.15. Weak typing

const a: any = "hello world";
const b: any = 42;

console.log(a == b);        1

console.log("42" == b);     2

console.log("42" === b);    3

1 Prints “false,” though comparing a string to a number is allowed.
2 Prints “true”; the JavaScript run time implicitly converts the values to the same type.
3 Prints “false”; the === operator also compares the types.

Implicit type conversions are handy in that we don’t have to write more code to explicitly convert between types, but they are dangerous because in many cases we do not want conversions to happen and are surprised by the results. TypeScript, being strongly typed, doesn’t compile any of the preceding comparisons when we properly declare a to be a string and b to be a number, as the following listing shows.

Listing 1.16. Strong typing

const a: string = "hello world";    1
const b: number = 42;               1

console.log(a == b);                2

console.log("42" == b);             2

console.log("42" === b);            2

1 a and b are no longer declared as any, so they get type checked.
2 All three comparisons fail to compile, as TypeScript doesn’t allow comparing different types.

All the comparisons now cause the error "This condition will always return 'false' since the types 'string' and 'number' have no overlap". The type checker determines that we are trying to compare values of different types and rejects the code.
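To compare such values under strong typing, we make the conversion explicit, just as we asked for the color of milk rather than for milk itself. The following is a minimal sketch rather than one of the book’s listings: the names text, value, sameNumber, and sameText are illustrative assumptions, while Number() and String() are the standard JavaScript conversion functions.

```typescript
const text: string = "42";
const value: number = 42;

// Explicitly convert the string to a number, then compare two numbers.
const sameNumber: boolean = Number(text) === value;

// Or explicitly convert the number to a string, then compare two strings.
const sameText: boolean = text === String(value);

console.log(sameNumber);
console.log(sameText);
```

Both comparisons compile because each side of === now has the same type, and the conversion we want is visible to anyone reading the code.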

Although a weak type system is easier to work with in the short term, as it doesn’t force programmers to explicitly convert values between types, it does not provide the same guarantees we get from a stronger type system. Most of the benefits described in this chapter and the techniques employed in the rest of this book lose their effectiveness if type constraints are not properly enforced.

Note that although a type system is either dynamic (type checking at run time) or static (type checking at compile time), its strength lies on a spectrum: the more implicit conversions it performs, the weaker it is. Most type systems, even strong ones, provide some limited implicit casting for conversions that are deemed safe. A common example is conversion to boolean: if (a) compiles in most languages even if a is a number or a reference type. Another example is widening casts, which we’ll cover in detail in chapter 4. TypeScript uses only the number type to represent numeric values, but in languages with several integer widths, passing an 8-bit integer where a 16-bit integer is expected usually converts the value automatically, as there is no risk of data corruption. (A 16-bit integer can represent any value that an 8-bit integer can, and more.)
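The conversion to boolean can be sketched in TypeScript as follows. This is an illustration under stated assumptions, not a listing from the book: describeCount and describeCountExplicit are hypothetical helpers. The first relies on the implicit conversion the compiler accepts; the second spells out the same test.

```typescript
// Hypothetical helper: the condition implicitly converts a number to a boolean.
function describeCount(count: number): string {
    if (count) {    // 0 and NaN convert to false; every other number to true
        return "some";
    }
    return "none";
}

// The same check written without relying on the implicit conversion.
function describeCountExplicit(count: number): string {
    if (count !== 0 && !isNaN(count)) {
        return "some";
    }
    return "none";
}

console.log(describeCount(3));
console.log(describeCount(0));
console.log(describeCountExplicit(0));
```

TypeScript compiles both versions. The explicit form is longer but makes it clear which values count as "none".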

1.4.3. Type inference

In some cases, the compiler can infer the type of a variable or a function without us having to specify it explicitly. If we assign the value 42 to a variable, for example, the TypeScript compiler can infer that its type is number, so we don’t need to provide the type notations. We can do so if we want to be explicit and make the type clear to readers of the code, but the notation is not strictly required.

Similarly, if a function returns a value of the same type on each return statement, we don’t need to spell out its return type explicitly in the function definition. The compiler can infer it from the code, as shown in the next listing.

Listing 1.17. Type inference

function add(x: number, y: number) {    1
    return x + y;
}

let sum = add(40, 2);                   2

1 The function does not have an explicit return type, but the compiler infers it as number.
2 The type of the variable sum is not explicitly declared as number; rather, it is inferred.

Unlike dynamic typing, in which typing is performed only at run time, in these cases the typing is still determined and checked at compile time, but we don’t have to supply it explicitly. If typing is ambiguous, the compiler will issue an error and ask us to be more explicit by providing type notations.
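A few more cases of inference, and one in which we choose to be explicit, can be sketched as follows. This is a minimal illustration, not a listing from the book; the names primes, label, labels, and pending are assumptions made for the example.

```typescript
// Inferred as number[] from the initializer.
const primes = [2, 3, 5, 7];

// The parameter is annotated; the return type is inferred as string.
// (With the noImplicitAny compiler option enabled, omitting the annotation
// on n is reported as an error: the compiler has nothing to infer it from.)
function label(n: number) {
    return "prime " + n;
}

// Inferred as string[]: map() yields an array of label()'s return type.
const labels = primes.map((p) => label(p));

// An empty array literal says nothing about its elements,
// so we state the element type explicitly.
const pending: number[] = [];
pending.push(11);

console.log(labels.join(", "));
console.log(pending.length);
```

In every case the types are fixed at compile time. Passing labels to a function that expects number[], for example, is rejected even though we never wrote string[] anywhere.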

1.5. In this book

A strong, static type system enables us to write code that is more correct, more composable, and more readable. This book will cover common features of such modern type systems with a focus on practical applications of these features.

We’ll start with primitive types, the out-of-the-box types available in most languages. We’ll cover using them correctly and avoiding some common pitfalls. In some cases, we show how to implement some of these types if your particular language does not provide them natively.

Next, we’ll look at composition and how primitive types can be put together to build a large universe of types supporting your particular problem domain. There are multiple ways to combine types, so you’ll learn how to pick the right tool for the job depending on the particular problem you are trying to solve.

Then we will cover function types and the new implementations that open up to us when a type system can type functions and treat them as regular values. Functional programming is a very deep topic, so instead of attempting to explain it fully, we’ll borrow a set of useful concepts and apply them to a nonfunctional language to solve real-world problems.

The next step in the evolution of type systems, after being able to type values, compose types, and type functions, is subtyping. We’ll go over what makes a type a subtype of another type and see how we can apply some object-oriented programming concepts to our code. We’ll discuss inheritance, composition, and the less-traditional mix-ins.

We’ll continue with generics, which enable type variables and allow us to parameterize code on types. Generics open a whole new level of abstraction and composability, decoupling data from data structures, data structures from algorithms, and enabling adaptive algorithms.

Last, we’ll cover higher kinded types, which are the next level of abstraction, parameterizing generic types. Higher kinded types formalize data structures such as monoids and monads. Many programming languages do not support higher kinded types today, but their extensive use in languages such as Haskell and increasing popularity will eventually lead to their adoption across more established languages.

Summary

A type is a classification of data that defines the operations that can be done on that data, the meaning of the data, and the set of allowed values.
A type system is a set of rules that assigns types to the elements of a programming language and enforces them.
Types restrict the range of values a variable can take, so in some cases, what would’ve been a run-time error becomes a compile-time error.
Immutability is a property of the data enabled by typing, which ensures that values don’t change when they’re not supposed to.
Visibility is another type-level property that determines which components are allowed to access which data.
Generic programming enables powerful decoupling and code reuse.
Type notations make code easier to understand for readers of the code.
Dynamic typing (or duck typing) determines types at run time.
Static typing checks types at compile time, catching type errors that otherwise would’ve become run-time errors.
The strength of a type system is a measure of how many implicit conversions between types are allowed.
Modern type checkers have powerful type inference algorithms that enable them to determine the types of variables, functions, and so on without your having to write them out explicitly.

In chapter 2, we will look at primitive types, which are the building blocks of the type system. We’ll learn how to avoid some common mistakes that arise when using these types and see how we can build almost any data structure from arrays and references.
