ReScript has two first-class data structures that we can use to represent ordered collections of data: arrays and lists. Arrays are almost identical to JavaScript arrays, while lists are linked lists with immutable contents. In this chapter, we’ll go over syntax and examples for lists and arrays, and discuss differences and use cases for each one. We’ll also cover three important higher-order functions that can be used with collections: map, filter, and reduce.
Arrays
ReScript’s arrays will feel very familiar to readers coming from a JavaScript background. Arrays look and feel almost exactly the same as arrays in JavaScript, and they are an important part of interoperability between ReScript and JavaScript because ReScript arrays are compiled to JavaScript arrays. This means that arrays returned from JavaScript libraries can be directly used in ReScript, and we can pass ReScript arrays directly into JavaScript functions that expect arrays as inputs.
ReScript arrays require all elements to be the same type.
ReScript arrays do not have methods; instead, we can operate on arrays with standard library functions that accept the array as an argument.
Array Standard Library
As with strings, standard library functions for arrays can be found in several places. For newcomers with a background in JavaScript, I recommend using Js.Array2 because the functions in that library correspond directly to JavaScript’s array standard library. Additional functions for arrays can be found in Belt.Array.
Js.Array2 contains many common array operations, written as functions instead of methods. The functions in the two libraries I mentioned are all written “data first,” meaning that the array is passed in as the first argument to the function. This allows us to take advantage of the pipe operator and chain function calls the same way we would chain method calls in JavaScript.
Note on Accessing Arrays
One thing to be careful of: in ReScript, the array access syntax a[b] is just shorthand for Array.get(a, b). This means that its behavior depends on the definition of Array.get in our current scope.
There is one important case where this comes up – the implementation and type signature of Array.get differs between Js.Array2 and Belt.Array.
The default behavior for array accesses will throw if we try to access an index that’s out-of-bounds. In contrast, the Belt standard library is designed to protect against out-of-bounds exceptions. Therefore, Belt.Array.get returns an optional value: if the index is out-of-bounds, it returns None; otherwise, it returns Some(v) where v is the value at that index.
This means that when we use open Belt in some scope, all the array accesses in that scope will need to be unwrapped explicitly.
Higher-Order Functions for Arrays
One of the most common patterns for arrays is looping through the array to calculate some result, whether it’s building up a new array or calculating a single value. Higher-order functions allow us to perform these operations on arrays without having to write loops or update variables. This makes array operations easier to write and reduces potential mistakes from mutating variables.
To illustrate what I mean, let’s go over the most important higher-order functions for operating on arrays: map, filter, and reduce.
Map
Map is for situations where we want to make a new array by applying some operation to every value of the input array.
The collection – In our case, an array.
The mapping function – This is a helper function that is applied to each element of the input array. The outputs of this function are stored in a new array and returned from the whole map operation.
Map is named as such because the helper function defines a mapping from the input array’s values to the new array’s values. The function is applied to each value individually, and the order and number of elements in the new array is the same as the original array.
For example, a diagram of the map operation to double the values of an array would look something like this:
As you can see, the version that uses higher-order functions is much cleaner – we just pass a helper function to map the value of each element, and we don’t need to worry about setting up the loop or building the new array.
The signature for Js.Array2.map is (array<'a>, 'a => 'b) => array<'b>, where 'a is the type of the input array’s contents and 'b is the type of the output array’s contents.
Filter
Filter returns a new array containing only the elements of the input array that fulfill a certain condition – for example, getting only the even numbers from an array of integers.
Filter takes in two arguments:
The collection – In our case, an array.
The predicate – This is a helper function that gets called on each element of the array and returns a boolean. The resulting array will only include the elements of the original array for which this function returns true.
The condition is applied to each value individually from left to right, and the resulting array’s elements are in the same order as they were in the input array.
To illustrate, here’s a diagram showing how filtering only the even numbers from the array [1, 2, 3, 4, 5] would work:
The signature for Js.Array2.filter is (array<'a>, 'a => bool) => array<'a>.
Again, you can see that the loop version is much harder to read compared to using the higher-order filter function.
Reduce
The reduce operation is named as such because it takes a collection of values and reduces it to a single value. The resulting value can be anything – pairs, objects, even another array.
Under the hood, the reduce function actually works just like the for-loop – it just handles the boilerplate of iterating through the array and tracking the accumulated value for you. Here’s how:
The collection – In our case, the array.
The reducer function – This helper function takes in the accumulator value and an element in the array, and returns a new accumulator value. It repeats this process for each element, returning the final result when it reaches the end of the array.
The initial value – This is the initial value for the accumulator used when the reducer is called for the first time, or, if the collection is empty, this value is returned from the reduce operation.
Note Js.Array2.reduce has a different order of arguments compared to Belt.Array.reduce. The former takes in the array, reducer function, and initial value in that order, while the latter takes in the initial value before the reducer function.
- 1.
The initial accumulator value is 0. The reducer function is called with the accumulator value and next element of the array (1). The result is 1.
- 2.
The accumulator value is 1. The reducer function is called with the accumulator value and next element of the array (2). The result is 3.
- 3.
The accumulator value is 3. The reducer function is called with the accumulator value and next element of the array (3). The result is 6 – since there are no more elements in the array, this value is returned from the reduce function.
After seeing the previous example, you may be tempted to implement very complex operations as a single call to reduce. Do not do this. The point of these higher-order functions is to allow programmers to break down complex operations into smaller, more composable pieces. Grouping everything into a single reduce operation does just the opposite, increasing complexity and making code harder to read, making it just as bad (or even worse) than using a single massive loop.
Although reduce is a very general operation that can be used for nearly everything, that doesn’t mean that it should be used for everything. Compared to more specialized functions like map and filter, reduce is more verbose and less readable. When a more specialized function exists, we should use that instead of reduce.
Composing Higher-Order Functions
Oftentimes, complex operations on a collection can be broken down into a series of map/filter/reduce operations that each does a single operation on the data. This keeps our code clean and organized, since now the code is just piping data through a series of simple transformations. We don’t need to worry about conditions for loops or maintaining any outside state, and each simple operation in our pipeline is easy to understand and write, reducing the risk of bugs.
Higher-Order Functions in Action
Calculate the baggage fee of each item.
Filter out overweight luggage items.
Return the total weight and total cost of the remaining items.
Now that our luggage is priced and filtered, we can calculate the total weight and cost by using Js.Array2.reduce.
It’s possible (and slightly more efficient) to calculate both totals at the same time by reducing to a tuple, but separating the two operations keeps our code more readable.
Generalizing Higher-Order Functions
Higher-order functions are really flexible and can be applied in a wide variety of situations, because ultimately the behavior of map/filter/reduce is determined by the function that the caller provides. This means that a huge set of complex operations can be implemented by using some combination of these basic functions.
partition – Returns a tuple of two arrays, the first one holding the values that filter would have returned and the second one holding the values that filter would have filtered out.
some – Returns true if the predicate evaluates to true for any value in the array.
every – Returns true if the predicate evaluates to true for every value in the array.
reduceReverse – Like reduce, except it iterates through the array in reverse order. myArray->reduceReverse(...) is equivalent to myArray->reverse->reduce(...). In Belt.Array, this is called reduceReverse, while in Js.Array2 this is called reduceRight.
forEach – This function iterates through the array and performs some operation that returns unit on each value. The entire function returns unit, so it’s like a functional version of a for-loop. It’s useful for performing some mutation or side effects, such as printing all the values in an array one by one.
Higher-order functions can also be generalized across different types of collections. The functions I discussed here don’t just apply to arrays. As you’ll see later with lists, sets, dictionaries, and more, any collection of values can be mapped or filtered or reduced.
Many other languages also support map, filter, and reduce in their standard libraries: arrays in JavaScript and streams in Java are just a few examples of data structures that support these operations.
Lists
ReScript’s list is an immutable, singly linked list. Like other immutable data structures, there is no way to change a value in the list once it is set – instead, we can build new lists with the updates that we want to apply.
The structure of a linked list is defined recursively – every linked list is either an empty list or a node consisting of its first element (called the “head”) and a pointer to the remainder of the list (called the “tail”). A one-element list is a node pointing to an empty list, a two-element list is a node pointing to a one-element list, and so on.
Building a List
Prepending an element to a list doesn’t change the list; it creates a new list that’s the same as the old list with an extra element in front. The syntax reflects this, where again we use the list{} constructor with the element we are adding followed by a spread of the list we are adding it to.
Immutability and Lists
One of the main benefits of using lists is that immutable updates are cheap. If we wanted to update the first element of an array without losing the previous state, we would have to copy every element of the array and then apply the change to the new array – quite an expensive operation. On the other hand, immutable updates to lists are very efficient – the operation runs in constant time because it does not need to copy any data.
Prepending an element to a list just creates a new node that contains the prepended value (hd) and a pointer to the rest of the list (tl). That means that newList points to a list with the new value added, while oldList still points to the list without the new value.
You might wonder: if both lists are pointed to the same objects, what if one of the lists changes – wouldn’t that affect the other? The key to this is immutability. This type of cheap immutable update only works with immutable data like lists, since immutability means there is no way to change the data stored in an existing list.
Pattern Matching with List
In the preceding pattern, hd is bound to the head (first element) of the list and tl is bound to the tail (rest of the list). As always, these are just arbitrary names – we could have picked other names like first and rest.
We can destructure more than just the first element of the list.
Omitting the spread operator on the last part of the pattern lets us match against lists with a fixed number of elements.
When accessing lists with pattern matching, we always handle at least two fundamental cases: when the list is empty and when the list has at least one element. This lets us guarantee that list accesses are safe since we can only access elements that are actually there, unlike random indexing which can potentially cause an out-of-bounds exception.
We don’t have to write recursive pattern matching functions every time we want to do something with a list, because the standard library provides many utilities for lists. The Belt.List standard library module supports general operations like map, keep (filter), reduce, and many more, which we can compose together to perform complex computations on lists, just like how we did with arrays.
Higher-Order Functions with Lists
While pattern matching is powerful and gives us a lot of power to implement custom logic, most of the time we don’t have to use pattern matching and recursion to work with lists. Just like arrays, lists also support the higher-order functions map, keep (filter), and reduce via the Belt.List standard library. There are slight differences in naming and argument order, but aside from that usage is virtually identical to arrays.
To demonstrate, let’s implement the luggage example using lists. Recall our goal was to write a processLuggage function that takes in a list of owner + weight tuples and calculates the total weight and cost of all luggage items weighing less than 200lb.
Drawbacks of Lists
Because ReScript’s lists are modeled as singly linked lists, reading or inserting at any position besides the head is inefficient because we can only traverse the list one element at a time starting from the head. Although there are standard library functions that let us read or update arbitrary positions in the list, these operations are slow and expensive compared to their array equivalents.
Traversing the entire list each time we add a value to the end is inefficient, especially when we have a large list or want to perform the operation many times.
Since lists compile to nested JavaScript objects, the output is less readable than arrays when printed using Js.log. When debugging or serializing data in lists, it’s best to convert them to arrays using Belt.List.toArray.
Use Cases for Immutable Collections
For some readers, immutable data structures may be a relatively foreign concept. Why would we care about immutability, and what benefits does immutability bring to our programs?
Having immutability prevents errors caused by unexpected mutations. With immutable data structures like lists, once we create a list with some particular contents, we know that the original list will never change. We can pass the list around to various functions and not have to worry about those functions adding, removing, or changing elements in the list.
Immutability makes program state much easier to reason about. If we represent the state of the program with a mutable data structure, then the old state information is lost whenever the state changes, unless we explicitly make a copy of the data. Working with immutable data structures, we can easily write functions that take in a state and output a new state without changing the original. This lets us compare the old state and the new state, making it easy to validate and test state transitions or even roll back the state to the previous version.
Here’s an example: say we’re coding a board game, and we want to check to see if a move is legal. With an immutable state, we can create the board state after the move and validate the board, without affecting the current game state. This is also useful is in programs with multiple steps – preserving past states allows users to quickly undo commands if something goes wrong. To accomplish the same thing with a mutable state, we would need to make a deep copy of the state whenever anything changes. Doing so would be error-prone and terrible for performance, so in those situations immutability really shines.
Immutability goes hand in hand with purity – since a function that mutates any state outside of itself is considered impure, using immutable data structures makes it easier to write pure code. A function that uses or manipulates mutable external states or mutable inputs may have inconsistent behavior when called with the same input multiple times. Testing these functions would require tracking and resetting the state for each test case, and developers would need to consider much more than just the inputs of the function when making changes. On the other hand, a function that does not rely on any mutable state will always return the same thing when called with the same inputs, making it much easier to modify and test.
It’s important to keep in mind that at the end of the day that immutability is not some special feature of the computer (after all, the memory of the computer is mutable); immutability is an abstraction at the programming language level. If we pass a ReScript list to a JavaScript function, the data can be mutated since it’s just an object, and there is nothing stopping us from mutating objects in JavaScript. The safety and immutability guarantees are a feature of the language and type system that we’re using.
Lists vs. Arrays
When should I use an array and when should I use a list? Arrays are more generally useful and versatile, while lists are less flexible but provide additional safety guarantees through pattern matching and immutability.
For a lot of common use cases like iterating through a collection or performing common operations like map and filter, either data structure can be used.
There will be a lot of elements, and performance is a concern.
You want to read/update/add elements at arbitrary positions.
The data will be passed between JavaScript and ReScript code.
You want immutability, or your data is read-only.
You want better pattern matching and type safety.
Your data will only be used within ReScript code.
If your program’s requirements do not fit cleanly in one category, it is also possible to convert between lists and arrays, but there is a performance cost if this is done too frequently.
Final Thoughts
While ReScript’s lists may seem like a strange data structure, they are very powerful because they offer immutable collections built into the language. With their recursive structure, they are a natural fit for ReScript’s powerful pattern matching, allowing us to write cleaner and more elegant code than we could with arrays.
The choice of using lists or arrays ultimately comes down to how we want to write our programs. If we want to write code in a more functional style and want the benefits of immutability, lists are the natural choice. If we want our code to look and feel more like JavaScript, then arrays are the better choice.
In the next chapter, we’ll discuss other collections in ReScript’s standard libraries: sets, maps, stacks, and queues.