We discussed recursion, which help us to iterate sequences easily, in the previous chapter. Furthermore, we need to discuss about optimizing code since it is an essential technique if we want to develop a good program. In a functional approach, we can use laziness and caching techniques to make our code efficient so that it will run faster. By discussing laziness and caching techniques, we will be able to develop efficient code. To learn more about laziness and caching techniques, in this chapter, we will discuss the following topics:
When we talk about being lazy in our daily activity, we might think about something we don't do but we actually have to do. Or, we might put off doing something just because we are lazy. In functional programming, laziness is analogous to our laziness in daily activities. The execution of particular code is deferred due to the concept of laziness thinking. In Chapter 5, Querying Any Collection Easily with LINQ we mentioned that LINQ implemented deferred execution when querying data from a collection.
The query will be executed only when it's enumerated. Now, let's discuss the laziness concept we can use in the functional approach.
In the .NET framework, there are some techniques to enumerate a collection of data, such as array and List<T>
. However, implicitly, they are eager evaluations since in an array, we have to define its size first and then fill in the allocated memory before we use it. List<T>
has a similar concept compared to array. It adopts the array mechanism. The difference between these two enumeration techniques is that we can easily expand the size in List<T>
rather than array.
In contrast, the .NET framework has IEnumerable<T>
to enumerate data collection, and fortunately, it will be evaluated lazily. Actually, array and List<T>
implement the IEnumerable<T>
interface, but because it has to be filled by data it has to be eagerly evaluated. We used this IEnumerable<T>
interface when we dealt with LINQ in Chapter 5, Querying Any Collection Easily with LINQ.
The IEnumerable<T>
interface implements the IEnumerable
interface, which is defined like this:
public interface IEnumerable<out T> : IEnumerable
There is only a single method that the IEnumerable<T>
interface has: GetEnumerator()
. The definition of this method is similar to what is shown in the following code:
IEnumerator<T> GetEnumerator()
As you can see, the GetEnumerator()
method returns the IEnumerator<T>
data type. This type has only three methods and a single property. Here are the methods and properties it has:
Current
: This is a property that stores the element of the collection for the current position of the enumerator.Reset()
: This is a method that sets the enumerator to the initial position, which is before the first element of the collection. The index of the initial position is usually -1 (minus one).MoveNext()
: This is a method to move the enumerator to the next collection element.Dispose()
: This is a method to free, release, or reset unmanaged resources. It's inherited from the IDisposable
interface.Now, let's play with the Fibonacci algorithm, which will generate infinite numbers. The algorithm will generate sequences by adding the previous two elements. In mathematical terms, the formula can be defined as follows:
Fn = Fn-1 + Fn-2
The first two numbers for the calculation of this algorithm can be 0 and 1 or 1 and 1.
Using this algorithm, we are going to prove that the IEnumerable
interface is a lazy evaluation. So, we create a class named FibonacciNumbers
, which implements the IEnumerable<Int64>
interface that we can find in the LazyEnumeration.csproj
project, as shown in the following code:
public partial class Program { public class FibonacciNumbers : IEnumerable<Int64> { public IEnumerator<Int64> GetEnumerator() { return new FibEnumerator(); } IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); } } }
Since the FibonacciNumbers
class implements the IEnumerable<T>
interface, it has the GetEnumerator()
method, as we discussed earlier, to enumerate the data collection. And because the IEnumerable<T>
interface implements the IEnumerator<T>
interface, we create the FibEnumerator
class, as shown in the following code:
public partial class Program { public class FibEnumerator : IEnumerator<Int64> { public FibEnumerator() { Reset(); } // To get the current element public Int64 Current { get; private set; } // To get the last element Int64 Last { get; set; } object IEnumerator.Current { get { return Current; } } public void Dispose() { ; // Do Nothing } public bool MoveNext() { if (Current == -1) { // Fibonacci algorithm // F0 = 0 Current = 0; } else if (Current == 0) { // Fibonacci algorithm // F1 = 1 Current = 1; } else { // Fibonacci algorithm // Fn = F(n-1) + F(n-2) Int64 next = Current + Last; Last = Current; Current = next; } // It's never ending sequence, // so the MoveNext() always TRUE return true; } public void Reset() { // Back to before first element // which is -1 Current = -1; } } }
Now, we have the FibEnumerator
class that implements the IEnumerator<T>
interface. Since the class implements IEnumerator<T>
, it has the Reset()
, MoveNext()
, and Dispose()
methods, as we have discussed already. It also has the Current
property from the implementation of the IEnumerator<T>
interface. We add the Last
property to save the last current number.
Now, it's time to create the caller to instantiate the FibonacciNumbers
class. We can create the GetFibonnacciNumbers()
function, which has an implementation similar to what is shown in the following code:
public partial class Program { private static void GetFibonnacciNumbers( int totalNumber) { FibonacciNumbers fibNumbers = new FibonacciNumbers(); foreach (Int64 number in fibNumbers.Take(totalNumber)) { Console.Write(number); Console.Write(" "); } Console.WriteLine(); } }
Because the FibonacciNumbers
class will enumerate infinite numbers, we have to use the Take()
method, as shown in the following code snippet, in order not to create an infinite loop:
foreach (Int64 number in fibNumbers.Take(totalNumber))
Suppose we need to enumerate 40 numbers from the sequence; we can pass 40 as an argument to the GetFibonnacciNumbers()
function, as follows:
GetFibonnacciNumbers(40)
And if we run the preceding function, we will get the following output on the console:
We can get the preceding output on the console since IEnumerable
is a lazy evaluation. This is because the MoveNext()
method will only be called to calculate the result if it's asked to do so. Imagine if it's not lazy and is always called; then, our previous code will spin and result in an infinite loop.
The simple example we have on lazy evaluation is when we deal with two Boolean statements and need to compare them. Let's take a look at the following code, which demonstrates lazy evaluation, which we can find in the SimpleLazyEvaluation.csproj
project:
public partial class Program { private static MemberData GetMember() { MemberData member = null; try { if (member != null || member.Age > 50) { Console.WriteLine("IF Statement is TRUE"); return member; } else { Console.WriteLine("IF Statement is FALSE"); return null; } } catch (Exception e) { Console.WriteLine("ERROR: " + e.Message); return null; } } }
And here is the MemberData
class we use in the preceding code:
public class MemberData { public string Name { get; set; } public string Gender { get; set; } public int Age { get; set; } }
And if we run the preceding GetMember()
method, we will get the following output on the console:
As we know, when we use the ||
(OR) operator in the Boolean expression, it will result in TRUE
if at least one expression is TRUE
. Now take a look at the following code snippet:
if (member != null || member.Age > 50)
In the preceding example, when the compiler finds that member != null
is FALSE
, it then evaluates the other expression, which is member.Age > 50
. Since the member is null, it has no Age
property; so, when we try to access this property, it will throw an exception.
Now, let's refactor the preceding code snippet into the following code using the &&
(AND) operator:
if (member != null && member.Age > 50)
The complete method named GetMemberANDOperator()
will be as follows:
public partial class Program { private static MemberData GetMemberANDOperator() { MemberData member = null; try { if (member != null && member.Age > 50) { Console.WriteLine("IF Statement is TRUE"); return member; } else { Console.WriteLine("IF Statement is FALSE"); return null; } } catch (Exception e) { Console.WriteLine("ERROR: " + e.Message); return null; } } }
If we run the preceding GetMemberANDOperator()
method, we will get the following output on the console:
Now, the if
statement has been successfully executed and it results in FALSE
after being evaluated. However, the member.Age > 50
expression is never evaluated in this case so that the exception is not thrown. The reason why the member.Age > 50
expression is not evaluated is that the compiler is too lazy to do it since the first expression, member != null
, is FALSE
and the result of this &&
logical operation will result in FALSE
regardless of the result of other expression. We can now say that laziness is ignoring another expression when it can decide the result using only one expression.
Some people may think that lazy evaluation is synonymous with nonstrict evaluation. However, it's not actually synonymous since the evaluation of a particular expression will be ignored if it's not needed in the lazy evaluation, while the reduction of evaluation will be applied in a nonstrict evaluation. Let's take a look at the following code in order to distinguish strict and nonstrict evaluation, which we can find in the NonStrictEvaluation.csproj
project:
public partial class Program { private static int OuterFormula(int x, int yz) { Console.WriteLine( String.Format( "Calculate {0} + InnerFormula({1})", x, yz)); return x * yz; } private static int InnerFormula(int y, int z) { Console.WriteLine( String.Format( "Calculate {0} * {1}", y, z )); return y * z; } }
In the preceding code, we are going to calculate the formula of x + (y * z)
. The InnerFormula()
function will calculate the multiplication of y
and z
, while the OuterFormula()
function will calculate the addition of x
and the result of y * z
. When evaluating the formula in strict evaluation, we first calculate the (y * z)
expression to retrieve the value and then add the result to the x
. The code will be like the following StrictEvaluation()
function:
public partial class Program { private static void StrictEvaluation() { int x = 4; int y = 3; int z = 2; Console.WriteLine("Strict Evaluation"); Console.WriteLine( String.Format( "Calculate {0} + ({1} * {2})",x, y, z)); int result = OuterFormula(x, InnerFormula(y, z)); Console.WriteLine( String.Format( "{0} + ({1} * {2}) = {3}",x, y, z, result)); Console.WriteLine(); } }
As you can see, in the preceding code we invoke the OuterFormula()
function like the following code snippet:
int result = OuterFormula(x, InnerFormula(y, z));
And for the strict evaluation we discussed earlier, the output we get on the console will be as follows:
As you can see in the preceding figure, when we calculate 4 + (3 * 2)
, we first calculate the result of (3 * 2)
and then after getting the result, we add it to 4
.
Now, let's compare it with nonstrict evaluation. In nonstrict evaluation, the +
operator is reduced first and then we reduce the inner formula, which is (y * z)
. We will see that the evaluation will be started from the outside to the inside. Now let's refactor the preceding OuterFormula()
function to the OuterFormulaNonStrict()
function, as shown in the following code:
public partial class Program { private static int OuterFormulaNonStrict( int x, Func<int, int, int> yzFunc) { int y = 3; int z = 2; Console.WriteLine( String.Format( "Calculate {0} + InnerFormula ({1})", x, y * z )); return x * yzFunc(3, 2); } }
As you can see, we modify the second parameter of the function into the Func<int, int, int>
delegate. We will call OuterFormulaNonStrict()
from the NonStrictEvaluation()
function, as follows:
public partial class Program { private static void NonStrictEvaluation() { int x = 4; int y = 3; int z = 2; Console.WriteLine("Non-Strict Evaluation"); Console.WriteLine( String.Format( "Calculate {0} + ({1} * {2})",x, y, z)); int result = OuterFormulaNonStrict(x, InnerFormula); Console.WriteLine( String.Format( "{0} + ({1} * {2}) = {3}",x, y, z, result)); Console.WriteLine(); } }
In the preceding code, we can see that we pass the InnerFormula()
function into the second parameter of the OuterFormulaNonStrict()
function, as shown in the following code snippet:
int result = OuterFormulaNonStrict(x, InnerFormula);
The expression in the preceding code snippet will be evaluated using nonstrict evaluation. To prove this, let's run the NonStrictEvaluation()
function, and we will get the following output on the console:
We can see that our expression is evaluated from the outside to the inside. The OuterFormulaNonStrict()
function is run first even though the result of the InnerFormula()
function is not retrieved yet. And if we run the OuterFormula()
function and the OuterFormulaNonStrict()
function consecutively, we will get a clear difference in the order of the evaluation, as shown in the following output screenshot:
Now, we can compare the difference. In strict evaluation, the calculation of (3 * 2)
is run first and then fed to the (4 + InnerFormula())
expression, while in nonstrict evaluation, the (4 + InnerFormula())
expression is run before the calculation of (3 * 2)
.
Lazy initialization is an optimizing technique where the object creation is deferred until it is used. It means that we can define an object but it won't be initialized if the member of the object is not accessed yet. C# introduced the Lazy<T>
class in C# 4.0, and we can use to lazily initialize the object. Now, let's take a look at the following code in order to demonstrate the lazy initialization that we can find in the LazyInitialization.csproj
project:
public partial class Program { private static void LazyInitName(string NameOfPerson) { Lazy<PersonName> pn = new Lazy<PersonName>( () => new PersonName(NameOfPerson)); Console.WriteLine( "Status: PersonName has been defined."); if (pn.IsValueCreated) { Console.WriteLine( "Status: PersonName has been initialized."); } else { Console.WriteLine( "Status: PersonName hasn't been initialized."); } Console.WriteLine( String.Format( "Status: PersonName.Name = {0}", (pn.Value as PersonName).Name)); if (pn.IsValueCreated) { Console.WriteLine( "Status: PersonName has been initialized."); } else { Console.WriteLine( "Status: PersonName hasn't been initialized."); } } }
We define the PersonName
class as follows:
public class PersonName { public string Name { get; set; } public PersonName(string name) { Name = name; Console.WriteLine( "Status: PersonName constructor has been called." ); } }
As you can see in the preceding LazyInitName()
function implementation, we lazily initialize the PersonName
object using the Lazy<T>
class, as shown in the following code snippet:
Lazy<PersonName> pn = new Lazy<PersonName>( () => new PersonName(NameOfPerson));
By doing this, PersonName
isn't actually initialized after we define the pn
variable, like what we usually get when we define the class directly using the following code:
PersonName pn = new PersonName( NameOfPerson);
Instead, using lazy initialization we access the object's member in order to initialize it, as discussed earlier. Lazy<T>
has a property named Value
, which gets the value for the Lazy<T>
instance. It also has the IsValueCreated
property to indicate whether a value has been created for this Lazy<T>
instance. In the LazyInitName()
function, we use the Value
property, as shown in the following code snippet:
Console.WriteLine( String.Format( "Status: PersonName.Name = {0}", (pn.Value as PersonName).Name));
We use (pn.Value as PersonName).Name
to access the Name
property of the PersonName
class, which has been instantiated by the pn
variable. We use the IsValueCreated
property to prove whether or not the PersonName
class has been initialized, as shown in the following code snippet:
if (pn.IsValueCreated) { Console.WriteLine( "Status: PersonName has been initialized."); } else { Console.WriteLine( "Status: PersonName hasn't been initialized."); }
Now let's run the LazyInitName()
function and pass Matthew Maxwell
as its argument, as shown in the following code:
LazyInitName("Matthew Maxwell");
We will get the following output on the console:
From the preceding screenshot, we have five lines of information. The first line we get is when PersonName
is defined. We then check the value of the IsValueCreated
property to find whether PersonName
has been initialized. We get the FALSE
result, which means that it's not initialized yet; so we have the second line of the information on the console. The next two lines are the interesting things we get from lazy initialization. When we access the Value
property of the Lazy<T>
class to retrieve the Name
property of the PersonName
instance, the code calls the constructor of PersonName
before it accesses the Name
property of PersonName
class. This is why we have lines 3 and 4 in the preceding console. And after we check the IsValueCreated
property again, we find that PersonName
has now been initialized and the pn
variable has an instance of PersonName
.
We have learned about laziness so far. We can also elaborate the advantages of being lazy, such as:
Besides the advantages, being lazy also has disadvantages, such as: