Chapter 8. Optimizing the Code using Laziness and Caching Techniques

We discussed recursion, which help us to iterate sequences easily, in the previous chapter. Furthermore, we need to discuss about optimizing code since it is an essential technique if we want to develop a good program. In a functional approach, we can use laziness and caching techniques to make our code efficient so that it will run faster. By discussing laziness and caching techniques, we will be able to develop efficient code. To learn more about laziness and caching techniques, in this chapter, we will discuss the following topics:

  • Implementing laziness in our code: lazy enumeration, lazy evaluation, nonstrict evaluation, and lazy initialization
  • The benefit of being lazy
  • Caching expensive resources using precomputation and memoization

Introduction to laziness

When we talk about being lazy in our daily activity, we might think about something we don't do but we actually have to do. Or, we might put off doing something just because we are lazy. In functional programming, laziness is analogous to our laziness in daily activities. The execution of particular code is deferred due to the concept of laziness thinking. In Chapter 5, Querying Any Collection Easily with LINQ we mentioned that LINQ implemented deferred execution when querying data from a collection.

The query will be executed only when it's enumerated. Now, let's discuss the laziness concept we can use in the functional approach.

Lazy enumeration

In the .NET framework, there are some techniques to enumerate a collection of data, such as array and List<T>. However, implicitly, they are eager evaluations since in an array, we have to define its size first and then fill in the allocated memory before we use it. List<T> has a similar concept compared to array. It adopts the array mechanism. The difference between these two enumeration techniques is that we can easily expand the size in List<T> rather than array.

In contrast, the .NET framework has IEnumerable<T> to enumerate data collection, and fortunately, it will be evaluated lazily. Actually, array and List<T> implement the IEnumerable<T> interface, but because it has to be filled by data it has to be eagerly evaluated. We used this IEnumerable<T> interface when we dealt with LINQ in Chapter 5, Querying Any Collection Easily with LINQ.

The IEnumerable<T> interface implements the IEnumerable interface, which is defined like this:

public interface IEnumerable<out T> : IEnumerable 

There is only a single method that the IEnumerable<T> interface has: GetEnumerator(). The definition of this method is similar to what is shown in the following code:

IEnumerator<T> GetEnumerator() 

As you can see, the GetEnumerator() method returns the IEnumerator<T> data type. This type has only three methods and a single property. Here are the methods and properties it has:

  • Current: This is a property that stores the element of the collection for the current position of the enumerator.
  • Reset(): This is a method that sets the enumerator to the initial position, which is before the first element of the collection. The index of the initial position is usually -1 (minus one).
  • MoveNext(): This is a method to move the enumerator to the next collection element.
  • Dispose(): This is a method to free, release, or reset unmanaged resources. It's inherited from the IDisposable interface.

Now, let's play with the Fibonacci algorithm, which will generate infinite numbers. The algorithm will generate sequences by adding the previous two elements. In mathematical terms, the formula can be defined as follows:

Fn = Fn-1 + Fn-2 

The first two numbers for the calculation of this algorithm can be 0 and 1 or 1 and 1.

Using this algorithm, we are going to prove that the IEnumerable interface is a lazy evaluation. So, we create a class named FibonacciNumbers, which implements the IEnumerable<Int64> interface that we can find in the LazyEnumeration.csproj project, as shown in the following code:

public partial class Program 
{ 
  public class FibonacciNumbers 
    : IEnumerable<Int64> 
  { 
    public IEnumerator<Int64> GetEnumerator() 
    { 
      return new FibEnumerator(); 
    } 
    IEnumerator IEnumerable.GetEnumerator() 
    { 
      return GetEnumerator(); 
    } 
  } 
} 

Since the FibonacciNumbers class implements the IEnumerable<T> interface, it has the GetEnumerator() method, as we discussed earlier, to enumerate the data collection. And because the IEnumerable<T> interface implements the IEnumerator<T> interface, we create the FibEnumerator class, as shown in the following code:

public partial class Program 
{ 
  public class FibEnumerator 
    : IEnumerator<Int64> 
  { 
    public FibEnumerator() 
    { 
      Reset(); 
    } 
    // To get the current element 
    public Int64 Current { get; private set; } 
    // To get the last element 
    Int64 Last { get; set; } 
    object IEnumerator.Current 
    { 
      get 
      { 
        return Current; 
      } 
    } 
    public void Dispose() 
    { 
      ; // Do Nothing 
    } 
    public bool MoveNext() 
    { 
      if (Current == -1) 
      { 
        // Fibonacci algorithm 
        // F0 = 0 
        Current = 0; 
      } 
      else if (Current == 0) 
      { 
        // Fibonacci algorithm 
        // F1 = 1 
        Current = 1; 
      } 
      else 
      { 
        // Fibonacci algorithm 
        // Fn = F(n-1) + F(n-2) 
        Int64 next = Current + Last; 
        Last = Current; 
        Current = next; 
      } 
      // It's never ending sequence, 
      // so the MoveNext() always TRUE 
      return true; 
    } 
    public void Reset() 
    { 
      // Back to before first element 
      // which is -1 
      Current = -1; 
    } 
  } 
} 

Now, we have the FibEnumerator class that implements the IEnumerator<T> interface. Since the class implements IEnumerator<T>, it has the Reset(), MoveNext(), and Dispose() methods, as we have discussed already. It also has the Current property from the implementation of the IEnumerator<T> interface. We add the Last property to save the last current number.

Now, it's time to create the caller to instantiate the FibonacciNumbers class. We can create the GetFibonnacciNumbers() function, which has an implementation similar to what is shown in the following code:

public partial class Program 
{ 
  private static void GetFibonnacciNumbers( 
    int totalNumber) 
  { 
    FibonacciNumbers fibNumbers = 
      new FibonacciNumbers(); 
    foreach (Int64 number in 
      fibNumbers.Take(totalNumber)) 
    { 
      Console.Write(number); 
      Console.Write("	"); 
    } 
    Console.WriteLine(); 
  } 
} 

Because the FibonacciNumbers class will enumerate infinite numbers, we have to use the Take() method, as shown in the following code snippet, in order not to create an infinite loop:

foreach (Int64 number in 
  fibNumbers.Take(totalNumber)) 

Suppose we need to enumerate 40 numbers from the sequence; we can pass 40 as an argument to the GetFibonnacciNumbers() function, as follows:

GetFibonnacciNumbers(40) 

And if we run the preceding function, we will get the following output on the console:

Lazy enumeration

We can get the preceding output on the console since IEnumerable is a lazy evaluation. This is because the MoveNext() method will only be called to calculate the result if it's asked to do so. Imagine if it's not lazy and is always called; then, our previous code will spin and result in an infinite loop.

Lazy evaluation

The simple example we have on lazy evaluation is when we deal with two Boolean statements and need to compare them. Let's take a look at the following code, which demonstrates lazy evaluation, which we can find in the SimpleLazyEvaluation.csproj project:

public partial class Program 
{ 
  private static MemberData GetMember() 
  { 
    MemberData member = null; 
    try 
    { 
      if (member != null || member.Age > 50) 
      { 
        Console.WriteLine("IF Statement is TRUE"); 
        return member; 
      } 
      else 
      { 
        Console.WriteLine("IF Statement is FALSE"); 
        return null; 
      } 
    } 
    catch (Exception e) 
    { 
      Console.WriteLine("ERROR: " + e.Message); 
      return null; 
    } 
  } 
} 

And here is the MemberData class we use in the preceding code:

public class MemberData 
{ 
  public string Name { get; set; } 
  public string Gender { get; set; } 
  public int Age { get; set; } 
} 

And if we run the preceding GetMember() method, we will get the following output on the console:

Lazy evaluation

As we know, when we use the || (OR) operator in the Boolean expression, it will result in TRUE if at least one expression is TRUE. Now take a look at the following code snippet:

if (member != null || member.Age > 50) 

In the preceding example, when the compiler finds that member != null is FALSE, it then evaluates the other expression, which is member.Age > 50. Since the member is null, it has no Age property; so, when we try to access this property, it will throw an exception.

Now, let's refactor the preceding code snippet into the following code using the && (AND) operator:

if (member != null && member.Age > 50) 

The complete method named GetMemberANDOperator() will be as follows:

public partial class Program 
{ 
  private static MemberData GetMemberANDOperator() 
  { 
    MemberData member = null; 
    try 
    { 
      if (member != null && member.Age > 50) 
      { 
        Console.WriteLine("IF Statement is TRUE"); 
        return member; 
      } 
      else 
      { 
        Console.WriteLine("IF Statement is FALSE"); 
        return null; 
      } 
    } 
    catch (Exception e) 
    { 
      Console.WriteLine("ERROR: " + e.Message); 
      return null; 
    } 
  } 
} 

If we run the preceding GetMemberANDOperator() method, we will get the following output on the console:

Lazy evaluation

Now, the if statement has been successfully executed and it results in FALSE after being evaluated. However, the member.Age > 50 expression is never evaluated in this case so that the exception is not thrown. The reason why the member.Age > 50 expression is not evaluated is that the compiler is too lazy to do it since the first expression, member != null, is FALSE and the result of this && logical operation will result in FALSE regardless of the result of other expression. We can now say that laziness is ignoring another expression when it can decide the result using only one expression.

Nonstrict evaluation

Some people may think that lazy evaluation is synonymous with nonstrict evaluation. However, it's not actually synonymous since the evaluation of a particular expression will be ignored if it's not needed in the lazy evaluation, while the reduction of evaluation will be applied in a nonstrict evaluation. Let's take a look at the following code in order to distinguish strict and nonstrict evaluation, which we can find in the NonStrictEvaluation.csproj project:

public partial class Program 
{ 
  private static int OuterFormula(int x, int yz) 
  { 
    Console.WriteLine( 
      String.Format( 
        "Calculate {0} + InnerFormula({1})", 
        x, 
        yz)); 
    return x * yz; 
  } 
  private static int InnerFormula(int y, int z) 
  { 
    Console.WriteLine( 
      String.Format( 
        "Calculate {0} * {1}", 
        y, 
        z 
        )); 
    return y * z; 
  } 
} 

In the preceding code, we are going to calculate the formula of x + (y * z). The InnerFormula() function will calculate the multiplication of y and z, while the OuterFormula() function will calculate the addition of x and the result of y * z. When evaluating the formula in strict evaluation, we first calculate the (y * z) expression to retrieve the value and then add the result to the x. The code will be like the following StrictEvaluation() function:

public partial class Program 
{ 
  private static void StrictEvaluation() 
  { 
    int x = 4; 
    int y = 3; 
    int z = 2; 
    Console.WriteLine("Strict Evaluation"); 
    Console.WriteLine( 
      String.Format( 
        "Calculate {0} + ({1} * {2})",x, y, z)); 
    int result = OuterFormula(x, InnerFormula(y, z)); 
    Console.WriteLine( 
      String.Format( 
        "{0} + ({1} * {2}) = {3}",x, y, z, result)); 
    Console.WriteLine(); 
  } 
} 

As you can see, in the preceding code we invoke the OuterFormula() function like the following code snippet:

int result = OuterFormula(x, InnerFormula(y, z)); 

And for the strict evaluation we discussed earlier, the output we get on the console will be as follows:

Nonstrict evaluation

As you can see in the preceding figure, when we calculate 4 + (3 * 2), we first calculate the result of (3 * 2) and then after getting the result, we add it to 4.

Now, let's compare it with nonstrict evaluation. In nonstrict evaluation, the + operator is reduced first and then we reduce the inner formula, which is (y * z). We will see that the evaluation will be started from the outside to the inside. Now let's refactor the preceding OuterFormula() function to the OuterFormulaNonStrict() function, as shown in the following code:

public partial class Program 
{ 
  private static int OuterFormulaNonStrict( 
    int x, 
    Func<int, int, int> yzFunc) 
  { 
    int y = 3; 
    int z = 2; 
    Console.WriteLine( 
      String.Format( 
        "Calculate {0} + InnerFormula ({1})", 
        x, 
        y * z 
        )); 
    return x * yzFunc(3, 2); 
  } 
} 

As you can see, we modify the second parameter of the function into the Func<int, int, int> delegate. We will call OuterFormulaNonStrict() from the NonStrictEvaluation() function, as follows:

public partial class Program 
{ 
  private static void NonStrictEvaluation() 
  { 
    int x = 4; 
    int y = 3; 
    int z = 2; 
    Console.WriteLine("Non-Strict Evaluation"); 
    Console.WriteLine( 
      String.Format( 
        "Calculate {0} + ({1} * {2})",x, y, z)); 
    int result = OuterFormulaNonStrict(x, InnerFormula); 
    Console.WriteLine( 
      String.Format( 
        "{0} + ({1} * {2}) = {3}",x, y, z, result)); 
    Console.WriteLine(); 
  } 
} 

In the preceding code, we can see that we pass the InnerFormula() function into the second parameter of the OuterFormulaNonStrict() function, as shown in the following code snippet:

int result = OuterFormulaNonStrict(x, InnerFormula); 

The expression in the preceding code snippet will be evaluated using nonstrict evaluation. To prove this, let's run the NonStrictEvaluation() function, and we will get the following output on the console:

Nonstrict evaluation

We can see that our expression is evaluated from the outside to the inside. The OuterFormulaNonStrict() function is run first even though the result of the InnerFormula() function is not retrieved yet. And if we run the OuterFormula()function and the OuterFormulaNonStrict() function consecutively, we will get a clear difference in the order of the evaluation, as shown in the following output screenshot:

Nonstrict evaluation

Now, we can compare the difference. In strict evaluation, the calculation of (3 * 2) is run first and then fed to the (4 + InnerFormula()) expression, while in nonstrict evaluation, the (4 + InnerFormula()) expression is run before the calculation of (3 * 2).

Lazy initialization

Lazy initialization is an optimizing technique where the object creation is deferred until it is used. It means that we can define an object but it won't be initialized if the member of the object is not accessed yet. C# introduced the Lazy<T> class in C# 4.0, and we can use to lazily initialize the object. Now, let's take a look at the following code in order to demonstrate the lazy initialization that we can find in the LazyInitialization.csproj project:

public partial class Program 
{ 
  private static void LazyInitName(string NameOfPerson) 
  { 
    Lazy<PersonName> pn = 
      new Lazy<PersonName>( 
        () => 
          new PersonName(NameOfPerson)); 
    Console.WriteLine( 
      "Status: PersonName has been defined."); 
    if (pn.IsValueCreated) 
    { 
      Console.WriteLine( 
        "Status: PersonName has been initialized."); 
    } 
    else 
    { 
      Console.WriteLine( 
        "Status: PersonName hasn't been initialized."); 
    } 
    Console.WriteLine( 
      String.Format( 
        "Status: PersonName.Name = {0}", 
        (pn.Value as PersonName).Name)); 
    if (pn.IsValueCreated) 
    { 
      Console.WriteLine( 
        "Status: PersonName has been initialized."); 
    } 
    else 
    { 
      Console.WriteLine( 
        "Status: PersonName hasn't been initialized."); 
    } 
  } 
} 

We define the PersonName class as follows:

public class PersonName 
{ 
  public string Name { get; set; } 
  public PersonName(string name) 
  { 
    Name = name; 
    Console.WriteLine( 
      "Status: PersonName constructor has been called." 
      ); 
  } 
} 

As you can see in the preceding LazyInitName() function implementation, we lazily initialize the PersonName object using the Lazy<T> class, as shown in the following code snippet:

Lazy<PersonName> pn = 
  new Lazy<PersonName>( 
    () => 
      new PersonName(NameOfPerson)); 

By doing this, PersonName isn't actually initialized after we define the pn variable, like what we usually get when we define the class directly using the following code:

PersonName pn = 
  new PersonName( 
    NameOfPerson); 

Instead, using lazy initialization we access the object's member in order to initialize it, as discussed earlier. Lazy<T> has a property named Value, which gets the value for the Lazy<T> instance. It also has the IsValueCreated property to indicate whether a value has been created for this Lazy<T> instance. In the LazyInitName() function, we use the Value property, as shown in the following code snippet:

Console.WriteLine( 
  String.Format( 
    "Status: PersonName.Name = {0}", 
    (pn.Value as PersonName).Name)); 

We use (pn.Value as PersonName).Name to access the Name property of the PersonName class, which has been instantiated by the pn variable. We use the IsValueCreated property to prove whether or not the PersonName class has been initialized, as shown in the following code snippet:

if (pn.IsValueCreated) 
{ 
  Console.WriteLine( 
    "Status: PersonName has been initialized."); 
} 
else 
{ 
  Console.WriteLine( 
    "Status: PersonName hasn't been initialized."); 
} 

Now let's run the LazyInitName() function and pass Matthew Maxwell as its argument, as shown in the following code:

LazyInitName("Matthew Maxwell"); 

We will get the following output on the console:

Lazy initialization

From the preceding screenshot, we have five lines of information. The first line we get is when PersonName is defined. We then check the value of the IsValueCreated property to find whether PersonName has been initialized. We get the FALSE result, which means that it's not initialized yet; so we have the second line of the information on the console. The next two lines are the interesting things we get from lazy initialization. When we access the Value property of the Lazy<T> class to retrieve the Name property of the PersonName instance, the code calls the constructor of PersonName before it accesses the Name property of PersonName class. This is why we have lines 3 and 4 in the preceding console. And after we check the IsValueCreated property again, we find that PersonName has now been initialized and the pn variable has an instance of PersonName.

The advantages and disadvantages of being lazy

We have learned about laziness so far. We can also elaborate the advantages of being lazy, such as:

  • We don't need to pay the initialization time for features we don't use
  • The program execution becomes more efficient because sometimes, in the functional approach, the order of the execution is not important compared to the imperative approach
  • Being lazy will make a programmer write better code by writing efficient code

Besides the advantages, being lazy also has disadvantages, such as:

  • The flow of the application is hard to predict and we can lose control over our application sometimes
  • The code complexity in laziness can cause a bookkeeping overhead
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset