Functional programming and memoization

PHP is an imperative and not a declarative language, which means that programming is done using statements that alter the state of the program, just like other languages in the C family, and it is not composed of stateless expressions or declarations, like SQL for example. Though PHP is primarily a structural (procedural) and object-oriented programming language, we have seen, since PHP 5.3, more and more requests for change that asked for more and more structures that are functional in nature, such as generators and lambda functions (anonymous functions). Nevertheless, PHP remains for now a structural language in nature, especially when it comes to performance.

This being said, most functional programming techniques will yield fruit a few years from now, but there are still some functional programming techniques that can be used immediately in PHP that will improve performance as soon as you implement them in the code base of a project. One such technique is memoization.

Memoization is a functional programming technique in which the result of an expensive functional computation is stored and reused each time it is called within the same program. The idea is to return the static value of a function when it receives a certain input. Obviously, to avoid the invalidation of values, the function should be referentially transparent, which means that it should always return the same output when given a specific input. Of course, this comes in handy when you realize that a referentially transparent function is called many times along the critical path of an application and is computed every time. Memoization is an easy optimization to implement as it simply creates a cache to store the results of the computation.

Let's look at a simple example that will help us easily grasp the idea behind it. Let's say we have the following code along the critical path of an application:

// chap3_memoization_before.php 
 
$start = microtime(true); 
 
$x = 1; 
 
$data = []; 
 
function populateArray(Array $data, $x) 
{ 
    $data[$x] = $x; 
 
    $x++; 
 
    return $x <= 1000 ? populateArray($data, $x) : $data; 
} 
 
$data = populateArray($data, $x); 
 
$data = populateArray($data, $x); 
 
$data = populateArray($data, $x); 
 
$data = populateArray($data, $x); 
 
$data = populateArray($data, $x); 
 
$time = microtime(true) - $start; 
 
echo 'Time elapsed: ' . $time . PHP_EOL; 
 
echo memory_get_usage() . ' bytes' . PHP_EOL; 

Here, we see that the same function is called recursively many times. Moreover, it is a referentially transparent function. Therefore, it is a perfect candidate for memoization.

Let's start by checking its performance. If we execute the code, we will get the following result:

The results before implementing memoization

Now, let's implement a cache to memoize the results:

// chap3_memoization_after.php 
 
$start = microtime(true); 
 
$x = 1; 
 
$data = []; 
 
function populateArray(Array $data, $x) 
{ 
    static $cache = []; 
 
    static $key; 
 
    if (!isset($key)) { 
        $key = md5(serialize($x)); 
    } 
 
    if (!isset($cache[$key])) { 
 
        $data[$x] = $x; 
 
        $x++; 
 
        $cache[$key] = $x <= 1000 ? populateArray($data, $x) : $data; 
 
    } 
 
    return $cache[$key]; 
 
} 
 
$data = populateArray($data, $x); 
 
$data = populateArray($data, $x); 
 
$data = populateArray($data, $x); 
 
$data = populateArray($data, $x); 
 
$data = populateArray($data, $x); 
 
$time = microtime(true) - $start;
echo 'Time elapsed: ' . $time . PHP_EOL; 
 
echo memory_get_usage() . ' bytes' . PHP_EOL; 

Here are the results when executing this new version of the same code:

The results after implementing memoization

As we can see, the PHP script now runs much faster. The more often a referentially transparent function is called along the critical path of your application, the more the speed will increase when using memoization. Let's have a look at our script's performance using Blackfire.io.

Here are the results when executing the script without memoization:

The profiling report when not using memoization

Here are the results with memoization:

The profiling report when using memoization

The comparison shows that the memoized version of the script runs about eight times faster and consumes a third less memory. An important gain in performance for such an easy implementation.

One final question concerning memoization might be: can we cache the result between runs of the same script? Of course we can. It is up to you to determine the best way to cache it. You can use any standard way of caching a result. Also, there is at least one library that you can use to cache memoized results in PHP. You will find it at the following address: https://github.com/koktut/php-memoize. Please be aware that this library would not have been a good option for our last script as it doesn't work well with recursive tail-calls.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset