Profiling Python

There are a few different ways to profile a Python application. Profiling means having the application run while keeping track of several different parameters, like the number of times a function is called, the amount of time spent inside it, and so on. Profiling can help us find the bottlenecks in our application, so that we can improve only what is really slowing us down.

If you take a look at the profiling section in the standard library official documentation, you will see that there are a couple of different implementations of the same profiling interface: profile and cProfile.

  • cProfile is recommended for most users; it's a C extension with reasonable overhead, which makes it suitable for profiling long-running programs
  • profile is a pure Python module whose interface is imitated by cProfile, but which adds significant overhead to profiled programs

This interface does deterministic profiling, which means that all function calls, function returns, and exception events are monitored, and precise timings are made for the intervals between these events. Another approach, called statistical profiling, randomly samples the effective instruction pointer, and deduces where time is being spent.

The latter usually involves less overhead, but provides only approximate results. Moreover, because of the way the Python interpreter runs the code, deterministic profiling doesn't add as much overhead as one might think, so I'll show you a simple example using cProfile from the command line.
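As a quick aside, cProfile can also be invoked from within a program rather than from the command line; here is a minimal sketch, using a toy function of my own invention:

```python
import cProfile

def sum_of_squares(n):
    # A toy workload to profile (illustrative only).
    return sum(i * i for i in range(n))

# Profile the statement and print the deterministic report; runctx lets
# us pass the global and local namespaces explicitly.
cProfile.runctx('sum_of_squares(10**5)', globals(), locals())
```

The report this prints has the same columns you will see in the command-line example that follows.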

We're going to calculate Pythagorean triples (I know, you've missed them...) using the following code:

profiling/triples.py

def calc_triples(mx):
    triples = []
    for a in range(1, mx + 1):
        for b in range(a, mx + 1):
            hypotenuse = calc_hypotenuse(a, b)
            if is_int(hypotenuse):
                triples.append((a, b, int(hypotenuse)))
    return triples

def calc_hypotenuse(a, b):
    return (a**2 + b**2) ** .5

def is_int(n):  # n is expected to be a float
    return n.is_integer()

triples = calc_triples(1000)

The script is extremely simple; we iterate over the interval [1, mx] with a and b (avoiding repetition of pairs by requiring b >= a) and we check whether they belong to a right triangle. We use calc_hypotenuse to get the hypotenuse for a and b, and then, with is_int, we check whether it is an integer, which would mean (a, b, c) is a Pythagorean triple.

When we profile this script, we get information in tabular form. The columns are ncalls, tottime, percall, cumtime, percall, and filename:lineno(function). They represent the number of calls we made to a function, how much time we spent in it, and so on. I'll trim a couple of columns to save space, so if you run the profiling yourself, don't worry if you get a different result.

$ python -m cProfile profiling/triples.py
1502538 function calls in 0.750 seconds
Ordered by: standard name
ncalls  tottime  percall filename:lineno(function)
500500    0.469    0.000 triples.py:14(calc_hypotenuse)
500500    0.087    0.000 triples.py:18(is_int)
     1    0.000    0.000 triples.py:4(<module>)
     1    0.163    0.163 triples.py:4(calc_triples)
     1    0.000    0.000 {built-in method exec}
  1034    0.000    0.000 {method 'append' of 'list' objects}
     1    0.000    0.000 {method 'disable' of '_lsprof.Profil...
500500    0.032    0.000 {method 'is_integer' of 'float' objects}
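By the way, for larger reports it is often handier to collect the statistics programmatically and sort them with the pstats module; here is a minimal sketch, again using a toy workload of my own:

```python
import cProfile
import pstats

def work():
    # Toy workload, for illustration only.
    return sum(i * i for i in range(10**5))

profiler = cProfile.Profile()
profiler.enable()
work()
profiler.disable()

# Build a Stats object from the profiler, sort the report by
# cumulative time, and print only the top 5 entries.
stats = pstats.Stats(profiler)
stats.strip_dirs().sort_stats('cumulative').print_stats(5)
```

Sorting by cumulative time is usually the quickest way to spot which call trees dominate the runtime.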

Even with this limited amount of data, we can still infer some useful information about this code. Firstly, we can see that the time complexity of the algorithm we have chosen grows with the square of the input size. The number of times we execute the inner loop body is exactly mx * (mx + 1) / 2. We run the script with mx = 1000, which means we get 500500 times inside the inner for loop. Three main things happen inside that loop: we call calc_hypotenuse, we call is_int, and, if the condition is met, we append to the triples list.
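We can verify that count with a couple of lines:

```python
mx = 1000

# Count the inner loop iterations directly...
count = sum(1 for a in range(1, mx + 1) for b in range(a, mx + 1))

# ...and check it against the closed form mx * (mx + 1) / 2.
assert count == mx * (mx + 1) // 2 == 500500
```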

Taking a look at the profiling report, we notice that the algorithm has spent 0.469 seconds inside calc_hypotenuse, which is way more than the 0.087 seconds spent inside is_int, given that they were called the same number of times, so let's see if we can boost calc_hypotenuse a little.

As it turns out, we can. As I mentioned earlier on in the book, the power operator ** is quite expensive, and in calc_hypotenuse, we're using it three times. Fortunately, we can easily transform two of those into simple multiplications, like this:

profiling/triples.py

def calc_hypotenuse(a, b):
    return (a*a + b*b) ** .5

This simple change should improve things. If we run the profiling again, we see that the 0.469 seconds is now down to 0.177 seconds. Not bad! This means we're now spending only about 37% of the time inside calc_hypotenuse, compared to before.
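If you want to check a micro-optimization like this in isolation, the timeit module from the standard library is handy; here is a rough sketch (the absolute numbers will vary with your machine and interpreter):

```python
from timeit import timeit

# Time one million evaluations of each expression, with sample values
# for a and b passed in via the globals argument.
pow_version = timeit('(a**2 + b**2) ** .5',
                     globals={'a': 3, 'b': 4}, number=1_000_000)
mul_version = timeit('(a*a + b*b) ** .5',
                     globals={'a': 3, 'b': 4}, number=1_000_000)

# On CPython the multiplication variant is typically faster, but the
# exact figures depend on your hardware and Python version.
print(pow_version, mul_version)
```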

Let's see if we can improve is_int as well, by changing it like this:

profiling/triples.py

def is_int(n):
    return n == int(n)

This implementation is different; one advantage is that it also works when n is an integer. Alas, when we run the profiling against it, we see that the time taken inside the is_int function has gone up to 0.141 seconds. This means that it has roughly doubled, compared to what it was before. In this case, we need to revert to the previous implementation.
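Again, timeit can confirm the difference in isolation; a rough sketch (numbers will vary on your machine):

```python
from timeit import timeit

# Compare the two checks on a sample float value.
method_version = timeit('n.is_integer()',
                        globals={'n': 5.0}, number=1_000_000)
cast_version = timeit('n == int(n)',
                      globals={'n': 5.0}, number=1_000_000)

# The int() conversion builds a new object on every call, which is why
# the cast-based check tends to be slower than float.is_integer().
print(method_version, cast_version)
```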

This example was trivial, of course, but enough to show you how one could profile an application. Knowing the number of times a function is called helps us better understand the time complexity of our algorithms. For example, you wouldn't believe how many coders fail to see that those two for loops run proportionally to the square of the input size.

One thing to mention: depending on what system you're using, results may be different. Therefore, it's quite important to be able to profile software on a system that is as close as possible to the one the software is deployed on, if not actually on that one.

When to profile?

Profiling is super cool, but we need to know when it is appropriate to do it, and to what extent we should act on the results we get from it.

Donald Knuth once said that premature optimization is the root of all evil and, although I wouldn't have put it so drastically, I do agree with him. After all, who am I to disagree with the man who gave us The Art of Computer Programming, TeX, and some of the coolest algorithms I studied as a university student?

So, first and foremost: correctness. You want your code to deliver correct results, so write tests, find edge cases, and stress your code in every way you think makes sense. Don't be dismissive; don't shelve things for later because you think they're not likely to happen. Be thorough.

Secondly, take care of coding best practices. Remember readability, extensibility, loose coupling, modularity, and design. Apply OOP principles: encapsulation, abstraction, single responsibility, open/closed, and so on. Read up on these concepts. They will open horizons for you, and they will expand the way you think about code.

Thirdly, refactor like a beast! The Boy Scout Rule says: Always leave the campground cleaner than you found it. Apply this rule to your code.

And, finally, when all of the above has been taken care of, then and only then, you take care of profiling.

Run your profiler and identify bottlenecks. When you have an idea of the bottlenecks you need to address, start with the worst one first. Sometimes, fixing a bottleneck causes a ripple effect that will expand and change the way the rest of the code works. Sometimes this is only a little, sometimes a bit more, according to how your code was designed and implemented. Therefore, start with the biggest issue first.

One of the reasons Python is so popular is that it is possible to implement it in many different ways. So, if you find yourself having trouble speeding up some part of your code using pure Python, nothing prevents you from rolling up your sleeves, buying a couple of hundred liters of coffee, and rewriting the slow piece of code in C. Guaranteed to be fun!
