Invoking Functions

Let's first look at what actually happens when a function call is made. Previous chapters have already hinted at function-call overhead, and now seems to be the ideal time to discover exactly what this so-called overhead is. Consider, therefore, the simple C/C++ function presented in Listing 8.1.

Code Listing 8.1. A Simple Add Function
int MyAddFunct(int a, int b, int c)
{
    return a + b + c;
}

void main(void)
{
    int a = 1, b = 2, c = 3;
    int res = MyAddFunct(a,b,c);
}

The function MyAddFunct in Listing 8.1 takes three integers as input and returns as a result the sum of the three. By compiling this source with the assembly output option turned on (see Chapter 4, "Tools and Languages"), you can look at what the processor actually has to do to execute this function. The assembly listings presented in this section were generated with Developer Studio for Windows, but other compilers will give similar results. A column of line numbers has been added to the output to facilitate referencing from the explanatory text. The first line(s) in each listing depict the original C/C++ statement. Listing 8.2 shows the assembly generated for the call to MyAddFunct().

Code Listing 8.2. Assembly for Calling MyAddFunct()
// MyAddFunct(a,b,c);

00   mov eax, DWORD PTR _c$[ebp] ; place value of 'c' in register eax.
01   push eax                    ; push register eax onto the stack.
02   mov ecx, DWORD PTR _b$[ebp] ; place value of 'b' in register ecx.
03   push ecx                    ; push register ecx onto the stack.
04   mov edx, DWORD PTR _a$[ebp] ; place value of 'a' in register edx.
05   push edx                    ; push register edx onto the stack.
06   call ?MyAddFunct@@YAHHHH@Z  ; call MyAddFunct()

As can be seen in Listing 8.2, a lot of work is done before the function is even called. In lines 00–05 the three variables that are input for the function (a, b, and c) are placed on top of the stack, so they become part of the function's local parameter set. Then, in line 06, the function is actually called. This also places the value of the program counter onto the stack. At any time, the program counter points to the memory address from which the processor is retrieving instructions. In calling the function, the program counter is set to point to the address of the first instruction of the function. Because its previous value resides safely on the stack, execution can continue with the instruction after the function call when the function has ended.
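The amount of work done in these pushes depends on what is being passed. As a rough, hypothetical illustration (this code is not part of Listing 8.1), passing a large structure by value forces the caller to copy the whole structure onto the stack, whereas passing a pointer or reference costs only the size of an address:

// Hypothetical example: how parameter passing affects the pushing work.
struct Order
{
    char customer[64];     // 64 bytes
    int  items[32];        // 128 bytes (assuming 4-byte ints)
};

int TotalByValue(Order o)              // the caller copies the whole struct
{                                      //   (roughly 192 bytes) onto the stack
    return o.items[0];
}

int TotalByReference(const Order &o)   // the caller pushes only an address
{                                      //   (4 bytes on a 32-bit x86 target)
    return o.items[0];
}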

Note that this part of the function call overhead (placing variables on the stack) will grow larger when more (or larger) parameters are passed to a function, and smaller when fewer (or smaller) parameters are passed. Now let's look at what happens inside the function. Listing 8.3 shows what happens on entry of the function.

Code Listing 8.3. Assembly for Entering MyAddFunct()
// int MyAddFunct(int a, int b, int c)
// {

07  push ebp            ; place the base pointer on the stack.
08  mov ebp, esp        ; new base pointer value.

The statements in Listing 8.3 are executed before the actual function body statements. So what exactly is the overhead that is incurred when entering a function? Basically it is more stack work. The base pointer (ebp), which tells the processor where to find local variables, is pushed onto the stack. Its value will be needed when the function ends and access is needed to the local variables of the calling context. After the old base pointer is placed onto the stack, the base pointer receives the value of the stack pointer. Now pointing at the top of the stack, the base pointer can be used to retrieve the function parameters (at positive offsets, as they were pushed before the call) and any local variables (at negative offsets). This is logical, as the base pointer now points at the stack frame belonging to this invocation of the function.

This part of the function call overhead is pretty standard. It basically saves the current context onto the stack so it can be restored when the function ends. The more registers that are used inside the function, the more registers have to be pushed onto the stack for safekeeping. This tells us two things. First, the function MyAddFunct obviously does not use many different registers, as only the base pointer needs saving here. Second, the more complicated a function is, the more overhead will be incurred on function entry. Figure 8.1 shows what the stack looks like at this point inside the function call.

Figure 8.1. Stack frame when inside a function call.
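In text form, and assuming the standard x86 stack usage shown in the listings above (the offsets are a sketch, not taken from actual compiler output), the frame looks roughly like this:

/* Sketch of the stack frame inside MyAddFunct(), higher addresses on top:

   ebp + 16  :  value of c           (pushed first by the caller, line 01)
   ebp + 12  :  value of b           (line 03)
   ebp +  8  :  value of a           (pushed last by the caller, line 05)
   ebp +  4  :  return address       (placed by the call in line 06)
   ebp +  0  :  saved ebp of caller  (placed by the push in line 07)      */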


Arriving at the instructions that form the function body, Listing 8.4 shows the assembly statements that make up the body of the MyAddFunct function.

Code Listing 8.4. Assembly for the Body of MyAddFunct()
// return a + b + c;

09  mov eax, DWORD PTR _a$[ebp] ; value of 'a' in register eax.
10  add eax, DWORD PTR _b$[ebp] ; add value of 'b' to eax.
11  add eax, DWORD PTR _c$[ebp] ; add value of 'c' to eax.

The body statement is represented by three assembly lines (09–11). The register eax, which is used for returning values from functions, is loaded with the value of the variable a, after which the values of b and c are added. It should be clear from this example exactly how parameters are passed to functions; for more information, see Chapter 4.

So is this it then? No, not exactly; there is still the exiting of the function and returning the result. Listing 8.5 shows what happens when MyAddFunct terminates.

Code Listing 8.5. Assembly for Exiting MyAddFunct()
// }

12    pop     ebp           ; return ebp to its original value.
13    ret     0             ; return from the function.

Basically this is the opposite of Listing 8.3; the saved register values are taken from the stack in reverse order (line 12). At this point all that remains on the stack are the three parameters and the program counter. Line 13 returns from the function; it effectively pops the program counter from the stack and causes execution to continue at the instruction right after the function call. Listing 8.6 shows how the returned result of function MyAddFunct is captured by the calling function.

Code Listing 8.6. Assembly for Capturing the Result of MyAddFunct()
// int res = ...

14  add esp, 12                    ; pop 3 variables * 4 bytes.
15  mov DWORD PTR _res$[ebp], eax  ; eax contains the function result.

The three variables were still on the stack, so the stack pointer is adjusted by 12 bytes (because the stack grows downward in memory, an add is used instead of a sub).

So all in all quite a lot of instructions had to be executed to make this calculation, which itself was only three instructions long.

When a function is used often, and it does not contain that many statements, it would be great if it were possible to somehow skip the overhead shown in this section, while still keeping the level of abstraction created by the function call. The rest of this section will discuss techniques to do just that, as well as some other useful techniques.

Macros

One way to avoid function-call overhead is by using macros. You can define a macro in both C and C++ with the #define keyword. Just as with any other use of #define, a macro basically tells the precompiler to substitute a specified sequence for an identifier:

#define TRUE 1

With the definition above, the precompiler is told to replace all occurrences of the word TRUE in the source with the value 1, before handing the source over to the compiler (for more information on the precompiler, see Chapter 4). With function macros we take things a little further, as we are able to use function-like arguments. Listing 8.7 shows a few much-used macro definitions.

Code Listing 8.7. Macro Examples
#define MAX(a,b)    ((a) < (b) ? (b) : (a))
#define MIN(a,b)    ((a) < (b) ? (a) : (b))
#define ABS(a)      ((a) < 0 ? -(a) : (a))

#define TSTBIT(a,b) (((a) & (1 << (b))) ? (true) : (false))
#define SETBIT(a,b) ((a) | (1 << (b)))
#define CLRBIT(a,b) ((a) & (~(1 << (b))))
#define FLIPBIT(a,b) (TSTBIT(a,b) ? (CLRBIT(a,b)) : (SETBIT(a,b)))

When you look at the MAX macro in Listing 8.7, you see that the identifier part—MAX(a,b)—contains the parameters a and b. These parameters are repeated in the replacement string. The precompiler will thus be able to make a parameterized substitution by taking whichever parameters it finds between the brackets and placing them in the appropriate spots within the substitute string. Listing 8.8 uses this macro and demonstrates immediately why macros are not always a good idea.

Code Listing 8.8. Unexpected Macro Results
#define MAX(a,b) ((a) < (b) ? (b) : (a))

void main(void)
{
        int  a = 10;
        char b = 12;

        short c = MAX(a,b);  // a = 10, b = 12, c = 12 (ok)
        c = MAX(a++,b++);    // a = 11, b = 14, c = 13 (not ok)
}

The first use of the macro in Listing 8.8, MAX(a,b), seems to work fine; it returns the larger of the two given parameters and places it in variable c. The second call, however, is expanded a little differently than the implementer might have intended:

MAX(a++,b++);   =>  ((a++) < (b++) ? (b++) : (a++))

The variables a and b are both incremented when the condition is evaluated; variable b is incremented once more when its value is returned. This demonstrates how cryptic the use of macros can be, particularly when you are using functions written by a third party. You might not even be aware that the function you are calling is in fact a macro and therefore has some peculiarities. Moreover, the two macro calls in the example mix variable types; MAX can return either a character or an integer, but the result is stored in a short, and the compiler does not complain about this.

It seems it is time to list the advantages and disadvantages of using macros:

The following is the advantage of using macros:

  • They can make code execute faster by eliminating function-call overhead.

The following are disadvantages of using macros:

  • Often macro descriptions are very cryptic and hard to read.

  • Use of macros increases footprint as macro bodies are expanded wherever they are used.

  • Finding compiler errors in macros can be a difficult task.

  • Debugging a source with macros can be problematic, because debuggers (like other development tools) do not always know exactly how to handle macros.

  • Macros do not facilitate type-checking.

Conclusion: Although the use of macros can certainly speed up code that uses small functions repetitiously, you should be wary of using them. Most often there are other ways to solve the problems. Use macros only if and when you really feel you have to.
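In C++ one such alternative is a small template function. The sketch below is not code from this chapter; it keeps the speed benefit (when combined with the inlining discussed next) while evaluating each argument only once and retaining type-checking:

// Hypothetical type-safe replacement for the MAX macro of Listing 8.7.
template <class T>
inline T Max(const T &a, const T &b)
{
    return (a < b) ? b : a;
}

// Usage: each argument expression is evaluated exactly once, so
// Max(a++, b++) increments a and b once each. Mixing an int with a
// char (as in Listing 8.8) now produces a compiler message, because
// T cannot be deduced unambiguously.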

Inline Functions

Another way of avoiding function call overhead is the use of inline functions. Inline functions are a C++ feature and therefore cannot be used in C. They operate under a principle similar to that of macros—namely that the function call is actually substituted by the function body during compile time. This can be a great timesaver for short functions which are called frequently. Think of class member functions for accessing private data; they appear very often in C++ sources but usually do nothing more than return a pointer or variable value. Every time such an access function is called, though, function-call overhead is incurred that is many times larger than the actual function body. An ideal situation for optimization it seems.

There are two ways of declaring inline functions: first, by adding the keyword inline to the function definition and, second, by placing the function body directly inside the class definition of a member function. Listing 8.9 shows the two methods of declaring inline functions.

Code Listing 8.9. Two Ways to Inline Functions
inline int MyAddFunct(int a, int b, int c)  // inlining a non class
                                            //   member function.
{
    return a + b + c;
}
class Client
{
private:
    char                *name;      // Private data.
    unsigned char       age;
    char                gender;

public:
    char *GetName() {  return name;}  // inlining access to name.
    int  GetAge() {  return age;}     // inlining access to age.
    char GetGender();
};

inline char Client::GetGender()     // inlining access to gender.
{
    return gender;
}

Let's look at the advantages and disadvantages of inlined functions:

Inlining offers the following advantages:

  • Faster function calls

  • Easy optimization as only a keyword has to be added, or a definition moved to a header file

  • Retains the benefit of type-checking for parameters

Inlining also offers the following disadvantage:

  • Increased footprint as the function body appears everywhere where the inlined function is called

Conclusion: Use inlining only for short functions that are called frequently or that are part of a time-critical piece of code.

Sadly, however, for practically all compilers the inlining of functions is merely a suggestion made by the programmer. This means that declaring a function inline is no hard guarantee that the compiler will in fact substitute function bodies for function calls. It is possible that some or all inlined functions are treated like normal function definitions. For more information on how inlining is implemented, consult the documentation provided with the compiler you use.
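If stronger control is wanted, most compilers offer non-standard keywords that turn the suggestion into a much firmer request. The wrapper below is an assumption based on common compiler extensions; verify the exact spellings against your compiler's documentation:

// Hypothetical portability wrapper around compiler-specific inlining keywords.
#if defined(_MSC_VER)
    #define FORCE_INLINE __forceinline                          // Microsoft compilers
#elif defined(__GNUC__)
    #define FORCE_INLINE inline __attribute__((always_inline))  // GCC and compatibles
#else
    #define FORCE_INLINE inline                                 // plain inline request
#endif

FORCE_INLINE int MyAddFunct(int a, int b, int c)
{
    return a + b + c;
}

Even these keywords can still be ignored in some situations (very large or recursive functions, certain debug settings), so they remain requests rather than absolute guarantees.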

Iteration Versus Recursion

A third way of avoiding function-call overhead is by simply expanding a function by hand. This way you have none of the dangers of macros, none of the uncertainty of inline functions, and most definitely all the work to do. The main part of this work is making sure the expanded functionality is correct every time, so this means no copy and paste mistakes! Still, sometimes it can be worth considering, especially where a function is called many times in the same place. This section examines a function expanded within an iterative process, compared to using the same functionality recursively. The thing to note here is that, although the expanded function is called many times, the expansion is only coded in one place in the source. This means no copy/paste dangers and no footprint issues.
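As a small sketch of what such a hand expansion can look like (hypothetical arrays, reusing MyAddFunct() from Listing 8.1):

// Hypothetical example of expanding a function by hand at one call site.
int SumWithCalls(const int *x, const int *y, const int *z, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += MyAddFunct(x[i], y[i], z[i]);   // call overhead on every iteration
    return total;
}

int SumExpanded(const int *x, const int *y, const int *z, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += x[i] + y[i] + z[i];             // body written out by hand, once
    return total;
}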

Iteration is simply repeating a certain functionality over and over again until some stop condition is met. Think, for example, of a for or while loop iterating over array elements.

Recursion basically comes down to using a function that at some point makes a call to itself. This second invocation can of course call itself also and so on. It's a bit like standing between two mirrors and looking at your reflection.

This section will use the mathematical factorial function—denoted by the ! sign in mathematics—as an example. The factorial for number n is defined as the product of all integers between 1 and n itself. For example

5! equals 5 × 4 × 3 × 2 × 1 = 120

4! equals 4 × 3 × 2 × 1 = 24

Looking at the example calculations it becomes apparent that 5! equals 5 * 4!, or, to put it more generically:

n! = n × (n-1)!

where n > 0 (and, by definition, 0! = 1)

Having defined n! in terms of simpler cases of itself, it is now possible to write a recursive implementation. Listing 8.10 shows what such an implementation can look like.

Code Listing 8.10. Recursive Example of n!
long FactRec(int n)
{
    if (n == 0)
        return 1;
    else
        return (n * FactRec(n-1));
}

Note that as multiplication is used, the result of a factorial grows rapidly. When the function FactRec() of Listing 8.10 is called with long result = FactRec(3); the function will keep calling itself, passing a smaller value of n each time.

FactRec(3) -> FactRec(2) -> FactRec(1) -> FactRec(0).

Eventually the final function call receives a value of n which is 0. This deepest FactRec() in the recursion returns the value 1. The FactRec() that called it, FactRec(1), receives the 1 and multiplies it by its own value of n, which is 1. The result is again returned to the calling FactRec() invocation, FactRec(2), which multiplies it by 2, and so on back up the chain of calls.
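Written out as a worked expansion, the whole call unwinds as follows:

FactRec(3) = 3 * FactRec(2)
           = 3 * (2 * FactRec(1))
           = 3 * (2 * (1 * FactRec(0)))
           = 3 * (2 * (1 * 1)) = 6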

Now let's look at how n! can be made iterative. Listing 8.11 shows an iterative version of n!.

Code Listing 8.11. Iterative Example of n!
long FactIter(int n)
{
    long prod = 1;

    while (n != 0)
    {
        prod *= n;
        n--;
    }
    return prod;
}

The iterative function in Listing 8.11 is a bit longer in lines of code than the recursive function in Listing 8.10, but perhaps more intuitive to read. When you run both examples through the time_fn() test as described in Chapter 5, "Measuring Time and Complexity," you uncover the following results:

Recursive 5
Iterative 2.2

These relative numbers tell you, even without more precise measurement or a look at the generated assembly, that the recursive function is a lot slower than the iterative function. This should come as no surprise, as you have seen in the beginning of this chapter that function calls, of which recursion uses a great many, cause considerable slowdown for relatively small functions. Moreover, the recursive function actually uses more runtime memory, as it repeatedly places function-call overhead on the stack. The factorial of 10 will in fact result in 11 calls to the FactRec() function. This touches on a major drawback of recursive functions: Their use is limited by the amount of stack that is available. This means predictions have to be made at implementation/design time; when using recursion you simply have to guarantee that the function will never be called with a parameter (or parameters) that will cause a stack overflow or an unacceptable system slowdown.
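If the time_fn() harness from Chapter 5 is not at hand, a rough stand-in can be built with the standard clock() function. The sketch below is an assumption of how such a comparison might be set up (the loop count and output format are illustrative, and the resulting ticks will differ per machine):

#include <stdio.h>
#include <time.h>

// Assumes FactRec() (Listing 8.10) and FactIter() (Listing 8.11) are in scope.
int main(void)
{
    const long loops = 1000000L;      // repeat enough times to get measurable ticks
    volatile long sink = 0;           // keeps the calls from being optimized away

    clock_t start = clock();
    for (long i = 0; i < loops; i++)
        sink += FactRec(10);
    clock_t recursiveTicks = clock() - start;

    start = clock();
    for (long i = 0; i < loops; i++)
        sink += FactIter(10);
    clock_t iterativeTicks = clock() - start;

    printf("Recursive: %ld ticks\nIterative: %ld ticks\n",
           (long)recursiveTicks, (long)iterativeTicks);
    return 0;
}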

Summarizing, the following rules of thumb can be defined for when to use recursion:

  • When a function can be defined in terms of itself and cannot easily be made iterative.

  • When repetitive actions have to be done on a varying set of data (like walking through subdirectories or traversing a tree—see the example in Chapter 11, "Storage Structures." )

  • When you can guarantee that the recursive call will never cause system interference.
