6.1. Programming recommendations

C and C++ are powerful high-level programming languages. Two applications can perform the same tasks yet be written in completely different ways, using the many different features of the languages. Programming style is personal, but certain programming practices, when followed, help you get the best results from the compiler's optimization techniques. This section offers tips and guidelines for writing programs that exploit the C for AIX and VisualAge C++ for AIX compilers.

6.1.1. Variables and data structures

In C and C++, the default storage duration, scope, and linkage of variables and objects depend on where they are declared. However, storage duration can be explicitly overridden with storage class specifiers. Whenever possible, it is recommended that you use local variables of the automatic storage class.

Several optimizations performed by the compiler rely on data flow analysis. For example, this analysis determines whether a store into a variable is redundant and can be removed because a later store overwrites it before the variable is referenced:

int func1()
{
    int x;
    x = 1;
    func();
    x = 2;
    return x;
}

In this example, the first assignment, x = 1, can be safely removed, since the second assignment overwrites it and the call to function func() cannot access variable x. Had variable x been declared as global, the compiler could not make the same assumption, because func() might then reference the global variable x.
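The global case can be illustrated with a minimal sketch. The names g_x, seen, observe(), and func1_global() are hypothetical and stand in for the variable x and the function func() of the example above. Because observe() reads the global, the first store is visible to it and is no longer redundant:

```c
int g_x;     /* global: any function in the program may reference it */
int seen;    /* records what observe() read; for illustration only */

/* Hypothetical stand-in for func(): it reads the global variable. */
void observe(void)
{
    seen = g_x;
}

int func1_global(void)
{
    g_x = 1;      /* not redundant: observe() reads this value */
    observe();
    g_x = 2;
    return g_x;
}
```

After a call to func1_global(), the variable seen holds 1, which shows that removing the first store would change the behavior of the program.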

Data flow analysis also helps in determining whether branches can be eliminated. For example:

void func2()
{
    int x;
    x = 1;
    func();
    if (x == 1)
        printf("true\n");
    else
        printf("false\n");
}

Again, since the function call to func() cannot possibly be updating the local variable x, the compiler can safely remove the conditional statement, and generate code for the true part of the statement only.

Using local, automatic storage variables is not always possible. If two functions within the same source file need to share data, it is recommended that you use variables with static storage. Static storage class variables and objects have internal linkage, that is, the variable is accessible within the compilation unit only, and not visible externally. The compiler in this case can use this information to better optimize your program, since static variables appear as local within the compilation unit.
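As a minimal sketch of this recommendation, the two functions below share a counter that is declared static at file scope. The names call_count, record_call(), and calls_so_far() are hypothetical:

```c
/* File-scope static: internal linkage, so only functions in this
   source file can reference the variable. */
static int call_count = 0;

void record_call(void)
{
    call_count++;
}

int calls_so_far(void)
{
    return call_count;
}
```

Since no other compilation unit can name call_count, the compiler can see every reference to it while compiling this file.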

If you must use external variables that are shared between more than one compilation unit, try to group the variables into structures to avoid excessive memory access.
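A sketch of such a grouping follows. The structure Dimensions, the variable dims, and the function volume() are hypothetical. The idea is that the members can be reached as offsets from the single address of dims, rather than through a separate address for each external variable:

```c
/* Instead of three separate external variables:
 *     int width; int height; int depth;
 * the values are grouped into one external structure. */
struct Dimensions {
    int width;
    int height;
    int depth;
};

struct Dimensions dims = { 2, 3, 4 };

int volume(void)
{
    /* All three members are addressed relative to dims. */
    return dims.width * dims.height * dims.depth;
}
```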

6.1.2. Functions

Without knowing what a function does, that is, if the function is not defined in the current compilation unit, the compiler has to assume the worst: that calling the function may have side effects. A side effect, according to the standard, is a change in the state of the execution environment. Several operations can produce a side effect:

  • Accessing a volatile object

  • Modifying an external object

  • Modifying a static object

  • Modifying a file

  • Calling a function that does any of the above

If a function requires input, it is recommended that you pass the input as arguments, rather than having the function access the input from global variables. If your function has no side effects, that is, it performs none of the operations listed above, you can use the #pragma isolated_call directive to tell the compiler that it may optimize calls to the function more aggressively. This generally improves execution performance. See C for AIX Compiler Reference, SC09-4960 for more details on the directive. Functions marked with the #pragma isolated_call directive are still allowed to modify storage pointed to by pointer arguments (that is, reference arguments).
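The following is a minimal sketch of a function that takes all of its input as arguments and is marked with the directive. The functions scale() and sum_scaled() are hypothetical, and the guard on the __IBMC__ macro is an assumption made here so that other compilers skip the IBM-specific pragma; check the Compiler Reference for the exact placement rules:

```c
/* Hypothetical function with no side effects: the result depends
   only on the arguments. */
int scale(int value, int factor);

#ifdef __IBMC__
/* IBM-specific directive, placed before any call to scale(). */
#pragma isolated_call(scale)
#endif

int scale(int value, int factor)
{
    return value * factor;
}

int sum_scaled(const int *data, int n, int factor)
{
    int total = 0;
    int i;

    for (i = 0; i < n; i++)
        total += scale(data[i], factor);   /* input passed as arguments */
    return total;
}
```

Because scale() reads no global variables, the call in the loop depends only on data[i] and factor, which is the property the directive asserts.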

If a function is only required in the current compilation unit, declaring it as static will speed up calls to the function.

In C++, use virtual functions only when absolutely necessary. The location of a virtual function is not known until execution time. The compiler generates an indirect call through an entry in a virtual function table, which is populated with the locations of the virtual functions at load time. To reduce code bloat, avoid declaring virtual functions inline; doing so tends to make the virtual function table bigger and causes the virtual function to be defined in every compilation unit that uses the class.

6.1.3. Pointers

The use of pointers causes uncertainty in data flow analysis, which can sometimes inhibit optimization performed by the compiler. If a pointer is assigned to the address of a variable, both the pointer and the variable can now be used to change or reference the same memory location in an expression. For example:

void func()
{
    int i = 55, a, b;
    int *p = &i;

    a = *p;
    i = 66;
    b = *p;
}

At first glance, it looks as if the expression *p does not need to be evaluated a second time, and that the variables a and b would have the same value. However, since p holds the address of i, *p and i refer to the same memory location; the assignment i = 66 invalidates the first evaluation of *p, so b = *p must be evaluated again.
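The effect can be observed with a small variant of the example. The function alias_demo() and its out-parameters are hypothetical additions that let a caller inspect the two values:

```c
/* Returns a and b through out-parameters so the caller can inspect them. */
void alias_demo(int *a_out, int *b_out)
{
    int i = 55;
    int *p = &i;

    *a_out = *p;   /* first evaluation of *p reads 55 */
    i = 66;        /* changes the storage that p points to */
    *b_out = *p;   /* must be evaluated again: reads 66 */
}
```

After the call, the two results differ (55 and 66), so reusing the first evaluation of *p for the second assignment would be incorrect.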

When the compiler knows nothing about where a pointer can point to, it will have to make the pessimistic assumption that the pointer can point to any variable.[1]

[1] In the standard conforming language levels, e.g. -qlanglvl=stdc89 and -qlanglvl=stdc99, or when the -qalias=ansi option is in effect, a pointer can only point to variables of the same type. This is referred to as type-based aliasing. The only exception is with character pointers and pointers to void; they can point to any type. The use of -qalias=noansi to correct programming mistakes is not recommended, as it inhibits optimization opportunities and degrades execution performance.

For example:

int rc;

void foo(int *p)
{
    rc = 55;
    *p = *p + 1;
    rc = 66;
}

In this case, the compiler has to assume that when the function foo is called, pointer p can potentially point to the variable rc. The first assignment, rc = 55, is relevant and cannot be safely removed, since *p on the following line can be referring to rc.

If p is guaranteed not to be pointing to rc, you can use the #pragma disjoint directive to inform the compiler:

#pragma disjoint(*p, rc)

Marking variables that do not share physical memory storage with the #pragma disjoint directive allows the compiler to explore more optimization opportunities, and the performance gain can be quite substantial with complex applications. However, if the directive is used inappropriately on variables that may share physical storage, the program may yield incorrect results.
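Applied to the function foo above, the directive might be used as in the following sketch. Placing the directive inside the function body, where the parameter p is in scope, and guarding it with the __IBMC__ macro so that other compilers skip it, are assumptions made for this illustration:

```c
int rc;

void foo(int *p)
{
#ifdef __IBMC__
    /* IBM-specific directive. It is only valid if callers
       never pass the address of rc. */
#pragma disjoint(*p, rc)
#endif
    rc = 55;        /* with the directive, this store can be treated as redundant */
    *p = *p + 1;
    rc = 66;
}
```

A call such as foo(&n), where n is an int other than rc, honors the promise made by the directive. A call such as foo(&rc) breaks it, and the program may then yield incorrect results.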

6.1.4. Arithmetic operations

Multiplication in general is faster than division. If you need to perform many divisions with the same divisor, it is recommended that you assign the reciprocal of the divisor to a temporary variable, and change all your divisions to multiplications with the temporary variable. For example, instead of:

double preTax(double total)
{
    return total / 1.0825;
}

this will perform faster:

double preTax(double total)
{
    return total * (1.0 / 1.0825);
}

The division (1.0 / 1.0825) will be evaluated once at compile time only due to constant folding.
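When the divisor is only known at run time, constant folding does not apply, and the reciprocal can be kept in a temporary variable as recommended above. The following is a minimal sketch; the function scale_down() and its parameters are hypothetical:

```c
#include <stddef.h>

/* Divides every element by the same run-time divisor.
   The reciprocal is computed once, outside the loop. */
void scale_down(double *values, size_t n, double divisor)
{
    double reciprocal = 1.0 / divisor;   /* one division */
    size_t i;

    for (i = 0; i < n; i++)
        values[i] *= reciprocal;         /* n multiplications */
}
```

Note that multiplying by a rounded reciprocal can differ from a true division in the last bits of the result, so this transformation is only appropriate when that difference is acceptable.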

6.1.5. Selection and iteration statements

When using the if selection statement, order your conditional expressions efficiently by putting the most decisive test first. For example:

struct {
    char *name;
    short len;
    _Bool active;
} rec;
...
if (rec.active == true && rec.len == 9 && !memcmp(rec.name, "Elizabeth", 9))
...

Also, order your case statements or if-else if conditions in a way that the most likely occurring conditions appear first. For example:

typedef enum { VOWEL, CONSONANT, DIGIT, PUNCTUATION } _Type;
struct {
    char ch;
    _Type type;
} word;
...
switch (word.type) {
    case VOWEL:
        ...
        break;
    case CONSONANT:
        ...
        break;
    case PUNCTUATION:
        ...
        break;
    case DIGIT:
        ...
        break;
}

and:

if (!error) {
    /* most likely condition first */
} else {
    /* error condition that does not happen too often */
}

When using iteration statements such as for, do, and while loops, move invariant expressions (expressions that neither change between loop iterations nor depend on the loop index of a for loop) out of the loop body.
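As a minimal sketch, the function below evaluates the invariant expression strlen(text) once before the loop instead of in the loop condition. The function count_char() is hypothetical:

```c
#include <stddef.h>
#include <string.h>

/* Counts the occurrences of ch in text. */
size_t count_char(const char *text, char ch)
{
    size_t len = strlen(text);   /* invariant: evaluated once, before the loop */
    size_t count = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        if (text[i] == ch)
            count++;
    }
    return count;
}
```

Writing the loop condition as i < strlen(text) would ask for the length on every iteration, even though the string does not change inside the loop.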

6.1.6. Expression

The C and C++ compilers are able to recognize common sub-expressions, when the sub-expression either:

  • Appears at the left end of the expression

  • Appears within parentheses

For example, the compiler recognizes the sub-expression a + b in the two assignments:

x = a + b + c;
y = d * (a + b);

and evaluates the sub-expression only once. Essentially, it is logically transforming the two lines into:

temp = a + b;
x = temp + c;
y = d * temp;

The use of the temp illustrates how the compiler can keep the result of the sub-expression evaluation in a register, and simply reuse the register for both assignments.

6.1.7. Memory usage

Avoid fragmenting heap storage through frequent allocations of small, temporary objects with memory allocation functions such as malloc() and calloc(). For example:

while (list) {
    char *temp = (char*)malloc(list->len+1);
    strcpy(temp, list->element);
    PrettyPrint(temp);
    free(temp);
    list = list->next;
}

In the example above, if the size of the longest element is known beforehand, you can replace the call to malloc() and obtain storage from the stack instead by using an array of characters:

char temp[MAX_ELEMENT_SIZE];

If the size is unknown, you can use the C99 variable length array feature (see “Variable length arrays” on page 7) to allocate storage dynamically for the array:

char temp[list->len+1];

In either case, the storage is obtained from the stack and not from the heap, and it is freed automatically when the scope of the variable terminates.
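The fixed-size variant of the loop might look like the following sketch. The node layout, the value of MAX_ELEMENT_SIZE, the counter printed_chars, and the body of PrettyPrint() are all hypothetical; the length check is added here so that an element longer than the assumed maximum cannot overflow the buffer:

```c
#include <string.h>

#define MAX_ELEMENT_SIZE 64   /* assumed upper bound, including the '\0' */

struct node {                 /* hypothetical list layout */
    const char *element;
    size_t len;
    struct node *next;
};

static size_t printed_chars;  /* for illustration only */

/* Hypothetical stand-in for PrettyPrint(). */
static void PrettyPrint(const char *s)
{
    printed_chars += strlen(s);
}

void print_all(const struct node *list)
{
    char temp[MAX_ELEMENT_SIZE];   /* one stack buffer, reused on every iteration */

    while (list) {
        if (list->len < MAX_ELEMENT_SIZE) {   /* skip elements that do not fit */
            memcpy(temp, list->element, list->len);
            temp[list->len] = '\0';
            PrettyPrint(temp);
        }
        list = list->next;
    }
}
```

No call to malloc() or free() remains in the loop, so the heap is not touched while the list is traversed.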

6.1.8. Built-in functions

For performance reasons, a large number of the library functions are also provided as compiler built-ins. A compiler built-in avoids any overhead associated with function calls (for example, parameter passing, stack allocation and adjustment, and so on) by expanding the function body at the call site. Various hardware instruction built-ins are also provided to allow programmers direct access to hardware instructions.

To use the built-in version of library functions, you need to include the appropriate library header files. Including the proper header files also prevents parameter type mismatch and ensures optimal performance.
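As a small sketch, the functions below include the headers that declare the library functions they call. The functions copy_name() and distance() are hypothetical, and whether a particular call is expanded as a built-in depends on the compiler and the options in effect:

```c
#include <stdlib.h>   /* abs */
#include <string.h>   /* memcpy, strlen */

/* Copies src into dst, which has room for dst_size characters.
   Returns the copied length, or 0 if src does not fit. */
size_t copy_name(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);

    if (len + 1 > dst_size)
        return 0;
    memcpy(dst, src, len + 1);   /* copies the terminating '\0' as well */
    return len;
}

int distance(int a, int b)
{
    return abs(a - b);
}
```

With the prototypes in scope, the compiler can check the argument types of each call against the declarations in the headers.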

6.1.9. Virtual functions

In general, when writing C++ code, you should try to avoid the use of virtual functions. They are normally encoded as indirect function calls, which are slower than direct function calls.

Usually, you should not declare virtual functions inline. In most cases, declaring virtual functions inline does not produce any performance advantages. Consider the following code sample:

class Base {
public:
    virtual void foo() { /* do something. */ }
};

class Derived: public Base {
public:
    virtual void foo() { /* do something else. */ }
};

int main(int argc, char *argv[])
{
    Base* b = new Derived();
    b->foo(); // not inlined.
}

In this example, b->foo() is not inlined because it is not known until run time which version of foo() must be called. This is by far the most common case.

There are cases, however, where you might benefit from inlining virtual functions. For example, if Base::foo() were a really hot function, it would be inlined in the following code, because the type of the object b is known at compile time:

int main(int argc, char *argv[])
{
    Base b;
    b.foo();
}

If there is a non-inline virtual function in the class, the compiler generates the virtual function table in the first file that provides an implementation for a virtual function. However, if all virtual functions in a class are inline, the virtual function table and virtual function bodies are replicated in each compilation unit that uses the class. The disadvantage is that the virtual function table and function bodies are created with internal linkage in each compilation unit. This means that even if the -qfuncsect option is used, the linker cannot remove the duplicated tables and function bodies from the final executable, which can result in a very bloated executable.
