Little-known facts that can make a programmer's job easier are the focus of these sections. Some of them help determine system and development environment characteristics, and they are reused in the second half of the chapter, in which the compatibility and portability of sources are discussed.
When mixing C and C++, you may at some point want to determine at runtime what kind of compiler was used. This is because C and C++ compilers generate subtly different code (think of calling conventions for functions, sizes of variable types, and so on). Listing 14.1 shows how you can determine at runtime whether a source was compiled with a C compiler or a C++ compiler. It does this by looking at the size of a character constant. In C++ the size of a character constant ('A') equals that of the type char, which is what you would expect. In C, however, the size of a character constant equals that of the type int.
inline int cplusplus()
{
    return (sizeof(char) == sizeof('A'));
}

if (cplusplus())
    printf("Compiled with a C++ compiler");
else
    printf("Compiled with a C compiler");
Determining the kind of compiler can, of course, also be done at compile time, which is preferable in most cases as this only costs compile time (no function needs to be executed at runtime to make this assessment). Listing 14.2 shows how you can determine at compile time whether a C or C++ compiler is used.
#ifdef __cplusplus
    printf("Compiled with a C++ compiler");
#else
    printf("Compiled with a C compiler");
#endif
The reason this works is that C++ compilers define the macro __cplusplus automatically. For more differences between C and C++, refer to the section Compatibility later in this chapter.
As explained in Chapter 10, "Blocks of Data," in the section on Radix sort, endianness is the byte order used by a system (storing the hexadecimal long word 0xAABBCCDD in memory in the byte order 0xAABBCCDD or 0xDDCCBBAA). Knowing the byte order of a system becomes important, for instance, when sources will be used on different platforms (the development platform is perhaps different from the target platform; refer to Chapter 1, "Optimizing: What Is It All About?" ), or when they contain code that needs to communicate over networks and so on. Listing 14.3 shows how you can test the endianness of a platform by simply storing the value 1 (0x00000001) in a long word and checking in which byte the bit is set.
Although part of the standard library, the use of a variable number of arguments in user-defined functions is not very well known. By placing ... as the last argument in a function definition, you signal the compiler that you intend to use a variable number of arguments. This means that you can pass any number of additional arguments after the last named argument. For instance, a function defined as:
int UseMe(int a, ...);
can be called as:
int result = UseMe(1);
but also as:
int result = UseMe(1,2,3,4,5,6,7,8,15);
What happens during the call to such a function is that the arguments are placed onto the function stack. Within the function, the arguments can be read from the stack one by one. Note that the language itself provides no terminator; the caller and callee must agree on a convention for determining where the list ends (a count argument, a format string, or a sentinel value such as NULL). Table 14.1 shows the statements that can be used with variable argument lists.
va_list | A typedef used for creating an index in the argument list. |
va_start | A macro that initializes the index with the first argument in the list. |
va_arg | A macro that retrieves the next argument from the list. |
va_end | A macro that cleans up the index after traversal of the list is done. |
Refer to your compiler and language documentation for more details on these variable argument macros. Listing 14.4 shows an implementation of a function that uses a variable number of arguments to receive a list of filenames that it should open. Note that this function also receives two normal arguments.
The first argument of the function OpenFileArray in Listing 14.4 is a pointer to an array of FILE pointers. Through it, the function returns an array filled with the pointers to the files it has opened. The second argument is a string that defines the mode in which the files are opened; this is the same argument that fopen expects as its second argument. Listing 14.5 shows how you could use the OpenFileArray function.
int main(void)
{
    FILE **array;

    OpenFileArray(&array, "r", "name1.txt", "name2.txt", "name3.txt");

    // Close the files.
    int i = 0;
    while (array[i] != NULL)
    {
        fclose(array[i++]);
    }
    delete [] array;
    return 0;
}
In order to make optimizations, it is often necessary to perform checks to determine which path of execution is most optimal in a given situation. These checks in themselves are extra overhead and you always have to think carefully about whether added overhead will be outweighed by the optimization you are trying to make. Sometimes, however, the overhead can be optimized out of existence almost entirely. Listing 14.6 shows a piece of pseudocode that contains two functions that calculate something with floating point values. One of the functions is optimized to work with a floating point processor, and the other is optimized to work without a floating point processor. Whenever a floating point calculation is needed, a check is done to determine whether a floating point processor is present in the system and the most optimal calculation function is called accordingly.
When many floating point calculations are made (perhaps you need millions per second), it is a shame that this check is performed as extra overhead for each calculation. Who knows how time-consuming the check FLOATINGPOINTPROCESSOR actually is? That is why it helps performance when this check only needs to be executed once. Listing 14.7 shows how this can be achieved.
// Define a pointer to Calc.
void (*Calc)(void*);

// Set the value of the pointer.
void init(void)
{
    if (FLOATINGPOINTPROCESSOR)
        Calc = &FPCalc;
    else
        Calc = &CalcWithoutFP;
}
An init() function needs to be called once, before the first calculation. After that a pointer is initialized to point to the most optimal calculation function. Anywhere in the program, a floating point calculation can be performed by simply calling:
Calc(data);
This kind of optimization should only be used for decisions with a certain dynamic nature, where the information on which the decision is based becomes available during program start-up. If the presence of a floating point processor could somehow be determined at compile time, a better solution would be to use precompiler directives as shown in Listing 14.2.
A typo often made by even the most experienced programmers is placing an if (a = 1) statement where an if (a == 1) is meant. Instead of checking whether the value of variable a equals 1, the first statement will assign the value 1 to a, and the if will always be true. These kinds of bugs can be very hard to find but are easily avoided by simply adopting a different coding style. When you train yourself to turn around the elements of which the expressions are made up, the compiler will 'warn' you when you make this kind of typo.
if (1 == a)
if (NULL == fp)
By placing the constant first, the compiler will complain about you trying to assign a value to a constant as soon as you forget the second =.
Because of the way compilers align data types in structures, the size of a structure is determined partly by the order in which the fields are defined (for more detail on alignment see Chapter 6, "The Standard C/C++ Variables" ). The following two structures demonstrate this:
typedef unsigned char byte;

struct large        // 20-byte structure.
{
    byte v1;
    int  v2;
    byte v3;
    int  v4;
    byte v5;
};
The structure large contains three bytes and two integers. Its size, however, is 20 bytes! This is because, on a typical system with four-byte ints, the integers are aligned at four-byte boundaries. By grouping the byte variables together, you can reclaim this wasted alignment space.
struct small        // 12-byte structure.
{
    byte v1;
    byte v3;
    byte v5;
    int  v4;
    int  v2;
};
Structure small can hold exactly the same information as large, but it is only 12 bytes in size. This is because fields v1, v3, and v5 share the same longword. This kind of optimization can save a lot of runtime memory when a structure is used for a large number of data elements.
ANSI C specifies a number of macros that ANSI-compliant compilers must support. As was explained in Chapter 4, "Tools and Languages," macros are expanded at compile time, which means that they can make compile-time information available at runtime. Table 14.2 shows the predefined macros.
__DATE__ | Expanded to contain the date on which the source file was compiled. |
__TIME__ | Expanded to contain the time at which the source file was compiled. |
__TIMESTAMP__ | Expanded to contain the date and time at which the source file was compiled (a widely supported compiler extension, not required by ANSI). |
__FILE__ | Expanded to contain the name of the source file which is compiled. |
__LINE__ | Expanded to contain the line number at which the macro is set (integer). |
__STDC__ | Expanded to contain the value 1 when the compiler complies fully with the ANSI-C standard (integer). |
Note that the ANSI predefined macros use a double underscore! All macros are expanded to character arrays except __LINE__ and __STDC__, which are integer values. Listing 14.8 shows how the macros __DATE__ and __TIME__ can be used to do a kind of versioning of a C++ class.
In Listing 14.8, two character arrays are defined along with the class definition in the header file. When the class is compiled, these character arrays receive the date and time of compilation. By calling the SourceInfo() function of the class, other parts of the program can ask the class when it was compiled.
// In any source file.
void funct()
{
    Any aclass;
    aclass.SourceInfo();
}
Because the source files of a project can be compiled at different times (think of recompiling a project in which not all files were changed), different files will often have a different compilation date and time.
Listing 14.9 shows how you can use the __FILE__ and __LINE__ macros to facilitate debugging and logging. The function LogError() receives a string and an integer and logs these. By calling this function with the __FILE__ and __LINE__ macros as arguments, LogError() will log exactly in which source file and at which line the logging function was called.
For the use of static values, it can sometimes be helpful to let the compiler generate them instead of calculating them yourself. This can save time because you do not have to recalculate the static values when they need to be updated (when their base or meaning changes, for instance). Listing 14.10 shows how the compiler can be abused to generate bit count information, as was needed in Chapter 13, "Optimizing Your Code Further."
When a certain amount of memory is needed in a function or a class, it pays to think carefully about whether this should be stack or heap memory; for more details on stacks and the heap refer to Chapters 8, "Functions," and 9, "Efficient Memory Management." Even when the amount of memory is static (when the number of bytes needed is known at compile time), it can still be a good idea to allocate it on the heap. Listing 14.11 shows two classes: the StackSpace class, which allocates a chunk of stack memory, and the HeapSpace class, which allocates a chunk of heap memory.
Upon creation in function f(), both StackSpace and HeapSpace immediately allocate memory for a thousand instances of the WData structure. StackSpace does this by claiming stack memory, as the WData array is part of the class and the class is created locally. HeapSpace does this by claiming memory from the heap by calling new. When you examine the placement of variables on the stack by function f(), for instance by looking at the generated assembly (refer to Chapter 4), you will see that 4 bytes of stack space are reserved for variable a, 1600 bytes for variable sd, 4 bytes for variable b, and 4 bytes for variable hd. This has a number of consequences: using stack space will be faster than using heap space, but when significant amounts of memory are used, through recursive calls for instance, using stack space can become problematic. This means that the design should specify what kind of usage is expected and what the required response times are. From this, the most favorable implementation can be determined.