6.2. Diagnosing compile-time errors

During compilation, the compiler emits diagnostic messages to the standard error device file (the terminal by default) whenever it encounters programming errors caused by invalid syntax or incorrect usage of features. Various compiler diagnostic aids can help you determine where the error occurred.

6.2.1. Anatomy of a message

A compiler error message has the following format, which provides enough information about the location and reason for the error:

"file", line line.column: 15cc-nnn (sev) msg

Where:

file      The name of the source file where the error occurred.
line      The line in the source file where the error occurred.
column    The column on the line where the error occurred.
cc        A two-digit code identifying the compiler component that issued the diagnostic message:
            00  Optimizer or code generator
            01  Compiler services
            05  C compiler
            06  C compiler
            40  C++ compiler
            47  Munch utility
            86  Interprocedural analysis (IPA)
nnn       The message number.
sev       The severity of the error:
            I  Informational
            W  Warning
            E  Error
            S  Severe error
            U  Unrecoverable error
msg       A short message describing the error.

6.2.2. Useful options and compiler aids

The C and C++ compilers offer several options that help you detect and correct programming errors in your program.

The -qsrcmsg option

By default, compiler diagnostic messages are issued in the format described in 6.2.1, “Anatomy of a message” on page 227. Sometimes the short explanation in the message text does not make the problem apparent. The C compiler -qsrcmsg option additionally prints, to stderr, the source line containing the error and a finger line that points to where the compiler thinks the error is. For example:

        7 |         char new[n];
            .................a..
a - 1506-195 (S) Integral constant expression with a value greater than zero is
required.

The source line shown will be after macro expansion.

Compiler listing

You can use the -qsource option to ask for a compiler listing. A compiler listing contains several sections that are useful in determining what has gone wrong in a compilation. Diagnostic messages, if present, are written to the compiler listing as well. See Example F-1 on page 490 for a sample compiler listing.

  • SOURCE SECTION

    Shows the source code with line numbers. Lines containing macros will have additional lines showing the macro expansion. By default, this section only lists the main source file. Use the -qshowinc option to expand all header files as well.

  • OPTIONS SECTION

    Shows any non-default options that were in effect for the compilation. To list all options in effect, specify the -qlistopt option.

  • FILE TABLE SECTION

    Lists all the files used in the compilation. Each file is associated with a file number, and the list always begins with the main source file, which has file number 0. For each file, it shows from which file and line the file was included. If the -qshowinc option is also in effect, each source line in the SOURCE SECTION will have a file number to indicate which file the line came from.

  • COMPILATION EPILOGUE SECTION

    Shows a summary of the diagnostic messages by severity level, the number of lines read, and whether the compilation was successful or not.

  • ATTRIBUTE AND CROSS REFERENCE SECTION

    The -qattr and -qxref options will cause this section to be produced. Independently, they provide information on all identifiers used in the compilation. This is where you will find pertinent information about variables, such as their type, storage duration, scope, and where they are defined and referenced.

  • OBJECT SECTION

    The -qlist option will cause this section to be produced. It shows the pseudo assembly code generated by the compiler. This section is invaluable for diagnosing execution time problems if you suspect the program is not performing as expected because of a code generation error.

The -qinfo option

The -qinfo option causes the compiler to produce additional informational messages for possible programming errors. The extra diagnostic messages can help you in debugging your program.

Messages related to the same problem area are grouped together, and each group is controlled via a suboption. For instance, the -qinfo=ini suboption diagnoses possible problems with variables that are not explicitly initialized but should be. For a list of the groups (suboptions) supported, refer to the VisualAge C++ for AIX Compiler Reference, SC09-4959.

One new info group introduced in the Version 6 C compiler is the -qinfo=c99 suboption. It diagnoses C code that may have different behavior between the -qlanglvl=stdc89 and -qlanglvl=stdc99 language levels. For example:

$ cat test.c
#include <stdio.h>
int main()
{
    printf("sizeof(2147483648) = %d\n", sizeof(2147483648));
    return 0;
}
$ cc -qinfo=c99 -c test.c
"test.c", line 4.48: 1506-786 (I) Integral constant "2147483648" has an implied
type of unsigned long int under the C89 language level. It has an implied type
of long long int under C99.

The -qsuppress option

When you use diagnostic aids like the -qinfo option, you can sometimes get numerous messages that you may not be interested in at the moment, cluttering the terminal or listing file. Use the -qsuppress option to stop those messages from being emitted by the compiler. You can suppress more than one message by listing the message numbers in a colon-separated list.

The -qflag option

In some instances, you may want to ignore all messages below a certain severity level. For example, during a production build, warning messages that development teams have deemed harmless will only clutter log files and cause unnecessary concern for the build team. You can use the -qflag option to stop diagnostic messages below a given severity level from being emitted to the terminal and the listing file. Similarly, the -w compiler flag suppresses informational and warning messages, and is the same as specifying the -qflag=e:e option. See 6.2.1, “Anatomy of a message” on page 227 for a list of single letter severity codes that you can use with the -qflag option.

The -qhaltonmsg option

Conversely, with the C++ compiler, if you want to stop the compilation when a certain message is encountered, use the -qhaltonmsg option. The message is issued with a higher severity level and compilation terminates.

The -qhalt and -qmaxerr options

To stop a compilation in general when a message of a certain severity level or higher is encountered, use the -qhalt option. See 6.2.1, “Anatomy of a message” on page 227 for a list of single letter severity codes that you can use with the -qhalt option.

The -qmaxerr option stops the compilation once the number of messages of a certain severity level or higher reaches the specified limit. This option takes precedence over the -qhalt option.

6.2.3. Migrating from 32-bit to 64-bit

When migrating a 32-bit program to 64-bit, the data model differences (as described briefly in 2.1, “32- and 64-bit development environments” on page 38) may result in unexpected behavior at execution time. In 64-bit mode, pointers and the long data type are 8 bytes in size, which can lead to several conversion or truncation problems. The -qwarn64 option can be used to detect these portability errors.

int and long types

In 32-bit mode, both int and long data types are 32 bits in size. Because of this similarity, these types may have been used interchangeably. As shown in Table 2-1 on page 39, the data type long is 64 bits in length in 64-bit mode. A general guideline is to review the existing use of long data types throughout the source code. If the values to be held in such variables, fields, and parameters will fit in the range of [-2^31 ... 2^31-1] or [0 ... 2^32-1], then it is probably best to use int or unsigned int instead. Also, review the use of the size_t type (used in many subroutines), since it is a typedef for unsigned long.

long to int truncation

Truncation problems can occur when converting between 64-bit and 32-bit data objects. Since int and long are 32 bits in 32-bit mode, a mixed assignment or conversion between these data types did not represent any problem. It does, however, in 64-bit mode, as long is larger in size than int. When converting from long to int, either implicitly or explicitly through a cast, truncation may now occur:

void foo (long l)
{
    int i = l;
}

Without an explicit cast, the compiler is unable to determine whether the narrowing assignment is intended. If the value l is always within the range representable by an int, or if the truncation is intended by design, use a cast to silence the -qwarn64 message that you will receive for this code.

Unexpected result due to conversion to long

Due to the difference in size for int and long in 64-bit mode, conversions to long from other integral types may result in different execution behavior from 32-bit mode in some boundary cases.

When a signed char, signed short, or signed int is converted to unsigned long, sign extension may result in a different unsigned value in 64-bit mode. For example:

#include <stdio.h>
void foo (int i)
{
    unsigned long l = i;
    printf ("%lu (0x%lx)\n", l, l);
}
int main()
{
    foo(-1);
    return 0;
}

This program will yield 4294967295 (0xffffffff) in 32-bit mode but 18446744073709551615 (0xffffffffffffffff) in 64-bit mode due to sign extension.

When an unsigned int variable with values greater than INT_MAX is converted to signed long, you will get different results between 32-bit and 64-bit mode. For example:

#include <stdio.h>
#include <limits.h>
void foo (unsigned int i)
{
    long l = i;
    printf ("%ld (0x%lx)\n", l, l);
}
int main()
{
    foo (INT_MAX + 1u);  /* unsigned arithmetic avoids signed overflow */
    return 0;
}

In 32-bit mode, the value INT_MAX+1 cannot be represented by a 4-byte signed long, so it wraps around and yields -2147483648 (0x80000000). The same value can be represented in 64-bit mode by an 8-byte signed long and will result in the correct value of 2147483648 (0x80000000).

When a signed long long variable with values greater than UINT_MAX or less than 0 is converted to unsigned long, truncation will no longer occur in 64-bit mode. For example:

#include <stdio.h>
#include <limits.h>
void foo (signed long long ll)
{
    unsigned long l = ll;
    printf("%lu (0x%lx)\n", l, l);
}
int main()
{
    foo(-1);
    foo(UINT_MAX + 1ll);
    return 0;
}

This program will yield:

4294967295 (0xffffffff)
0 (0x0)

in 32-bit mode and in 64-bit mode:

18446744073709551615 (0xffffffffffffffff)
4294967296 (0x100000000)

When an unsigned long long variable with values greater than UINT_MAX is converted to unsigned long, truncation will no longer occur in 64-bit mode:

#include <stdio.h>
#include <limits.h>
void foo(unsigned long long ll)
{
    unsigned long l = ll;
    printf("%lu (0x%lx)\n", l, l);
}
int main()
{
    foo(UINT_MAX + 1ull);
    return 0;
}

The higher order word is truncated and will result in 0 (0x0) in 32-bit mode, but yield the correct result of 4294967296 (0x100000000) without truncation in 64-bit mode.

When a signed long long variable with values less than INT_MIN or greater than INT_MAX is converted to signed long, truncation will no longer occur in 64-bit mode. For example:

#include <stdio.h>
#include <limits.h>
void foo (signed long long ll)
{
    signed long l = ll;
    printf("%ld (0x%lx)\n", l, l);
}
int main()
{
    foo(INT_MIN - 1ll);
    foo(INT_MAX + 1ll);
    return 0;
}

This program will yield (in 32-bit):

2147483647 (0x7fffffff)
-2147483648 (0x80000000)

and in 64-bit mode:

-2147483649 (0xffffffff7fffffff)
2147483648 (0x80000000)

And finally, when an unsigned long long variable with values greater than INT_MAX is converted to signed long, truncation will no longer occur in 64-bit mode. For example:

#include <stdio.h>
#include <limits.h>
void foo(unsigned long long ll)
{
    signed long l = ll;
    printf("%ld (0x%lx)\n", l, l);
}
int main()
{
    foo(INT_MAX + 1ull);
    return 0;
}

In 32-bit mode, the value INT_MAX+1ull will wrap around and yield -2147483648 (0x80000000). The same value can be represented in 64-bit mode by an 8-byte signed long and will result in the correct value of 2147483648 (0x80000000).

Pointer assignment and arithmetic

When migrating a program from 32-bit environment to 64-bit environment, it is crucial to avoid pointer corruption. Some of the possible problems are:

  • Assigning an int (32 bits) or a 32-bit hexadecimal constant to a pointer type variable (64 bits) or casting a pointer to an int will yield an invalid address, and will cause errors when the pointer is dereferenced. Also, the comparison of an int to a pointer may cause unexpected results.

  • Converting pointers to int or unsigned int with the expectation that the pointer value will be preserved. Casting a pointer to an int results in data truncation.

  • Without proper function prototypes, functions that return pointers will return truncated return values, as the functions are implicitly declared to return an int that is just 32 bits, instead of the expected 64 bits of a pointer.

  • Assuming that pointers and int are the same size in an arithmetic context. Pointer arithmetic is a common source of problems in migration. The ISO C standard dictates that incrementing a pointer adds the size of the data type to which it points to the pointer value. For example, if the variable p is a pointer to long, then the operation (p+1) increments the value of p by 4 bytes in 32-bit mode but by 8 bytes in 64-bit mode. Therefore, casts between long* and int* are problematic because of the size difference of the objects pointed to (32 bits versus 64 bits).

Incorrect pointer to int and int to pointer conversions

When a pointer is explicitly converted to an int, truncation of the high order word occurs. When an int is explicitly converted to a pointer, the pointer may not be correct and may cause invalid memory access if dereferenced. For example:

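#include <stdio.h>
#include <stdlib.h>
int main()
{
    int i, *p, *q;

    p = (int*)malloc(sizeof(int));
    i = (int)p;      /* high-order word is lost in 64-bit mode */
    q = (int*)i;     /* q may not be a valid address in 64-bit mode */
    p[0] = 55;

    printf("p = %p q = %p\n", (void*)p, (void*)q);
    printf("p[0] = %d q[0] = %d\n", p[0], q[0]);
    return 0;
}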

In 32-bit mode, the pointers p and q are pointing to the same memory location. However, the pointer q is likely pointing to an invalid address in 64-bit mode, and could result in a segmentation fault when q is dereferenced.

Integer constants

A loss of data can occur in some constant expressions because of lack of precision. These types of problems are very hard to find and may have gone unnoticed so far. You should therefore be very explicit about specifying the types in your constant expressions and use the constant suffixes {u,U,l,L,ll,LL} to specify the exact type of each constant. You might also use casts to specify the type of a constant expression.

This is especially true when migrating to 64-bit, because integer constants may have different types when compiled in 64-bit mode. The ISO C standard states that the type of an integer constant, depending on its format and suffix, is the first (that is, smallest) type in the corresponding list (see Table 6-1 on page 236) that will hold the value. The quantity of leading zeros does not influence the type selection.

Table 6-1. ISO C99 integer constant type selection

Suffix                         Decimal constant       Octal or hexadecimal constant

unsuffixed                     int                    int
                               long                   unsigned int
                               long long              long
                                                      unsigned long
                                                      long long
                                                      unsigned long long

u or U                         unsigned int           unsigned int
                               unsigned long          unsigned long
                               unsigned long long     unsigned long long

l or L                         long                   long
                               long long              unsigned long
                                                      long long
                                                      unsigned long long

Both u or U and l or L         unsigned long          unsigned long
                               unsigned long long     unsigned long long

ll or LL                       long long              long long
                                                      unsigned long long

Both u or U and ll or LL       unsigned long long     unsigned long long

For example, a hexadecimal constant with the l or L suffix (such as 0x80000000L) that could only be represented by an unsigned long in 32-bit mode will now fit within a long in 64-bit mode. The change in type of the constant in an expression may cause unexpected results.

Data alignment

Modern processor designs usually require data in memory to be aligned to their natural boundaries in order to gain the best possible performance. The compiler in most cases guarantees data objects to be properly aligned by inserting padding bytes immediately before the misaligned data. Although the padding bytes do not affect the integrity of the data, they can cause the layout and hence the size of structures and unions to be different than expected.

Because the size of pointers and long is doubled in 64-bit mode, structures and unions containing them as members will become bigger than in 32-bit mode. For example, consider the structure in Example 6-1 on page 237.

Example 6-1. Structure alignment
#include <stdio.h>
#include <stddef.h>
int main()
{
   struct T {
       char c;
       int *p;
       short s;
   } t;
   /* sizeof and offsetof yield size_t; cast to int to match %d */
   printf("sizeof(t) = %d\n", (int)sizeof(t));
   printf("offsetof(t, c) = %d sizeof(c) = %d\n",
           (int)offsetof(struct T, c), (int)sizeof(t.c));
   printf("offsetof(t, p) = %d sizeof(p) = %d\n",
           (int)offsetof(struct T, p), (int)sizeof(t.p));
   printf("offsetof(t, s) = %d sizeof(s) = %d\n",
           (int)offsetof(struct T, s), (int)sizeof(t.s));
   return 0;
}

When Example 6-1 is compiled and executed in 32-bit mode, the following result indicates that padding has been inserted before the member p and after the member s:

sizeof(t) = 12
offsetof(t, c) = 0 sizeof(c) = 1
offsetof(t, p) = 4 sizeof(p) = 4
offsetof(t, s) = 8 sizeof(s) = 2

Three padding bytes are inserted before the member p to ensure p is aligned to its natural 4-byte boundary. The alignment of the structure itself is the alignment of its strictest member. In this example, it is 4 bytes because of the same member p. Therefore, two padding bytes are inserted at the end of the structure to make the total size of the structure a multiple of 4 bytes (see Figure 6-1). This is required so that if you declare an array of this structure, each element of the array will be aligned properly.

Figure 6-1. Structure padding in 32-bit mode


However, when Example 6-1 on page 237 is compiled and executed in 64-bit mode, the size of the structure doubles because more padding is required to make the member p fall on its natural 8-byte alignment boundary (see Figure 6-2):

Figure 6-2. Structure padding in 64-bit mode


sizeof(t) = 24
offsetof(t, c) = 0 sizeof(c) = 1
offsetof(t, p) = 8 sizeof(p) = 8
offsetof(t, s) = 16 sizeof(s) = 2

If this structure is shared or exchanged between 32-bit and 64-bit processes, the data fields (and padding) of one environment will not match the expectations of the other.

To eliminate the difference, and to allow the structure to be shared, you can reorder the fields in the data structure to get the alignments in both 32-bit and 64-bit environments to match. However, this may not be possible in all cases. It depends a great deal on the data types used in the structure, and the way in which the structure as a whole is used (for example, whether the structure is used as a member of another structure or array).

If you are unable to reorder the members of a structure, or if reordering alone cannot provide correct alignment, you can introduce user-defined paddings to cause the members of the structure to fall on their natural boundaries. Depending on the data types involved, a conditional compile section may be necessary. A conditional compile section will be required when the structure uses data types that have different sizes in the 32-bit and 64-bit environments.

For example, if the structure layout of Example 6-1 on page 237 is changed to the following:

struct T {
    char c;
    short s;
#if !defined(__64BIT__)
    char pad1[4];
#endif
    int *p;
#if !defined(__64BIT__)
    char pad2[4];
#endif
} t;

The structure will have the same size and member layout in both 32-bit and 64-bit environments:

sizeof(t) = 16
offsetof(t, c) = 0 sizeof(c) = 1
offsetof(t, s) = 2 sizeof(s) = 2
offsetof(t, p) = 8 sizeof(p) = 4

Note that this output is from a 32-bit execution. The size of the member p, sizeof(p), will be 8 in 64-bit mode. Figure 6-3 shows the member layout of the structure with user-defined padding.

Figure 6-3. Structure with user-defined paddings in both 32-bit and 64-bit mode


Important

This example is for illustrative purposes only. Sharing pointers between 32-bit and 64-bit processes is not recommended and will likely yield incorrect results.


When inserting paddings to structures, use an array of char to avoid any further alignment requirement on the paddings themselves. The natural alignment of a char is 1-byte, meaning it can reside anywhere in memory.
