Grouping Base Types

This section looks at ways to achieve optimizations in grouped C/C++ types. The way in which grouped types are created has an impact on the footprint as well as the performance of the type.

Structs

The use of structures as data containers seems pretty straightforward, which it is really, when you take one or two things into account before using structures for large sets of data. Listing 6.9 demonstrates several ways of creating a grouped type.

Code Listing 6.9. Structure Sizes
// Structure Size example
#include    <iostream.h>

struct A  {  char a; long b; char c; long d; } ;

struct B  {  char a; char c; long b; long d; } ;

#pragma pack(push,1)
struct C  {  char a; long b; char c; long d; } ;
#pragma pack(pop)

void main(void)
{
    cout << "Size of A: " << sizeof(A) << " bytes." << endl;
    cout << "Size of B: " << sizeof(B) << " bytes." << endl;
    cout << "Size of C: " << sizeof(C) << " bytes." << endl;
}

Listing 6.9 defines three structures that contain identical information: two long words—called d and b—and two characters—called a and c. When you run this program on a Windows system using Developer Studio, you get the following output:

Size of A: 16 bytes.

Size of B: 12 bytes.

Size of C: 10 bytes.

The reason these structures are not the same size has to do with alignment. Normally, a compiler will force long words to be placed at the next long word boundary; this means a new long word can be specified only every four memory addresses. Thus, the memory presentation of the beginning of structure A is:

Address Content
00 char a
01 stuffing
02 stuffing
03 stuffing
04 long b, byte 0
05 long b, byte 1
06 long b, byte 2
07 long b, byte 3*
* Actual byte order within a long is system dependent.

Structure B is more compact because, although long b still starts at the same address relative to the beginning of the structure, some of the stuffing is replaced by char b:

Address Content
00 char a
01 char b
02 stuffing
03 stuffing
04 long b, byte 0
05 long b, byte 1
06 long b, byte 2
07 long b, byte 3*
* Actual byte order within a long is system dependent.

Looking at these examples, it should not surprise you to find out that the two structures shown in Listing 6.10 are exactly the same size.

Code Listing 6.10. Structure Stuffing
struct Full                    struct NotSoFull
{                               {
    char a, b, c, d;               char a;
    long e;                        long e;
} ;                             } ;

So it makes sense to think about variable order when designing structures. When you use only a few instances of a structure at a time, the footprint impact of the extra stuffing bytes is of course marginal, but it is a different story entirely when you use structures to store elements of an expansive database. It is generally good practice to always use well-thought-out structure design not only because—as explained in previous chapters—software (modules) are reused in ways not always anticipated, but also because it is a good idea to make these kinds of techniques second nature.

Now let's look at our wunderkind. How did structure C manage to squeeze itself into a 10-byte hole? You have seen the #pragma pack compiler commands so you have a pretty good idea how, but what exactly happens? The #pragma pack compiler command for Developer Studio tells the compiler to temporarily adjust alignment. In the example program the alignment is set to one byte, creating the following memory mapping:

Address Content
00 char a
01 long b, byte 0
02 long b, byte 1
04 long b, byte 2
05 long b, byte 3
06 char c
07 long d, byte 0
08 long d, byte 1
09 long d, byte 2
10 long d, byte 3*
* Actual byte order within a long is system dependent.

Using push and pop ensures that the alignment for the rest of the program is not affected. Basically you push the current alignment onto some compiler stack, force the new alignment in place, and later pop the old alignment back into action. For more information on alignment consult the manual of the compiler used (look up the /Zp compiler option for Microsoft Developer Studio, or consult the manpages for the GNU compiler—man gcc or man g++).

Of course alignment is an issue for all types. Table 6.2 is an alignment table for the Windows and UNIX/Linux OSes.

Table 6.2. Base Type Alignment Table
Type Windows/Dev Studio UNIX, Linux, GNU
char 1 1
word 2 2
short 2 2
long 4 4
int 4 2
pointer 4 4
float 4 4
double 4 4
long double int64/ 8 4
long long 4 8
struct alignment of largest member type.
union alignment of largest member type.
bit field 4 4

For specific alignment details of other systems or various compilers, consult their respective documentation, or perform an alignment test. This can be as easily done as creating a structure of required types, interspersed with character variables (to make sure the compiler will actually force alignment for every type). Alternatively, use the booktool function: alignment_info() to perform this job for you.

Note

Alignment exists for a reason: Some processors simply have difficulty using odd memory addresses. When you port code that uses adjusted alignment, the new footprint of the program can thus differ, and variable access can even become somewhat slower. The implementer should therefore be aware that alignment adjusting is a system-specific optimization that needs to be tweaked to the target system (OS, hardware, and compiler combination).


Bit Fields

This section discusses the use of structures as bit fields. Bit fields are discussed in the following contexts:

  • Footprint implications

  • Performance implications

  • Pitfalls

  • Final remarks

Footprint Implications

Bit fields allow you to use even more-extreme alignment of variables. Essentially, the bit field of C++ makes it possible to declare several variables within a base type, making full use of the range of the base type. Let's say you need to create data base elements that contain six variables, four of which have a range of 0–2031 and two of which have a range of 0–1011. Without using bit fields, you will need a structure as defined in Listing 6.11.

Code Listing 6.11. Structure Without Bit Fields
struct NoBitField
{
    short     rangeAOne, rangeATwo, rangeAThree, rangeAFour;
    short     rangeBOne, rangeBTwo;
} ;

For the range 0–2031 you need 11 bits and for the range 0–1011 you need 10 bits; the smallest base type that can contain this information is the "word" (2 bytes), which is called a "short" by most C++ compilers. The size of the NoBitField structure is thus 12 bytes. However, when using bit fields our structure could look like Listing 6.12.

Code Listing 6.12. Structure with Bit Fields
struct BitField
{
    unsigned rangeAOne     : 11;    // long 1;
    unsigned rangeATwo     : 11;
    unsigned rangeBOne     : 10;
    unsigned rangeAThree   : 11;    // long 2;
    unsigned rangeAFour    : 11;
    unsigned rangeBTwo     : 10;
} ;

The compiler will pack this whole structure into two long words. The size of the BitField structure is thus eight bytes. The compiler can do this because it was told to use only 11 bits for the rangeAxx variables and only 10 bits for the rangeBxx variables. And by ordering the variables intelligently within the structure (11+11+10=32) you ensured optimal storage. The reason this order is important has again to do with alignment. Bits of the base type are assigned to the fields in the order in which they are declared, however, when a field is larger than the number of bits left over in the base type, it is "pushed" to the next base type alignment. The base type for bit fields is a long word for most compilers so when you look at the placement of rangeAOne, rangeATwo, and rangeBOne in the first long word, you will see this:

0000 0000 | 0000 0000 | 0000 0000 | 0000 0000
|  rangeBOne  |    rangeATwo  |   rangeAOne |

With a less-optimal ordering, as in the following structure, you see

struct WastingBitField
{
    unsigned rangeAOne     : 11;    // long 1;
    unsigned rangeATwo     : 11;
    unsigned rangeAThree   : 11;    // long 2;
    unsigned rangeAFour    : 11;
    unsigned rangeBTwo     : 10;
    unsigned rangeBOne     : 10;
} ;

rangeAOne and rangeATwo take up 22 bits of the first long word. This means there are 32–22=10 bits left for the following field; however, because you declared another 11-bit field, that field is moved to the next long word alignment, leaving 10 bits of the first long word unused. As a consequence, the size of the WastingBitField structure is 12 bytes.

Performance Implications

In fact, forcing boundaries on variables makes perfect sense. By putting a little thought into designing your bit fields you can easily take advantage of the space gained, and because variables are set at their closest alignment borders, the compiler can generate relatively fast code. To set a value in a bit field at most two bit operations are needed, see Listing 6.13.

Code Listing 6.13. Operations Necessary to Set Values in a Bit Field
// C code:
BitField inst;
inst.rangeBOne = 2;

// pseudo code to implement the C code:
register = long0 AND 0x003fffff
register =  register OR 0x00800000

The AND masks off everything but the bits of the rangeBOne variable, and simultaneously resets the bits of rangeBOne which need to be reset for the value 2. The OR then sets the bits that need to be set for the value 2.

But you do not always use constant values, or values from which the compiler can derive constants at compile time. When a variable is used for setting, or doing arithmetic with, a bit field, considerably more overhead is incurred. Listing 6.14 demonstrates this.

Code Listing 6.14. Speed of Bit Fields
// C++:
struct Bitfield  {  unsigned num : 11; } ;
struct Structure {  short    num; } ;

void main(void)
{
    Bitfield  a; Structure b;

    for (int i = 0; i < 100; i++)
    {
        a.num = i; b.num = i;
    }

}

// Relevant Assembly:

; 11   :         a.num = i;

    mov    eax, DWORD PTR _a$[ebp]
    and    eax, -2048                ; fffff800H
    mov    ecx, DWORD PTR _i$[ebp]
    and    ecx, 2047                ; 000007ffH
    or    eax, ecx
    mov    DWORD PTR _a$[ebp], eax

; 12   :         b.num = i;

    mov    eax, DWORD PTR _i$[ebp]
    mov    WORD PTR _b$[ebp], ax

You do not have to know Assembly to see that about three times as many instructions are needed to process the bit field (label 11) compared to the short (label 12). The exact overhead depends on the compiler and target system used. If speed and footprint tradeoffs become important when using structures, it is a good idea to test the implication for the destined target system using test cases such as the one in Listing 6.14.

Pitfalls

When you look at Listing 6.11, you see that in the NoBitField structure more than one variable was declared on a single source line. This is also possible for bit field structures but take care not to make the following mistake shown in Listing 6.15.

Code Listing 6.15. Pitfall in Using Bit Fields
struct FaultyBitfield
{
    unsigned a, b : 4; // INCORRECT!
} ;

Instead of creating two bit fields of four bits, listing 6.15 creates a bit field of 32 bits a and a bit field of four bits b. The correct way to group names is shown in Listing 6.16.

Code Listing 6.16. Correct Use of Bit Fields
struct CorrectBitfield
{
    unsigned a : 4, b : 4;        // correct.
} ;

It is also possible to use bit fields and normal base types together in the same structure as shown in Listing 6.17, however, this is not recommended.

Code Listing 6.17. Bad Use of Bit Fields
struct CombinedFields
{
    char         a;            // Bad idea.
    unsigned     b : 4;
} ;

Sadly, whenever a base type follows a bit field or vice versa, the continued memory mapping is aligned on the next long word boundary. The CombinedFields structure thus uses four bytes for the character and another four bytes for the bit field. It would be better to write the character out as a bit field of eight bits, this way the entire structure will fit into a single long word (remember; bit field structures use the long word as a base unit.)

Final Remarks

It is possible to use signed bit fields as shown in Listing 6.18. Use either int or unsigned as the bit field type.

Code Listing 6.18. Signed Bit Fields
struct SignedBitFields
{
    int          a : 10;
    unsigned     b : 10;
} ;

Remember that using signed values (int) effectively decreases the positive range of the variable.

Finally, there are two things we simply cannot do with bit fields. It is impossible to use the address of a bit field, or to initialize a reference with a bit field.

Unions

Unions are used to combine the memory space of several variables when only one of the variables will be used at a time. Consequently, the size of a union is that of the largest element it contains. Listing 6.19 shows that when the contained elements are roughly the same size, unions are a wonderful tool for saving space while maintaining strong type checking.

Code Listing 6.19. Union Size Match
union PerfectMatch
{
    char    text[4];
    int    nr;
} ;

void main(void)
{

    PerfectMatch a;

    a.text[0] = '0';
    a.text[3] = '0';
    ~
    // Somewhere else in the code:
    a.nr = 9;
}

Listing 6.19 creates two different ways to approach the same memory, allowing the use of different types.

When the elements of a union have vastly different sizes, you need to reevaluate how you use unions. Consider Listing 6.20, which represents a data base object.

Code Listing 6.20. Union for Database Fields
struct Address
{
    char    *streetName;
    char    *zipCode;
    short    houseNumber;
} ;

union Location
{
    Address    adr;
    int        poBox;
} ;

struct Client            // sizeof(Client) = 20 bytes
{
    char    *name;
    short    cityCode;

    char        locationSelector;
    Location    loc;
} ;

These structures are used to store client information. They register the name of a client, a code for the city she lives in, and her address. However, some clients are known by their post-office box number (PoBox) and others by their full address. The union Location combines this information so the space needed for the PoBox is reused for the address. The locationSelector in the Client structure tells us which kind of address is being used by an instance of the Client structure.

This example works fine, but the Client structure is 20 bytes large and when clients are known by their PoBox number, only 12 of the 20 bytes are actually necessary. This might not be a problem for a small database—it might not even be a problem when most clients are known by their address—but when you have a large number of "PoBox" clients, you are simply wasting space. Listing 6.21 gives an example of splitting the Client structure into two (observing structure alignment as discussed in the previous sections).

Code Listing 6.21. Structures Replacing a Union
struct ClientAdr            // sizeof(ClientAdr) = 16 bytes
{
    char    *name;
    char    *streetName;
    char    *zipCode;
    short    houseNumber;
    short    cityCode;
} ;

#pragma pack(push,2)
struct ClientPoBox        // sizeof(ClientPoBox) = 10 bytes
{
    char    *name;
    short    cityCode;
    int    poBox;
} ;
#pragma

For every instance of ClientPoBox that is placed in the database, 10 bytes are effectively won; without the pragmas the structure is 12 bytes, winning eight bytes. For every instance of ClientAdr, four bytes are won.

Of course you create a minor problem when using two stucts instead of one, namely that of typing the database elements. Workarounds for this are as diverse as

  • Creating separate containers (lists, arrays, and so on) for address-client objects and PoBox-client objects

  • Adding a selector to the structure or object referring to a client object and then casting client pointers to either ClientAdr or ClientPoBox

  • Using polymorphism—that is, classes instead of structs. More about this can be found in Chapter 8.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset