This section briefly explains the 64-bit user process model. It shares the concepts of segments and pages with the 32-bit user process model; the difference is the number of segments available in the address space. In the 64-bit user process model, an address space is composed of 2^32 segments, whereas in the 32-bit user process model it is composed of 2^4 = 16 segments.
Therefore, a 64-bit user process can address up to 1 EB (exabyte), as the following calculation shows:
2^32 segments x 256 MB/segment = 2^32 x 2^8 x 2^20 bytes = 2^(32+8+20) bytes = 2^60 bytes = 1 EB
To address this huge space, the pointer type is defined as 64 bits wide in the 64-bit user process model. The program shown in Example 3-11 reports a pointer size of 8 bytes in the 64-bit user process model and 4 bytes in the 32-bit user process model:
$ cc -q64 sizeofpointer.c
$ a.out
size of pointer data type is 8 byte.
$ cc sizeofpointer.c
$ a.out
size of pointer data type is 4 byte.
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    char *p;

    /* sizeof yields a size_t; cast it to match the %lu conversion. */
    printf("size of pointer data type is %lu byte.\n",
        (unsigned long)sizeof(p));
    exit(0);
}
Figure 3-9 on page 131 illustrates the segment usage in the 64-bit user process model. The address space is divided into the areas explained in the following sections:
Section 3.3.1, “The first 16 segments (0 - 4 GB)” on page 132 (see Figure 3-10 on page 132)
Section 3.3.2, “Application text, data, and heap (4 GB - 448 PB)” on page 133
Section 3.3.3, “Default shared memory segments (448 - 512 PB)” on page 135
Section 3.3.4, “Privately loaded objects (512 - 576 PB)” on page 135
Section 3.3.5, “Shared text and data (576 - 640 PB)” on page 135
Section 3.3.6, “System reserved (640 - 960 PB)” on page 136
Section 3.3.7, “User process stack (960 PB - 1 EB)” on page 136
The first 16 segments (0 - 4 GB) are exempt from general use in order to keep compatibility with the 32-bit user process model. Access to segments 0x0, 0x1, 0xD, and 0xF is prohibited; these segments are reserved by the system. The 0x2 segment contains a few pages that are set up by the system loader at exec() time. Necessary data, such as command line argument values, environment variables, and errno, is stored in these pages (see Example 3-12 on page 134). Access to the rest of the 0x2 segment is prohibited.
Segments 0x3 - 0xC and 0xE are accessible only if you specify the attaching memory address with the shmat() routine. You can use these segments so that 32-bit and 64-bit processes can attach the same shared memory segments.
In general, hardcoding the attaching memory address passed to the shmat() routine is poor programming practice. Unless it is absolutely required, your 64-bit application should not specify the attaching memory address when calling shmat().
Figure 3-10 illustrates the usage of the first 16 segments (4 GB) in the 64-bit user process model.
When a 64-bit program is executed, the user text is mapped into the first segment of the application text, data, and heap area (4 GB - 448 PB), and the user data is mapped into another segment in the same area.
In both cases, if a segment is not sufficient to contain text or data, another segment will be contiguously attached to the process address space.
To demonstrate this behavior, we prepared the program listed in Example 3-13 on page 134, then compiled and ran it, as shown in Example 3-12 on page 134. In this example, the user text is loaded into the first segment of this area (starting at 0x0000000100000000) and the user data is loaded into the second segment (starting at 0x0000000110000000). The heap memory acquired by malloc() is allocated in the same segment as the user data.
$ cc -q64 underscore_symbols_64.c
$ file a.out
a.out: 64-bit XCOFF executable or object module not stripped
$ a.out
_text = 0x00000001000001f8
_etext = 0x00000001000005e8
_data = 0x00000001100005e8
_edata = 0x0000000110000838
argv[] = 0x00000000200fe8d0
environ[] = 0x00000000200fefd8
errnop = 0x00000000200fefe8
errno = 0x00000000200fefe0
&heap_mem = 0x0fffffffffffff70
heap_mem = 0x0000000110000850
Example 3-13 is a program simplified from the one listed in Example 3-2 on page 113 to demonstrate the segment mapping in the 64-bit user process model.
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>

extern int errnop;
/* Symbols that start with an underscore are prepared by the system loader. */
extern int _text;
extern int _etext;
extern int _data;
extern int _edata;
extern char *environ[];

int main(int argc, char *argv[])
{
    char buf[BUFSIZ];
    char *heap_mem;    /* heap memory. */

    if ((heap_mem = malloc(BUFSIZ)) == (void *)NULL) {
        sprintf(buf, "malloc() failed with errno = %d", errno);
        perror(buf);
    }

    printf("_text = 0x%016p\n", &_text);
    printf("_etext = 0x%016p\n", &_etext);
    printf("_data = 0x%016p\n", &_data);
    printf("_edata = 0x%016p\n", &_edata);
    printf("argv[] = 0x%016p\n", argv);
    printf("environ[] = 0x%016p\n", environ);
    printf("errnop = 0x%016p\n", &errnop);
    printf("errno = 0x%016p\n", &errno);
    printf("&heap_mem = 0x%016p\n", &heap_mem);
    printf("heap_mem = 0x%016p\n", heap_mem);

    exit(0);
}
[12] Extern symbols that start with underscore are run-time symbols prepared by the system loader.
As demonstrated in a later section, the 64-bit user process model gives you a flat memory model that is easy to use, as long as the system has enough physical pages to allocate (see 3.3.8, “Resource limits in 64-bit mode” on page 136).
If a 64-bit process calls the shmat() or mmap() routine without specifying the attaching memory address, segments in the default shared memory area (448 - 512 PB) are attached to the address space contiguously.
See 3.4.4, “Shared memory limits” on page 147 for the IPC limits in the 64-bit user process model.
Objects are loaded into segments of the privately loaded objects area (512 - 576 PB) in the following cases:
Objects are explicitly loaded by load() or dlopen().
The file permission modes of shared objects are r-xr-x--- (no read and execute permission bits are set for others).
All the global shared text and data segments are full (very unlikely in the 64-bit user process model).
For further information about the private shared objects, see 2.9.2, “Private shared objects” on page 101.
On the first load of a shared library object, the shared library text is loaded into a segment in the shared text and data area (576 - 640 PB). This segment is shared by all 64-bit user processes on the system. At the same time, the shared library data is created in another segment in this area on a per-process basis.
In both cases, if there is a segment that has enough free space to contain shared text or shared library data, that segment will be used. Otherwise, another segment will be attached to the process address space.
The virtual addresses of loaded shared text objects can be examined using the genkld command. For further information about the usage of genkld, see 2.7.2, “genkld” on page 88.
All the segments in the system reserved area (640 - 960 PB) are reserved by the system, and user processes are prohibited from accessing them.
By default, a 64-bit user process uses the last segment in the user process stack area (960 PB - 1 EB) for the user process stack. The stack grows from the last address, 0x0FFF_FFFF_FFFF_FFFF, toward the first address in this area. The address of the heap_mem pointer variable, which resides on the stack, illustrates this behavior (see Example 3-12 on page 134).
To use more than one segment for the user process stack, specify the -bmaxstacksize linker option. For example, to reserve two segments for the user process stack, compile your program as follows:
$ cc -q64 -bmaxstacksize:0x20000000 hello.c
Note
Thread stacks for Pthreads, except for the initial thread of a multi-threaded process, are allocated in the process heap. See 8.3.4, “Thread stack” on page 292 for further information about the thread stack in multi-threaded processes.
Unlike the 32-bit user process model, the data resource limit value always defines the actual upper limit of allocatable heap memory in the 64-bit user process memory model.
With the default soft data limit of 128 MB (see Example 3-6 on page 125), your application can allocate at most 128 MB from the process heap. To demonstrate this, we simplified grabheap_32.c (Example 3-3 on page 121) into the program shown in Example 3-14; the simplification is possible because the 64-bit user process memory model is flat. If we compile and run this program under the default soft data limit, it fails to acquire a 256 MB heap:
$ cc grabheap_64.c
$ file a.out
a.out: 64-bit XCOFF executable or object module not stripped
$ a.out 1
malloc() with 1 x 256 MB failed with errno = 12: Not enough space
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <errno.h>

#define ONE_SEG (256 * 1024 * 1024)    /* 256 MB, the size of one segment */

int main(int argc, char *argv[])
{
    char *p;
    char buf[BUFSIZ];
    size_t sz;

    if (argc != 2) {
        fprintf(stderr, "Usage: %s digit-number\n", argv[0]);
        exit(1);
    }
    sz = atoi(argv[1]);

    /* sz is a size_t, so cast it to match the %lu conversion. */
    if ((p = malloc(sz * ONE_SEG)) == (void *)NULL) {
        sprintf(buf, "malloc() with %lu x 256 MB failed with errno = %d",
            (unsigned long)sz, errno);
        perror(buf);
    } else {
        printf("starting address of %lu x 256 MB memory heap is 0x%016p\n",
            (unsigned long)sz, p);
        printf("ending address of %lu x 256 MB memory heap is 0x%016p\n",
            (unsigned long)sz, p + (sz * ONE_SEG) - 1);
    }
    exit(0);
}
Once the soft data limit is relaxed, the program can acquire a 256 MB heap:
$ ulimit -Sd unlimited
$ ulimit -Sa
time(seconds) unlimited
file(blocks) 2097151
data(kbytes) unlimited
stack(kbytes) 32768
memory(kbytes) 32768
coredump(blocks) 2097151
nofiles(descriptors) 2000
$ file a.out
a.out: 64-bit XCOFF executable or object module not stripped
$ a.out 1
starting address of 1 x 256 MB memory heap is 0x0000000110000850
ending address of 1 x 256 MB memory heap is 0x000000012000084f
If the soft data limit is set to unlimited, it is set to 9,223,372,036,854,775,807 bytes = 0x7FFF_FFFF_FFFF_FFFF bytes in the 64-bit user process model.[13] Example 3-15 shows the soft resource limit values in the 64-bit user process model (the source code is the same as the one listed in Example 3-10 on page 128).
[13] The value 0x7FFF_FFFF_FFFF_FFFF is defined as RLIM_INFINITY for the 64-bit user process model in /usr/include/sys/resource.h (see Example 3-8 on page 127).
$ cc -q64 printlimits.c
$ file a.out
a.out: 64-bit XCOFF executable or object module not stripped
$ a.out
Resource name  Soft                 Hard
RLIMIT_CORE    1073741312           9223372036854775807
RLIMIT_DATA    134217728            9223372036854775807
RLIMIT_DATA    134217728            9223372036854775807
RLIMIT_FSIZE   1073741312           1073741312
RLIMIT_NOFILE  2000                 9223372036854775807
RLIMIT_STACK   33554432             9223372036854775807
RLIMIT_RSS     33554432             9223372036854775807
$ ulimit -Sd unlimited
$ a.out | egrep '^(Resource|RLIMIT_DATA)'
Resource name  Soft                 Hard
RLIMIT_DATA    9223372036854775807  9223372036854775807
[14] The actual numerical limits might change in future releases of AIX. These values apply to AIX Version 4.3 through 5.2.
Therefore, a 64-bit user process can acquire as much heap memory as it requests once the soft data limit is set to unlimited. In the following example, we specified 409600 as the command line parameter and the program successfully acquired 409,600 × 256 MB = 100 TB:
$ a.out 409600
starting address of 409600 x 256 MB memory heap is 0x0000000110000850
ending address of 409600 x 256 MB memory heap is 0x000064011000084f
Although the malloc() routine does not actually allocate virtual pages (virtual pages are allocated when the program touches them the first time), requesting this huge amount of memory puts unnecessary stress on the system.
The 64-bit user process model gives you a very flat memory model that is easy to use, but it is your responsibility to request the proper size of heap memory in your application. You may consider selecting one of the following methods to place a safety mechanism on your 64-bit programs:
Specify the -bmaxdata:0xNNNNNNNNNNNNNNNN linker option when compiling your source code.
After compiling the source code, edit the XCOFF header of the generated executable file using the /usr/ccs/bin/ldedit command. For example:
$ /usr/ccs/bin/ldedit -bmaxdata:0xNNNNNNNNNNNNNNNN a.out
Specify the LDR_CNTRL environment variable when running the executable. For example:
$ LDR_CNTRL=MAXDATA=0xNNNNNNNNNNNNNNNN a.out
Call the setrlimit() subroutine in your application to explicitly set the soft data limit.
The appropriate value of 0xNNNNNNNNNNNNNNNN varies, depending on your application’s needs and the physical memory size; however, the following numbers can be good starting points:
Value                Size
0x0000000040000000   1 GB
0x0000001000000000   64 GB
0x0000002000000000   128 GB
In the 64-bit user process model, the data type off_t is always defined as long long. Therefore, 64-bit processes can handle large files as long as the following conditions are met:
The hard and soft file size limits have been relaxed. The default file size limit is 2,097,151 disk blocks, or about 1 GB (see Example 3-6 on page 125).
JFS2 or large file enabled JFS is used.