3.6. Large page support

Historically, AIX supported only a 4 KB page size. Starting with AIX 5L Version 5.1 with the 5100-02 Recommended Maintenance Level, AIX supports an alternate page size (called large pages) in addition to the traditional 4 KB page size. The large page size on systems using the POWER4 processor is 16 MB, but it may be different on future architectures. To determine the supported large page size, call the sysconf() routine with the _SC_LARGE_PAGESIZE parameter in your applications instead of assuming 16 MB.
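The sysconf() query recommended above can be sketched as follows. This is a minimal illustration, not code from the white paper: the helper names large_page_size and print_page_sizes are hypothetical, and the sketch assumes that _SC_LARGE_PAGESIZE is visible to the preprocessor on AIX. On platforms whose headers do not define it, the helper reports -1.

```c
#include <stdio.h>
#include <unistd.h>

/* Returns the large page size in bytes, or -1 when it cannot be determined.
 * _SC_LARGE_PAGESIZE is the AIX name given in the text. Assumption: it is
 * visible to the preprocessor; where it is not, this sketch reports -1. */
long large_page_size(void)
{
#ifdef _SC_LARGE_PAGESIZE
    return sysconf(_SC_LARGE_PAGESIZE);
#else
    return -1L;
#endif
}

/* Prints the base page size and, when available, the large page size.
 * Returns the number of page sizes reported (1 or 2). */
int print_page_sizes(void)
{
    long base = sysconf(_SC_PAGESIZE);
    long large = large_page_size();
    int reported = 1;

    printf("base page size:  %ld bytes\n", base);
    if (large > 0) {
        printf("large page size: %ld bytes\n", large);
        reported = 2;
    } else {
        printf("large page size: not available\n");
    }
    return reported;
}
```

Calling print_page_sizes() at startup lets an application size its buffers from the reported value instead of a hard-coded constant.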

To verify that you are using a POWER4 processor system, run the lscfg -vpl procX command, where X is the processor instance number. The following example shows the output of this command executed on an AIX 5L Version 5.2 partition on the pSeries 690:

# lsdev -Cc processor
proc0      Available 00-00        Processor
proc1      Available 00-01        Processor
# lscfg -vpl proc0
  proc0            U1.18-P1-C1 Processor

        Device Specific.(YL)........U1.18-P1-C1

PLATFORM SPECIFIC

  Name:  PowerPC,POWER4
    Node:  PowerPC,POWER4@0
    Device Type:  cpu
    Physical Location: U1.18-P1-C1

The following sections are excerpted from the technical white paper, AIX Support for Large Pages, found at:

http://www.ibm.com/servers/aix/whitepapers/large_page.html

3.6.1. Large page support overview

Large page usage is primarily intended to provide performance improvements to high performance computing (HPC) applications. Memory access intensive applications that use large amounts of virtual memory may obtain performance improvements by using large pages. The large page performance improvements are attributable to reduced translation look-aside buffer (TLB) misses due to the TLB being able to map a larger virtual memory range. Large pages also improve memory prefetching by eliminating the need to restart prefetch operations on 4 KB boundaries.

The POWER4 large page architecture requires that all virtual pages in a 256 MB segment be the same size. AIX uses this architecture to support a mixed mode process model: some segments in a process are backed with 4 KB pages, and others are backed with 16 MB pages. Applications may request that their heap segments be backed with large pages. Applications may also request that shared memory segments be backed with large pages. Other segments in a process are backed with 4 KB pages.

AIX supports large page usage with both 32- and 64-bit applications. Both the 32- and 64-bit versions of the AIX kernel support large pages.

AIX maintains separate 4 KB and 16 MB size physical memory pools. The customer specifies the amount of physical memory in the 16 MB memory pool using the vmo command on AIX 5L Version 5.2.[20] This amount of physical memory is allocated to the 16 MB memory pool at boot time. The remaining physical memory is used to back 4 KB virtual pages. The size of the 16 MB pool is fixed at boot time and cannot be changed without rebooting the system.

[20] On AIX 5L Version 5.1, use the vmtune command instead of vmo.

On AIX 5L Versions 5.1 and 5.2, large pages are not paged out and are treated as pinned memory. Therefore, an application’s data backed by large pages remains in physical memory until the application completes.[21] A security access control restricts large page use to authorized applications. This prevents unauthorized applications from consuming large page physical memory and thereby keeping authorized users from using large pages for their applications.

[21] The implementation of large pages may change in future versions of AIX. Do not depend on large pages being pinned when developing your applications.

3.6.2. Large page application usage

Applications may use large pages in two ways. An application may request that large pages back its data and heap segments. An application may also request shared memory segments be backed by large pages.

Large page data/heap segments

An application may request that its initialized program data, uninitialized program data (BSS), and heap segments be backed with large pages. There are two ways to request large pages back an application’s data/heap segments:

  • The executable file can be marked to request large pages.

  • An environment variable can be set to request large pages.

A program’s large page data/heap use is established when the program is exec()ed. A program cannot switch modes after it has begun executing. Large page use is inherited by child processes on fork().

Marking an executable for large page use

The XCOFF header in an executable file contains a new flag to indicate that the program wants to use large pages to back its data and heap segments. This flag can be set when the application is linked by specifying the -blpdata option on the ld command. The flag can also be set or cleared using the ldedit command. The ldedit -blpdata filename command sets the large page data/heap flag in the specified file. The ldedit -bnolpdata filename command clears the large page flag. The ldedit command may also be used to set an executable’s maxdata value. To check if the flags are correctly set, see 3.2.5, “Checking large memory model executables” on page 124.

Environment variables for large page use

An environment variable allows users to request large pages for an application’s data and heap segments without modifying the executable. The environment variable takes precedence over the executable large page flag. Large page usage is requested through options on the LDR_CNTRL environment variable.

LDR_CNTRL=LARGE_PAGE_DATA=Y

Specifies that the exec()ed program should use large pages for its data and heap segments. This is the same as marking the executable to use large pages.

LDR_CNTRL=LARGE_PAGE_DATA=N

Specifies that the exec()ed program should not use large pages for its data and heap segments. This overrides the setting in an executable marked to use large pages.

LDR_CNTRL=LARGE_PAGE_DATA=M

Specifies that the exec()ed program should use large pages in a mandatory mode for its data and heap segments.

You can separate multiple options on the LDR_CNTRL environment variable by using an ’@’ character. For example, the following LDR_CNTRL environment variable setting requests large page usage along with the maxdata option:

LDR_CNTRL=MAXDATA=0x80000000@LARGE_PAGE_DATA=Y

Users are advised to be cautious in their use of the environment variable to specify large page usage. Performance tests have shown there can be a significant performance loss in environments where a number of shell scripts or small, short running applications are invoked. One example saw a shell script’s execution time increase over 10 times when the large page environment variable was specified. Customers are advised to only set the large page environment variable around specific applications that can benefit from large page usage.
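The option syntax shown above can be assembled programmatically. The following is a minimal sketch, assuming a hypothetical helper named format_ldr_cntrl (it is not part of AIX) that builds the value from a maxdata setting and one of the Y, N, or M modes described earlier.

```c
#include <stdio.h>
#include <string.h>

/* Writes an LDR_CNTRL value such as "MAXDATA=0x80000000@LARGE_PAGE_DATA=Y"
 * into buf. mode must be 'Y', 'N', or 'M'; a maxdata of 0 omits the MAXDATA
 * option. Returns the string length, or -1 on a bad mode or a short buffer.
 * format_ldr_cntrl is a hypothetical helper, not part of AIX. */
int format_ldr_cntrl(char *buf, size_t len, unsigned long maxdata, char mode)
{
    int n;

    /* strchr() would also match the terminating NUL, so reject it first. */
    if (mode == '\0' || strchr("YNM", mode) == NULL)
        return -1;

    if (maxdata != 0)
        n = snprintf(buf, len, "MAXDATA=0x%lx@LARGE_PAGE_DATA=%c", maxdata, mode);
    else
        n = snprintf(buf, len, "LARGE_PAGE_DATA=%c", mode);

    /* snprintf() reports the length it wanted; treat truncation as an error. */
    if (n < 0 || (size_t)n >= len)
        return -1;
    return n;
}
```

One way to follow the caution above is to place the resulting string in LDR_CNTRL only in a child process, between fork() and exec(), so that the setting reaches the one program that benefits and not every shell script started from the same environment.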

Advisory and mandatory modes

An application can indicate that it wants to use large pages for data/heap segments in either advisory or mandatory mode. In advisory mode, the application will use large pages if possible. The conditions needed to use large pages are:

  • The user ID is authorized to use large pages.

  • The system is running on a machine that has the POWER4 large page architecture feature.

  • The customer defined a large page memory pool.

  • There are enough pages in the large page memory pool to back the entire segment with large pages.

If all of these conditions are met, the application’s data/heap segments will be backed with large pages. Otherwise, the application’s data/heap segments will be backed with 4 KB pages.

In advisory mode, an application may have some of its heap segments backed by large pages and some of them backed by 4 KB pages. 4 KB pages are used to back segments when there are not enough large pages available to back the segment. Executable programs marked to use large pages use large pages in advisory mode.

In mandatory mode, the brk() or sbrk() system calls, which are internally called from the malloc() subroutine, will fail if the application requests a heap segment and there are not enough large pages to satisfy the request. Customers that use the mandatory mode must monitor the size of the large page pool and ensure it does not run out of large pages. Otherwise, their mandatory large page mode applications may fail.
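The difference between the two modes can be summarized as a small decision model. This is only an illustration of the behavior described above, not AIX code: the type and function names are hypothetical, and the sketch assumes that the authorization, hardware, and pool-defined conditions already hold, so that only the number of free large pages decides the outcome.

```c
/* Outcome of requesting a new heap segment, as described in the text. */
enum backing { BACKED_4K, BACKED_LARGE, REQUEST_FAILS };
enum lp_mode { LP_ADVISORY, LP_MANDATORY };

/* 256 MB segment / 16 MB large page = 16 large pages per segment (POWER4). */
#define LARGE_PAGES_PER_SEGMENT 16

/* Models the decision for one new heap segment. Assumes the first three
 * conditions in the list (authorization, hardware, pool defined) already
 * hold, so only the number of free large pages matters. */
enum backing back_new_heap_segment(enum lp_mode mode, unsigned free_large_pages)
{
    if (free_large_pages >= LARGE_PAGES_PER_SEGMENT)
        return BACKED_LARGE;
    /* Not enough large pages: advisory falls back, mandatory fails. */
    return (mode == LP_MANDATORY) ? REQUEST_FAILS : BACKED_4K;
}
```

With 15 free large pages, for example, the model returns BACKED_4K in advisory mode and REQUEST_FAILS in mandatory mode, which corresponds to the brk() or sbrk() failure described above.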

Large page data/heap segments fully backed

The POWER4 architecture requires all pages in a segment (256 MB) be backed with the same size physical pages. AIX backs the entire 256 MB segment with large pages when an application requests a large page heap segment. Even if only a few bytes are needed in the new heap segment, the entire 256 MB segment is backed. AIX does this to avoid terminating applications when they want to grow a heap segment (such as when using malloc() or sbrk()) and there are no large pages available to back the new space. This supports the advisory mode of large page usage. It also eliminates the need for installations to closely monitor the size of their large page physical memory pools.
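Because each heap segment is fully backed, the pool cost of a heap grows in steps of 16 large pages (256 MB). The arithmetic can be sketched as follows. The function names are hypothetical, and the sketch assumes that the heap begins on a segment boundary.

```c
#include <stdint.h>

#define SEGMENT_BYTES    (256ULL * 1024 * 1024)  /* POWER4 segment size */
#define LARGE_PAGE_BYTES (16ULL * 1024 * 1024)   /* POWER4 large page size */

/* Number of 256 MB segments touched by a heap of heap_bytes, assuming the
 * heap starts on a segment boundary (an assumption of this sketch). */
uint64_t heap_segments(uint64_t heap_bytes)
{
    return (heap_bytes + SEGMENT_BYTES - 1) / SEGMENT_BYTES;
}

/* Large pages taken from the pool: every touched segment is fully backed. */
uint64_t large_pages_consumed(uint64_t heap_bytes)
{
    return heap_segments(heap_bytes) * (SEGMENT_BYTES / LARGE_PAGE_BYTES);
}
```

Under these assumptions, a 300 MB heap touches two segments and therefore consumes 32 large pages (512 MB) from the pool, and even a heap of a few bytes consumes 16.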

Using large pages to back shared memory segments

AIX uses the POWER4 large page architecture feature to provide large page backing for shared memory segments. Applications can request their shared memory segments be backed with large pages by specifying both the SHM_LGPAGE and SHM_PIN flags on the shmget() function.

The request to use large pages to back a shared segment is advisory. Large pages will back a shared memory segment under the same conditions as advisory mode large page data/heap usage. A shared segment is silently backed with 4 KB pages if large pages are not available.

The physical memory to back large page shared memory and large page data/heap segments comes from the large page physical memory pool. Customers must size their large page physical memory pool to contain enough large pages for both shared memory and data/heap large page usage.
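A request for a large page shared memory segment might look like the following sketch. SHM_LGPAGE and SHM_PIN are the AIX flags named above; the helper names are hypothetical. So that the sketch also compiles where the headers do not define the two flags, it defines them as 0 there, which is an assumption of this example and results in an ordinary segment.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* SHM_LGPAGE and SHM_PIN are the AIX flags named in the text. Where the
 * headers do not define them, they become 0 so that this sketch still
 * compiles and simply requests an ordinary segment. */
#ifndef SHM_LGPAGE
#define SHM_LGPAGE 0
#endif
#ifndef SHM_PIN
#define SHM_PIN 0
#endif

/* Builds the shmget() flag word for a new large page segment.
 * perms holds the usual permission bits, for example 0600. */
int large_page_shm_flags(int perms)
{
    return IPC_CREAT | (perms & 0777) | SHM_LGPAGE | SHM_PIN;
}

/* Creates a private shared memory segment and returns its ID, or -1 with
 * errno set. The request is advisory: a valid ID does not prove that large
 * pages back the segment. */
int create_large_page_segment(size_t bytes)
{
    return shmget(IPC_PRIVATE, bytes, large_page_shm_flags(0600));
}
```

Because the fallback to 4 KB pages is silent, the return value of shmget() alone cannot confirm that large pages are in use.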

3.6.3. Large page usage security capability

AIX provides a security mechanism to control the use of large page physical memory by non-root users. The large page physical memory pool is a fixed size, pinned memory system resource. The security mechanism prevents unauthorized users from consuming the large page pool and thereby denying it to the intended users or applications.

Non-root users must have the CAP_BYPASS_RAC_VMM capability in order to use large pages. A system administrator can grant this capability to a user by using the chuser command. The following command grants the ability to use large pages to user lpuserid:

chuser capabilities=CAP_BYPASS_RAC_VMM,CAP_PROPAGATE lpuserid

Both large page data/heap and large page shared memory segments are controlled by this capability.

3.6.4. Configuring system to use large pages

The customer must configure the system to use large pages. The customer must specify the amount of physical memory to be used to back large pages. The default is to not have any memory allocated to the large page physical memory pool.

AIX 5L Version 5.2

The vmo command is used to configure the size of the large page physical memory pool on AIX 5L Version 5.2.[22] The following command will allocate 256 pages x 16 MB = 4 GB to the large page physical memory pool:

[22] The vmo command is included in the bos.perf.tune fileset.

vmo -r -o lgpg_regions=256 -o lgpg_size=16777216 -o v_pinshm=1

Where:

-r                       Updates the /etc/tunables/nextboot file so that the modified tunable values will take effect after the next system reboot.
-o lgpg_regions=256      Specifies the number of memory blocks reserved for large pages.
-o lgpg_size=16777216    Specifies the large page size in bytes. The allowable value is 16777216 (16 MB) on POWER4-based systems.
-o v_pinshm=1            Allows pinning of shared memory segments.

You must run the bosboot command and reboot before the new size large page memory pool takes effect.
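The pool size is the product of the number of regions and the large page size, with 16777216 bytes being exactly 16 MB. The following sketch checks that arithmetic and computes the region count for a desired pool size. The names are hypothetical, chosen to mirror the lgpg_regions and lgpg_size tunables.

```c
#include <stdint.h>

/* 16 MB in bytes, the only large page size allowed on POWER4 per the text. */
#define LGPG_SIZE_POWER4 16777216ULL

/* Bytes reserved for the large page pool: lgpg_regions * lgpg_size. */
uint64_t large_page_pool_bytes(uint64_t lgpg_regions, uint64_t lgpg_size)
{
    return lgpg_regions * lgpg_size;
}

/* Smallest lgpg_regions value whose pool holds at least wanted_bytes. */
uint64_t lgpg_regions_for(uint64_t wanted_bytes, uint64_t lgpg_size)
{
    return (wanted_bytes + lgpg_size - 1) / lgpg_size;
}
```

For the example above, large_page_pool_bytes(256, LGPG_SIZE_POWER4) yields 4294967296 bytes, which is 4 GB.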

AIX 5L Version 5.1

The vmtune command is used to configure the size of the large page physical memory pool on AIX 5L Version 5.1.[23] The following command will allocate 256 pages x 16 MB = 4 GB to the large page physical memory pool:

[23] The vmtune command is located in the /usr/samples/kernel directory.

vmtune -g 16777216 -L 256

The -g option specifies the large page size in bytes. The allowable value is 16777216 (16 MB) on POWER4-based systems. The -L option is the number of the -g sized blocks that are allocated to the large page physical memory pool.

You must run the bosboot command and reboot before the new size large page memory pool takes effect.

If you want to use large pages for shared memory in your applications, the application source code must be modified to pass the SHM_PIN flag to the shmget() system call. The following vmtune command makes the necessary changes in the kernel to support the SHM_PIN flag:

vmtune -S 1

Note

The vmtune command must be called after every system boot. To place a permanent change into the system, insert the following lines into /etc/inittab:

vmtune -g 16777216 -L 256
vmtune -S 1


Considerations when determining large page pool size

Here are some things to consider when determining the size of the large page physical memory pool:

  • Memory allocated to the large page physical memory pool is not available to back 4 KB pages. Allocating too much physical memory to large pages degrades system performance and, in the extreme case, leaves too little memory to back 4 KB pages. During system boot, AIX reserves enough physical memory for 4 KB pages to ensure that the system will boot. However, system failures may occur after booting if there is not enough physical memory to back 4 KB pages.

  • The size of the large page physical memory pool is fixed at boot time and remains the same for the entire boot. A reboot is required to change the size of the large page memory pool.

  • Large pages are only used for applications that explicitly request them. There is no need for a large page memory pool if your applications do not request them.

  • Advisory mode large page applications will use large pages if there are large pages available. If not, advisory mode large page applications will use 4 KB pages. However, the inverse is not true. A 4 KB application will not use large pages if the system runs low on 4 KB pages.

  • Mandatory mode large page applications will fail if the application requests a large page and one is not available.

3.6.5. Other system changes for large pages

The mprotect() function cannot be used on a large page. If it is called to modify the protection attributes of a large page, it returns -1 and sets errno to EINVAL.

Some debug malloc tools use mprotect() to diagnose memory management problems. These tools will not work properly with large pages. Such applications must use 4 KB pages.
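A tool that relies on mprotect() can at least detect this failure and report it clearly. The following sketch uses hypothetical names and classifies the result of the call. EINVAL also has other causes, such as an unaligned address, so the result is only a hint that large pages may be involved.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

enum protect_result { PROTECT_OK, PROTECT_EINVAL, PROTECT_OTHER_ERROR };

/* Classifies an mprotect() outcome from its return code and errno value. */
enum protect_result classify_mprotect(int rc, int err)
{
    if (rc == 0)
        return PROTECT_OK;
    return (err == EINVAL) ? PROTECT_EINVAL : PROTECT_OTHER_ERROR;
}

/* Tries to change the protection of [addr, addr + len). PROTECT_EINVAL is
 * what the text describes for large page memory, but EINVAL also has other
 * causes (such as an unaligned address), so treat it as a hint only. */
enum protect_result try_protect(void *addr, size_t len, int prot)
{
    int rc = mprotect(addr, len, prot);
    /* errno is only meaningful when rc is -1; classify_mprotect ignores it otherwise. */
    return classify_mprotect(rc, errno);
}
```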

Multi-threaded applications may use large pages for their data/heap segments. However, when large pages are used, the libpthreads library does not place a protected red zone page at the bottom of a pthread’s stack.

The sysconf(_SC_LARGE_PAGESIZE) function call will return the large page size on systems that have large pages.

The vmgetinfo() function returns the size of the large page pool and other large page related information.

3.6.6. Large page usage considerations

Large page support is a special purpose performance improvement feature. It is not recommended for general use. Large page usage provides performance value to a select set of applications. These are primarily long running, memory access intensive applications that use large amounts of virtual memory.

Not all applications benefit from using large pages. Some applications can be severely degraded by the use of large pages. Applications that do a large number of fork()s (such as shell scripts) are especially prone to performance degradation when large pages are used. Tests have shown a tenfold increase in shell script execution time when the LDR_CNTRL environment variable specifies large page usage. Consider marking specific executable files to use large pages rather than using the LDR_CNTRL environment variable. This limits large page usage to the specific applications that benefit from it.

Consider the overall performance effect that large pages may have on your system. While some specific applications may benefit from large page use, the overall performance of your system may be degraded by large page usage due to having reduced the amount of 4 KB page storage available in the system. Consider using large pages when your system has sufficient physical memory such that reducing the number of 4 KB pages does not significantly impact overall system performance.
