Programs and memory

"If you’re willing to restrict the flexibility of your approach, you can almost always do something better."

John Carmack

As a motivation to understand memory and its management, it's important for us to have a general idea of how programs are run by the operating system and what mechanisms are in place that allow it to use memory for its requirements.

Every program needs memory to run, whether it's your favorite command-line tool or a complex stream processing service, and they have vastly different memory requirements. In major operating system implementations, a program in execution is implemented as a process. A process is a running instance of a program. When we execute ./my_program in a shell in Linux or double-click on my_program.exe on Windows, the OS loads my_program as a process in memory and starts executing it, along with other processes, giving it a share of CPU and memory. It assigns the process with its own virtual address space, which is distinct from the virtual address space of other processes and has its own view of memory.

During the lifetime of a process, it uses many system resources. First, it needs memory to store its own instructions, then it needs space for resources that are demanded at runtime during instruction execution, then it needs a way to keep track of function calls, any local variables, and the address to return to after the last invoked function. Some of these memory requirements can be decided ahead at compile time, like storing a primitive type in a variable, while others can only be satisfied at runtime, like creating a dynamic data type such as Vec<String>. Due to the various tiers of memory requirements, and also for security purposes, a process's view of memory is divided into regions known as the memory layout.

Here, we have an approximate representation of the memory layout of a process in general:

This layout is divided into various regions based on the kind of data they store and the functionality they provide. The major parts we are concerned with are as follows:

  • Text segment: This section contains the actual code to be executed in the compiled binary. The text segment is a read-only segment and any user code is forbidden to modify it. Doing so can result in a crash of the program.
  • Data segment: This is further divided into subsections, that is, the initialized data segment and uninitialized data segment, which is historically known as Block Started by Symbol (BSS), and holds all global and static values declared in the program. Uninitialized values are initialized to zero when they are loaded into memory.
  • Stack segment: This segment is used to hold any local variables and the return addresses of functions. All resources whose sizes are known in advance and any temporary/intermediary variables that a program creates are implicitly stored on the stack.
  • Heap segment: This segment is used to store any dynamically allocated data whose size is not known up front and can change at runtime depending on the needs of the program. This is the ideal allocation place when we want values to outlive their declaration within a function.
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset