C++ Memory Management – Understanding Memory Segmentation: Stack, Heap, etc.

So memory segmentation can be done with or without paging. I always hear people talking about the stack and the heap when discussing anything memory-related in C++. What I do not get is this: if the program's memory (its virtual address space) is divided into pages, then it isn't divided into stack, heap, etc., right? Instead it is divided into pages without names. In that case (assuming my understanding of segmentation is correct), can you still talk about the stack and the heap?

Also, if the virtual address space is divided into pages, where do the stack, heap, data, etc. segments come from?

Best Answer

The stack and heap in C/C++ describe different mechanisms of memory allocation. They can also be called “automatic storage” and “free store”. If you allocate data on the free store/heap, you are responsible for managing its lifetime (calling delete for memory obtained with new, or free() for memory obtained with malloc()). This is mostly unrelated to memory pages.
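
To make the distinction concrete, here is a minimal sketch of the two allocation mechanisms. The function names (sum_on_stack, make_on_heap, make_owned) are made up for illustration, and std::make_unique assumes C++14 or later.

```cpp
#include <memory>

// Automatic storage ("stack"): `local` is released when the function returns.
int sum_on_stack() {
    int local[3] = {1, 2, 3};
    return local[0] + local[1] + local[2];
}

// Free store ("heap"): the object outlives the function that created it.
// The caller owns the result and must release it with delete.
int* make_on_heap(int value) {
    return new int(value);
}

// Same kind of allocation, but std::unique_ptr calls delete automatically
// when the returned pointer goes out of scope.
std::unique_ptr<int> make_owned(int value) {
    return std::make_unique<int>(value);
}
```

A caller would write something like `int* p = make_on_heap(42); /* use *p */ delete p;`, whereas the value returned by make_owned needs no explicit cleanup.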

A memory page is a fixed-size block of contiguous virtual addresses. The operating system creates the virtual address space of a process by mapping pages into it.

A process uses different areas of the address space differently. One area will be the stack. Other areas will hold the contents of executable files and libraries. These files may have different segments, e.g. for executable code, for constants, and space for variables.

Here I've pulled the address space mappings of a Perl interpreter using pmap:

0000000000400000   1776K r-x-- perl
00000000007bb000      4K r---- perl
00000000007bc000     12K rw--- perl
00000000007bf000      4K rw---   [ anon ]
0000000001eff000   1192K rw---   [ anon ]
00007f00184b7000   4464K r---- locale-archive
00007f0018913000   1792K r-x-- libc-2.23.so
00007f0018ad3000   2048K ----- libc-2.23.so
00007f0018cd3000     16K r---- libc-2.23.so
00007f0018cd7000      8K rw--- libc-2.23.so
00007f0018cd9000     16K rw---   [ anon ]
00007f0018cdd000     36K r-x-- libcrypt-2.23.so
00007f0018ce6000   2044K ----- libcrypt-2.23.so
00007f0018ee5000      4K r---- libcrypt-2.23.so
00007f0018ee6000      4K rw--- libcrypt-2.23.so
00007f0018ee7000    184K rw---   [ anon ]
00007f0018f15000   1056K r-x-- libm-2.23.so
00007f001901d000   2044K ----- libm-2.23.so
00007f001921c000      4K r---- libm-2.23.so
00007f001921d000      4K rw--- libm-2.23.so
00007f001921e000     12K r-x-- libdl-2.23.so
00007f0019221000   2044K ----- libdl-2.23.so
00007f0019420000      4K r---- libdl-2.23.so
00007f0019421000      4K rw--- libdl-2.23.so
00007f0019422000     96K r-x-- libpthread-2.23.so
00007f001943a000   2044K ----- libpthread-2.23.so
00007f0019639000      4K r---- libpthread-2.23.so
00007f001963a000      4K rw--- libpthread-2.23.so
00007f001963b000     16K rw---   [ anon ]
00007f001963f000    152K r-x-- ld-2.23.so
00007f0019839000     20K rw---   [ anon ]
00007f0019864000      4K r---- ld-2.23.so
00007f0019865000      4K rw--- ld-2.23.so
00007f0019866000      4K rw---   [ anon ]
00007ffc0cd0a000    136K rw---   [ stack ]
00007ffc0cd4e000     12K r----   [ anon ]
00007ffc0cd51000      8K r-x--   [ anon ]
ffffffffff600000      4K r-x--   [ anon ]
 total            21284K

Note that the smallest size is 4K, which is the page size on my system.
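
If you want to check the page size on your own machine, a small sketch like the following should work on POSIX systems. The helper names page_size_bytes and pages_for_kib are hypothetical; sysconf(_SC_PAGESIZE) is the standard POSIX query.

```cpp
#include <unistd.h>   // sysconf, _SC_PAGESIZE (POSIX)

// Page size in bytes as reported by the operating system.
long page_size_bytes() {
    return sysconf(_SC_PAGESIZE);
}

// Number of pages needed to cover size_kib KiB, rounding up.
// page_bytes must be positive.
long pages_for_kib(long size_kib, long page_bytes) {
    return (size_kib * 1024 + page_bytes - 1) / page_bytes;
}
```

With a 4096-byte page, pages_for_kib(136, 4096) gives 34 pages for the 136K [stack] mapping in the listing, and pages_for_kib(1776, 4096) gives 444 pages for the first perl mapping.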

The rightmost column shows which files (executables or libraries) were mapped into the address space, and the leftmost column shows the address at which each mapping starts. There are also a couple of special regions, such as [stack]. Some of the anonymous regions can be used as a free store/heap. There may be gaps between the mapped ranges of the address space. Trying to access memory in an unmapped range will cause a segfault.
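
A process can inspect its own mappings in a similar way. The sketch below is Linux-only and assumes /proc/self/maps is available, with one mapping per line in the form "start-end perms offset dev inode [name]". The Mapping struct and the read_mappings and find_mapping helpers are names invented for this example.

```cpp
#include <cstdint>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

struct Mapping {
    std::uintptr_t start = 0;   // first address of the range
    std::uintptr_t end = 0;     // one past the last address
    std::string perms;          // e.g. "rw-p"
    std::string name;           // file path, "[stack]", "[heap]", or empty
};

// Parses /proc/self/maps (Linux only). Returns an empty vector on failure.
std::vector<Mapping> read_mappings() {
    std::vector<Mapping> out;
    std::ifstream maps("/proc/self/maps");
    std::string line;
    while (std::getline(maps, line)) {
        std::istringstream in(line);
        Mapping m;
        char dash = 0;
        std::string offset, dev, inode;
        if (!(in >> std::hex >> m.start >> dash >> m.end >> m.perms
                 >> offset >> dev >> inode) || dash != '-') {
            continue;           // skip lines that do not match the format
        }
        in >> m.name;           // optional; stays empty for anonymous regions
        out.push_back(m);
    }
    return out;
}

// Returns the mapping that contains address p, or nullptr if p is unmapped.
const Mapping* find_mapping(const std::vector<Mapping>& maps, const void* p) {
    const auto addr = reinterpret_cast<std::uintptr_t>(p);
    for (const auto& m : maps) {
        if (addr >= m.start && addr < m.end) return &m;
    }
    return nullptr;
}
```

Passing the address of a local variable or of a new-allocated object to find_mapping shows which region it lives in. In the main thread I would expect a local variable to land in [stack], while a heap allocation may land in [heap] or in an anonymous region, depending on the allocator.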

Each of those regions consists of one or more pages. This is important, because pages can have access protections: the executables and libraries provide executable pages for code, read-only pages for constants, and read-write pages for variables. This is a security mechanism to prevent arbitrary data from being executed (though this doesn't matter very much for an interpreter). The pages for the heap and stack will be readable and writable, but not executable.
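
Page protections can also be set by the program itself. As a rough sketch (assuming a POSIX system where MAP_ANONYMOUS is available, such as Linux), the hypothetical function below maps one read-write page and then drops write access with mprotect:

```cpp
#include <cstddef>
#include <cstring>
#include <sys/mman.h>   // mmap, mprotect, munmap
#include <unistd.h>     // sysconf

// Maps one anonymous read-write page, writes to it, then makes it read-only.
// Returns true if every system call succeeded and the data is still readable.
bool make_page_read_only() {
    const std::size_t len = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    void* p = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return false;

    std::memset(p, 0x2a, len);                    // allowed: page is rw-
    bool ok = mprotect(p, len, PROT_READ) == 0;   // page is now r--
    // Reading still works. A write at this point would raise SIGSEGV.
    ok = ok && static_cast<unsigned char*>(p)[0] == 0x2a;

    munmap(p, len);
    return ok;
}
```

The protection applies to whole pages, which is why the regions in the listing above always have sizes that are multiples of the page size.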

An address range may be mapped even though the page for a given address is not currently backed by physical memory. Using such an address will trigger a page fault. The operating system handles the page fault and supplies the page. For example, not all pages of the 136K for the stack may exist when the process starts. Instead, the pages are added on demand. Or, a page may have been swapped out to disk. In that case, the page fault causes the page to be loaded back into physical memory.
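
On-demand paging can be observed from inside a program. The following sketch assumes Linux, where mincore takes an unsigned char* vector (the signature differs on some other systems). The names Residency, resident_pages and touch_pages are made up for illustration.

```cpp
#include <cstddef>
#include <sys/mman.h>   // mmap, munmap, mincore (mincore is not POSIX)
#include <unistd.h>     // sysconf
#include <vector>

// Counts how many pages of [addr, addr + len) are resident in physical memory.
// Returns -1 if mincore fails.
long resident_pages(void* addr, std::size_t len, std::size_t page) {
    std::vector<unsigned char> vec((len + page - 1) / page);
    if (mincore(addr, len, vec.data()) != 0) return -1;
    long n = 0;
    for (unsigned char v : vec) n += v & 1;   // low bit set means resident
    return n;
}

struct Residency {
    long before;   // resident pages right after mapping
    long after;    // resident pages after touching some of them
};

// Maps `total` anonymous pages, writes to the first `touched` of them, and
// reports residency before and after. Returns {-1, -1} if mmap fails.
Residency touch_pages(std::size_t total, std::size_t touched) {
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    const std::size_t len = total * page;
    void* p = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return {-1, -1};

    Residency r;
    r.before = resident_pages(p, len, page);
    auto* bytes = static_cast<volatile unsigned char*>(p);
    for (std::size_t i = 0; i < touched && i < total; ++i) {
        bytes[i * page] = 1;   // first write to a page triggers a page fault
    }
    r.after = resident_pages(p, len, page);

    munmap(p, len);
    return r;
}
```

Calling touch_pages(16, 4) maps 16 pages but writes to only 4. On a typical Linux system I would expect `before` to be small (often 0) and `after` to be at least 4, although the kernel is free to make more pages resident than were touched.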