Anonymous mmap zero-filled

freebsd memory mmap

In Linux, the mmap(2) man page explains that an anonymous mapping

. . . is not backed by any file; its contents are initialized to zero.

The FreeBSD mmap(2) man page does not make a similar guarantee about zero-filling, though it does promise that bytes after the end of a file in a non-anonymous mapping are zero-filled.

Which flavors of Unix promise to return zero-initialized memory from anonymous mmaps? Which ones return zero-initialized memory in practice, but make no such promise on their man pages?

It is my impression that zero-filling is partially for security reasons. I wonder if any mmap implementations skip the zero-filling for a page that was mmapped, munmapped, then mmapped again by a single process, or if any implementations fill a newly mapped page with pseudorandom bits, or some non-zero constant.

P.S. Apparently, even brk and sbrk used to guarantee zero-filled pages. My experiments on Linux seem to indicate that, even if full pages are zero-filled upon page fault after a sbrk call allocates them, partial pages are not:

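#include <stdint.h>   /* intptr_t */
#include <stdio.h>
#include <unistd.h>   /* sbrk */

int main(void) {
  const intptr_t many = 100;
  char *start = sbrk(0);               /* current break */
  sbrk(many);                          /* grow the heap by 100 bytes */
  for (intptr_t i = 0; i < many; ++i) {
    start[i] = 0xff;                   /* dirty every new byte */
  }
  printf("%d\n", (int)start[many/2]);
  sbrk(many / -2);                     /* shrink by half ... */
  sbrk(many / 2);                      /* ... and grow back again */
  printf("%d\n", (int)start[many/2]);  /* non-zero if not re-zeroed */
  sbrk(-1 * many);                     /* return to the original break */
  sbrk(many / 2);
  printf("%d\n", (int)start[0]);
  return 0;
}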

Best Answer

It's hard to say which ones promise what without exhaustively enumerating all the man pages or other release documentation, but the underlying code that handles MAP_ANON is (usually? always?) also used to map in bss space for executables, and bss space needs to be zero-filled. So it's pretty darn likely.

As for "giving you back your old values" (or some non-zero values, but most likely your old ones) if you unmap and re-map, it certainly seems possible, if some system were to be "lazy" about deallocation. I have only used a few systems that support mmap in the first place (BSD and Linux derivatives), and neither family is lazy that way, at least not in the kernel code handling mmap.

The reason sbrk might or might not zero-fill a "regrown" page is probably tied to history, or lack thereof. The current FreeBSD code matches what I recall from the old, pre-mmap days: there are two semi-secret variables, minbrk and curbrk, and both brk and sbrk will only invoke SYS_break (the real system call) if they are moving curbrk to a value that is at least minbrk. (Actually, this looks slightly broken: brk has the at-least behavior, but sbrk just adds its argument to curbrk and invokes SYS_break. It seems harmless, since the kernel checks in sys_obreak() in /sys/vm/vm_unix.c, so a too-negative sbrk() will fail with EINVAL.)

I'd have to look at the Linux C library (and then perhaps kernel code too) but it may simply ignore attempts to "lower the break", and merely record a "logical break" value in libc. If you have mmap() and no backwards compatibility requirements, you can implement brk() and sbrk() entirely in libc, using anonymous mappings, and it would be trivial to implement both of them as "grow-only", as it were.
