munmap madness

This post explores the possibilities arising from forcing free to unmap arbitrary regions of the address space and is part of the ptmalloc fanzine. While some interesting scenarios present themselves, I view this mostly as a curiosity, an educational foray into ptmalloc and the virtual memory manager subsystem of Linux. Kernel and glibc source links and statements like “the default stack size is 8MB” that are obviously platform-dependent all pertain to Ubuntu 16.04 on x86-64, unless stated otherwise. Ptmalloc and glibc malloc will be used interchangeably to refer to the malloc implementation in current glibc, while malloc by itself will refer to the malloc function.

Allocating mmapped chunks

Glibc malloc uses mmap directly in multiple cases:

- to serve requests larger than mmap_threshold (0x20000 bytes by default),
- to allocate the heaps backing non-main arenas,
- as a fallback when extending the main arena via brk fails.

It’s important to note how ptmalloc handles alignment requirements for mmapped chunks. While mmap is guaranteed to return page-aligned regions, the user can request a greater alignment from glibc with memalign and friends. In this case an allocation with worst-case padding is obtained via _int_malloc, so that a chunk of the requested size can be carved out at the required alignment boundary and returned to the user. This may mean wasted bytes at the beginning and end of the allocation, so the leading and trailing space is normally returned via free. However, if _int_malloc returns an mmapped chunk, then the offset of the aligned chunk into the mmapped region is simply stored in the prev_size field of the chunk header. This enables free to find the beginning of the mapped region when called on the chunk (see below), while retaining support for platforms that cannot partially unmap regions (just a guess) and avoiding costly munmap calls.

Freeing mmapped chunks

__libc_free hands chunks with the IS_MMAPPED bit set straight to munmap_chunk; _int_free isn’t called in this case. munmap_chunk (abbreviated code below) only contains two integrity checks, some bookkeeping and the actual call to munmap. Note that the return value of munmap isn’t validated. The integrity checks ensure that the parameters passed into munmap are page-aligned, but nothing more.

uintptr_t block = (uintptr_t) p - p->prev_size;
size_t total_size = p->prev_size + size;

if (__builtin_expect (((block | total_size) & (GLRO (dl_pagesize) - 1)) != 0, 0))
    malloc_printerr (check_action, "munmap_chunk(): invalid pointer", chunk2mem (p), NULL);

/* If munmap failed the process virtual memory address space is in a
   bad shape.  Just leave the block hanging around, the process will
   terminate shortly anyway since not much can be done.  */
__munmap ((char *) block, total_size);

This means that by corrupting the prev_size and size fields of a chunk and taking advantage of the way the beginning of the mmapped region is calculated, we can unmap an arbitrary memory range from the address space, assuming we know:

- the address of the chunk whose header we corrupt, and
- the address (and size) of the region we want to unmap.

These would most realistically come from two leaks: the absolute address of the chunk and the absolute address of the target. Since munmap supports partial unmappings, we can also hit a single page of a larger mapping if needed.

Everything below is based on this primitive, even though some examples, for brevity, use munmap directly instead of emulating the corruption and subsequent free.

Why would I want to do that?

Fair question. Well of course to map something else in place of the old data, effectively arranging for a use-after-free via the dangling references to the unmapped region.

The virtual memory manager subsystem of Linux

Since I’m out of my depth here, especially on the kernel side, this will only be a short, practical overview of the virtual address space of processes and some interesting special cases. Corrections or additions are highly welcome.

A great overview of what a program’s virtual address space looks like can be found here, with additional details on the kernel side here. Let’s take a look at the following shamelessly stolen image:

address space

The 64-bit address space layout is very similar (for our purposes) but with much more entropy. An important case not shown here is PIE binaries. If the binary image itself is position-independent, two things can happen:

- the binary is loaded into the mmap region itself, with the libraries mapped directly adjacent to it, or
- the binary is loaded into its own, separately randomized area, away from the mmap region.

See here for concrete examples. Kernels without this patch behave in the first way, newer kernels (e.g. the one in Ubuntu 16.04) in the second.

Some empirical observations about the mmap segment:

- mappings are allocated top-down by default, each new region placed directly below the previous one,
- once the bottom of the usable address space is reached, the search switches to bottom-up,
- already-occupied regions (e.g. the binary or the stack) are skipped over, producing non-contiguous return values.

Let’s see this in practice by continuously mmapping 32MB regions until we run out of address space, and observing when the ordering of the returned addresses changes (switching to bottom-up upon reaching the bottom of the address space) or when consecutively returned addresses are non-contiguous (bumping into another region, e.g. the binary or the stack):

tukan@farm:/ptmalloc/madness$ ./exhaust
First mapping: 0x7fe81bbb8000
Non-contiguous mmap: 0x5578825b1000 after 0x557887bb8000
Non-contiguous mmap: 0x7fe81e1a9000 after 0x5b1000
Direction changed to upwards: 0x7fe81e1a9000 after 0x5b1000
Non-contiguous mmap: 0x7ffe5255d000 after 0x7ffe501a9000
Last address returned: 0x7ffffc55d000

The most important questions going forward are:

- what interesting targets can we unmap, and
- how can we get something useful mapped in place of the unmapped region?

Let’s start with the second.


Filling the hole

A few ways come to mind immediately:

- loading a shared library with dlopen: the library and its dependencies get mmapped,
- starting a thread: NPTL mmaps the stack of the new thread,
- making an allocation large enough for ptmalloc to serve it via mmap.

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <dlfcn.h>
#include <sys/mman.h>

void *dlo(const char *name) {
    void *handle = dlopen(name, RTLD_NOW);
    if (!handle) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        exit(1);
    }
    return handle;
}

int main(int argc, const char *argv[]) {
    const size_t size = 8*1024*1024;
    void *mm = 0;
    void *handles[argc];

    mm = mmap(0, size, PROT_READ|PROT_WRITE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);

    printf("mapped region: %p-%p\n", mm, (char*)mm+size);
    printf("emulating the arbitrary unmap primitive by unmapping mm at %p\n", mm);
    munmap(mm, size);

    printf("loading %d dynamic libraries\n", argc-1);
    for (int i = 1; i < argc; i++) {
        handles[i-1] = dlo(argv[i]);
        // peek into the "opaque" return value of dlopen to get the loaded lib base
        // hackish and non-portable for sure
        printf("\tloaded %s at %p\n", argv[i], (void*)*(uintptr_t *)handles[i-1]);
    }
    return 0;
}

tukan@farm:/ptmalloc/madness$ ./dlopen /usr/lib/man-db/ /usr/lib/x86_64-linux-gnu/
mapped region: 0x7f44b5050000-0x7f44b5850000
emulating the arbitrary unmap primitive by unmapping mm at 0x7f44b5050000
loading 2 dynamic libraries
	loaded /usr/lib/man-db/ at 0x7f44b562e000
	loaded /usr/lib/x86_64-linux-gnu/ at 0x7f44b4f97000
#include <stdio.h>
#include <stdint.h>
#include <pthread.h>
#include <sys/mman.h>

pthread_t thrd;

// simply return an address from the thread stack
void *thrd_f(void *p) {
    (void)p;
    size_t local = 1;
    // deliberately leak an address from this thread's stack
    return (void *)(uintptr_t)&local;
}

int main() {
    size_t size = 8*1024*1024;
    void *mm = NULL;
    void *thrd_local = NULL;

    mm = mmap(0, size, PROT_READ|PROT_WRITE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);

    printf("mapped region: %p-%p\n", mm, (char*)mm+size);
    printf("unmapping region then starting thread\n");
    munmap(mm, size);

    pthread_create(&thrd, NULL, thrd_f, 0);
    pthread_join(thrd, &thrd_local);
    printf("local variable at %p, %s\n", thrd_local,
            ((char*)thrd_local < (char*)mm + size &&
             (char*)thrd_local >= (char*)mm) ?
            "inside target region" : "outside target region");
    return 0;
}

tukan@farm:/ptmalloc/madness$ ./nptl_stack
mapped region: 0x7f36afad9000-0x7f36b02d9000
unmapping region then starting thread
local variable at 0x7f36b02d7f40, inside target region
tukan@farm:/ptmalloc/madness$ strace -e mmap ./nptl_stack
mmap(NULL, 8388608, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f3bb5e5a000
mapped region: 0x7f3bb5e5a000-0x7f3bb665a000
unmapping region then starting thread
mmap(NULL, 8392704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7f3bb5e59000
local variable at 0x7f3bb6658f40, inside target region
+++ exited with 0 +++

Unmapping targets

As for unmapping targets, I’ve looked at the following:

Note that some items appear on both lists. Mixing and matching from the two lists hints at some fun but likely useless possibilities.

A special case: the main thread’s stack

The Linux page fault handler has a special case for expanding the stack until its size reaches RLIMIT_STACK. Basically, any time a page fault happens at an address that’s below the beginning of the stack but within RLIMIT_STACK, the stack is extended down to the page containing the address. This makes it possible to unmap part of the stack and have the kernel magically remap it with a zero page upon the next access. After some experimentation and kernel source reading, it seems that every page of the stack, except the topmost, is fair game. My guess is that this is caused by the way VM areas are split by munmap but then again, I'm out of my depth here.

The main_stack.c sample program demonstrates this behavior. It causes free to unmap the page containing the current stack frames, eventually leading to the ret instruction of munmap accessing the unmapped page, the kernel expanding the stack and the function returning to 0:

tukan@farm:/ptmalloc/madness$ ./main_stack
p: 0x55df0ec86020, target: 0x7ffe93e9c1c8
p->prev_size: 0xffffd5e07adea020, p->size: 0x2a1f85216fe2
Segmentation fault (core dumped)
tukan@farm:/ptmalloc/madness$ dmesg | tail -1
[106641.971062] main_stack[17695]: segfault at 0 ip           (null) sp 00007ffe93e9c150 error 14 in main_stack[55df0ce6a000+1000]

Of course, this specific avenue of exploitation seems useless for multiple reasons, including stack cookie checks and the inability to map the null page in any way; it just serves as an example. While I couldn’t come up with a generic way to leverage this behavior, it may open up some application-specific possibilities.

Crashes due to nullptr dereferences would likely present significant challenges, though.


Conclusion

Well, this turned out to be way longer than I intended and many details are still missing. I had originally dismissed this primitive as useless, in spite of coming across it multiple times while reading the ptmalloc code. After spending some time digging deeper, I won’t say that it’s broadly applicable, but it’s definitely not useless.

Comments of any nature are welcome, hit me up on freenode or twitter.