34C3 CTF nope writeup - diluted shellcodes
06 Jan 2018A quick writeup for nope from 34C3.
Overview
The challenge was an x64 Linux binary with a simple premise, it reads in and executes our shellcode. But there’s a twist:
- first it dilutes our code by inserting 15 bytes of
\x90
(nops) after every byte of our input. This messes up multi-byte instructions and limits what we can use to 256 instructions with fixed arguments. - then it unmaps almost everything from the address space and mprotects the rest as read-only (including the stub doing this unmapping/mprotecting)
Available opcodes
First I generated all the available opcodes with disasm
from pwntools, filtered out the bad ones, created a python dict and manually removed some that are privileged or otherwise obviously useless for us. The result can be found here.
What can immediately be seen is that there’s no way we can make a syscall with just these opcodes as there’s no syscall
, sysenter
or int 0x80
.
Execution environment
When our code starts running, we have rsp
set to an RW page and rcx
is set to the address of the region containing our code, which also happens to be the only thing executable in the address space (well, also vsyscall, but that’s not going to help here). The shellcode is prepended with the unmapping/mprotecting stub, then an exit_group syscall invocation is added at the end, followed by nops until the end of the vm region. Both of these are separated by a random number of nops from our code, and the final mprotect call of the stub makes the stub itself read-only. So we are left with our diluted input and the final exit_group syscall. This, combined with the fact that we cannot inject syscall instructions means we have to solve the challenge by hijacking the syscall instruction meant to execute exit_group and invoke an execve
with the right arguments.
Building blocks
To do that, we need to build more useful primitives from the limited instruction set we can use.
Emulating mov
There are single-byte push/pop instructions for the 8 registers extended from x86, these can be used to emulate mov.
Arithmetic
We have a limited set of arithmetic operations (see below) but they are enough to build any value in eax/al.
Setting the carry flag, then adding 0x90 to al with carry, then subtracting 0x90 without carry will add 1 to al. The same can be done in eax but this will zero the higher double word of rax.
Writing strings
There are convenient stos instructions to use that can be combined with the previous arithmetic primitive to buld arbitrary strings on the stack.
Building arbitrary 64-bit values
As mentioned previously, 32-bit arithmetic will zero out the higher dw, so creating arbitrary 64-bit values is a bit more tricky. I started working on this primitive by using 32-bit arithmetic, writing them to the stack in two part with stos DWORD PTR es:[rdi],eax
and then getting them via pop but realized that it’s not necessary for a functioning exploit.
Finding the executable syscall inst
There’s a random amount of nop padding to make finding the syscall instruction harder. We can locate it using an available scas
instruction.
The exploit
The final, commented exploit puts these building blocks together to invoke an execve syscall with ['/bin/sh\0', '-c\0', cmd + '\0']))
as the argv array.