zero_to_hero
Last updated
In one sentence, this exploit is tcache poisoning, triggered by a poison null byte, to gain an arbitrary write.
First, we will go over the 5 types of bins that the heap manager uses. This information is based on azeria-labs.com (Archive).
Internally, the heap manager needs to keep track of freed chunks so that malloc can reuse them during allocation requests. A naive approach would be to store all the freed chunks in one long list. While this would work, it would make malloc slow. Since malloc is a high-utilization component of most programs, this slowness would have a huge impact on the overall performance of programs running on the system. To improve performance, the heap manager instead maintains a series of lists called "bins", which are designed to maximize the speed of allocations and frees.
There are 5 types of bins: 62 small bins, 63 large bins, 1 unsorted bin, 10 fast bins, and 64 tcache bins per thread.
The small, large, and unsorted bins implement the basic recycling strategy of the heap. The fast bins and tcache bins are optimizations that layer on top of these.
Confusingly, the small, large, and unsorted bins all live together in the same array in the heap manager’s source code. Index 0 is unused, 1 is the unsorted bin, bins 2-64 are small bins and bins 65-127 are large bins.
Each small bin stores chunks that are all the same fixed size. Every chunk less than 512 bytes on 32-bit systems (or 1024 bytes on 64-bit systems) has a corresponding small bin.
For chunks over 512 bytes (1024 bytes on 64-bit), the heap manager instead uses "large bins". Each of the 63 "large bins" operates mostly the same way as small bins, but instead of storing chunks with a fixed size, they instead store chunks within a size range. Given a chunk’s size, there is exactly one small bin or large bin that this size corresponds to.
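As a rough sketch of this size-to-bin mapping (a toy model: the sizes are the 64-bit thresholds from the text, but the index arithmetic is illustrative and not glibc's actual smallbin_index/largebin_index math):

```python
# Simplified model of small vs. large bin selection (64-bit thresholds).
# Real glibc computes indices with the smallbin_index()/largebin_index()
# macros; the exact arithmetic here is illustrative only.

SMALLBIN_WIDTH = 16      # small bins hold one fixed size, spaced 16 bytes apart
SMALLBIN_LIMIT = 1024    # chunks below this use small bins on 64-bit systems

def bin_for(chunk_size):
    """Return ('small', index) for a fixed-size bin, or ('large', range_base)."""
    if chunk_size < SMALLBIN_LIMIT:
        # exactly one small bin per fixed chunk size
        return ('small', chunk_size // SMALLBIN_WIDTH)
    # large bins cover ranges of sizes; here every 64-byte range shares a bin
    return ('large', (chunk_size // 64) * 64)

print(bin_for(0x30))    # a small bin holding only 48-byte chunks
print(bin_for(0x500))   # a large bin covering a range of sizes
```

This captures the key asymmetry: a small-bin lookup is a direct index, while a large bin groups a whole range, which is why large bins need sorted insertion and traversal.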
Because large bins store a range of sizes, insertions onto the bin have to be manually sorted, and allocations from the list require traversing the list. This makes large bins inherently slower than their small bin equivalents.
Instead of immediately putting newly freed chunks onto the correct bin, the heap manager coalesces it with neighbors, and dumps it onto a general unsorted linked list. During malloc, each item on the unsorted bin is checked to see if it “fits” the request. If it does, malloc can use it immediately. If it does not, malloc then puts the chunk into its corresponding small or large bin.
These bins essentially keep recently released small chunks on a “fast-turnaround queue”, intentionally keeping the chunk live and not merging the chunk with its neighbors (heap manager doesn’t set the “P” bit at the start of the next chunk) so that it can be immediately repurposed if a malloc request for that chunk size comes in very soon after the chunk is freed.
Like small bins, each fast bin is responsible only for a single fixed chunk size. There are 10 such fast bins, covering chunks of size 16, 24, 32, 40, 48, 56, 64, 72, 80, and 88 bytes plus chunk metadata.
Since fast-binned chunks are never merge candidates, they can also be stored in singly-linked lists, rather than needing to be on doubly linked lists so that they can be removed from a list if the chunk gets merged.
Per-thread caching speeds up allocations by having per-thread bins of small chunks ready to go. That way, when a thread requests a chunk, if the thread has a chunk available in its tcache, it can service the allocation without ever needing to wait on the heap lock (only one thread can access the heap at a given time). By default, each thread has 64 singly-linked tcache bins. Each bin contains a maximum of 7 same-size chunks, ranging from 24 to 1032 bytes on 64-bit systems and 12 to 516 bytes on 32-bit systems.
When a chunk is freed, the heap manager sees if the chunk will fit into a tcache bin corresponding to the chunk size. Like the fast-bin, chunks on the tcache bin are considered “in use”, and won’t be merged with neighboring freed chunks.
If the tcache for that chunk size is full (or the chunk is too big for a tcache bin), the heap manager reverts to our old slow-path strategy of obtaining the heap lock and then processing the chunk as before.
Corresponding tcache allocations are also pretty straightforward. Given a request for a chunk, if one is available in the appropriate tcache bin, the heap manager returns that chunk without ever obtaining the heap lock. If the chunk is too big for the tcache, we also continue as before.
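The tcache behaviour described above can be modelled with a short Python sketch (a toy model, not the real glibc code; the 64-bin and 7-chunk limits are the defaults mentioned in the text):

```python
# Toy model of the per-thread tcache: up to 64 singly linked bins,
# at most 7 same-size chunks per bin, LIFO order, consulted before
# the heap lock is ever taken.

TCACHE_BINS = 64
TCACHE_MAX_COUNT = 7

class Tcache:
    def __init__(self):
        self.bins = {}  # chunk size -> list of chunk addresses (head at end)

    def put(self, addr, size):
        """Called on free(): stash the chunk if its bin has room."""
        bin_ = self.bins.setdefault(size, [])
        if len(bin_) >= TCACHE_MAX_COUNT:
            return False          # bin full: fall back to the locked slow path
        bin_.append(addr)
        return True

    def get(self, size):
        """Called on malloc(): pop the most recently freed chunk, no lock needed."""
        bin_ = self.bins.get(size)
        return bin_.pop() if bin_ else None

t = Tcache()
t.put(0x1000, 0x30)
t.put(0x1040, 0x30)
print(hex(t.get(0x30)))   # LIFO: the chunk freed last comes back first
```

Note that `put` deliberately does no merging and no sanity checks beyond the count limit, which mirrors why tcache is both fast and (as exploited below) fragile.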
You should understand the basics of the heap before proceeding. I recommend reading the "Heap overflow" writeup and the "Heap Basics" section of this post from devel0pment.de (Archive).
For an overview and step-by-step walkthrough of how malloc and free operate in the glibc heap implementation, visit this post from azeria-labs.com (Archive).
Now you're really cooking. Can you pwn this service? Connect with nc 2019shell1.picoctf.com 49928. libc.so.6 ld-2.29.so
For this stage, we need to bypass libc 2.29's tcache double free protection using a poison null byte in order to perform a double free attack.
Steps:
Reverse the binary file using Ghidra (cheat sheet). main() function:
do_remove() function:
do_add() function:
win() function:
This is a hidden win() function that is referenced nowhere else in the program. It simply prints out the flag from the file flag.txt. This is the function we want to call.
Ghidra also tells us the address of win() to be 0x00400a02:
With the mallocs everywhere, this is a heap exploitation problem, so it's time to look for heap exploits.
Bug 1: Pointers not cleared on deletion (do_remove() function). The pointers stay even after a chunk gets freed, which means we can, in theory, free a chunk twice to trigger a double free. However, simply attempting to free power 0, free power 1, and free power 0 again will fail because we are using libc 2.29. Trying this will display the following error message:
Bug 2: Off-by-one error. When reading in the text for the superpower, the program lets us write tons of characters and then appends a null byte to the end of whatever we have written. However, if we asked for 40 bytes to write in, then we get to write 40 characters, and the null byte will be placed outside of the current chunk.
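A tiny Python illustration of the off-by-one (the 40-byte buffer and the neighbouring byte are made up for the demo):

```python
# Illustration of the off-by-one: the program copies our n bytes and
# then appends '\0' at index n -- one byte past the end of our buffer.

heap = bytearray(48)          # a 40-byte buffer followed by adjacent memory
heap[40] = 0x11               # pretend this byte belongs to the next chunk

data = b"A" * 40              # we asked for 40 bytes and wrote all 40
heap[0:len(data)] = data
heap[len(data)] = 0           # the appended NUL lands one byte past the buffer

print(heap[40])               # the neighbouring chunk's byte is now 0x00
```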
Using bug 2 we can bypass libc 2.29's tcache double free protection and then abuse bug 1 to trigger a double free.
Important notes about tcache:
First, there are no checks when tcache returns a pointer to malloc. If we can corrupt the tcache, then when malloc asks for a chunk of some size, malloc will simply let us write with whatever corrupted pointers are stored in tcache.
Second, the only protection that tcache has against double-free is that it makes sure the current chunk being freed is different from all chunks freed earlier of the same size.
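That check can be sketched in Python (a simplification: real glibc 2.29 first compares a per-chunk "key" field and only walks the bin when the key matches, but the effect is the same):

```python
# Sketch of libc 2.29's tcache double-free check: on free(), the bin
# for the chunk's size is scanned for the same pointer. A repeat free
# into a *different* size's bin is not caught.

def tcache_free(bins, addr, size):
    bin_ = bins.setdefault(size, [])
    if addr in bin_:                      # same pointer already in this bin?
        raise RuntimeError("free(): double free detected in tcache 2")
    bin_.append(addr)

bins = {}
tcache_free(bins, 0x1000, 0x110)
try:
    tcache_free(bins, 0x1000, 0x110)      # straight double free: caught
except RuntimeError as e:
    print(e)
tcache_free(bins, 0x1000, 0x100)          # same pointer, different-size bin: allowed
```

The last line is the whole attack in miniature: if we can make the chunk report a different size on the second free, the check never fires.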
By using bug 2 to place a null character into the next chunk, we can effectively change the size of the next chunk.
As we can see, the lowest byte of the size+flags field of chunk2 will be overwritten when we overflow chunk1 with a single null byte. In this case the value 0x31, which represents the size of chunk2 (0x30) and the enabled PREV_INUSE flag (0x1), would be set to 0x00. This will likely crash the program, because the heap metadata is no longer valid (a chunk cannot have a size of 0x00). But let's consider what happens if chunk2 has a size of 0x100:
When we overflow chunk1 now, the size of chunk2 (0x100) is not altered, since it is stored only within the second-lowest byte. The only thing altered is the PREV_INUSE flag, which is flipped from 0x1 to 0x0. This indicates that the previous block is unallocated. However, this is not what we want, because blocks in tcache lists are not consolidated, so changing the PREV_INUSE flag has no impact.
Instead, let's create a first chunk (chunk1) and then a second chunk (chunk2) below it with a size of 0x110 (or 0x111 with the PREV_INUSE bit set). If we free chunk2 now, tcache will see that it has size 0x110 (ignoring the PREV_INUSE bit) and store a pointer to chunk2 in the tcache linked list of freed chunks of size 0x110.
Now, suppose we write into chunk1 and completely fill it. The program will append the null byte, as discussed in "Bug 2", which overflows into the next chunk. Before writing the null byte, the heap looks like this:
And now the program writes in the null byte, so the heap looks like this:
We've overwritten the size metadata for chunk2. So if we free chunk2 again, tcache will store the second chunk inside of the linked list of freed chunks of size 0x100 because that's what the chunk says its size is. We have now successfully bypassed the tcache check to obtain a double-free, once for freeing into the 0x110 linked list, and a second time for freeing into the 0x100 linked list.
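The size-byte overwrite can be demonstrated on a mock chunk header (layout assumed: two 8-byte little-endian fields, prev_size then size+flags, as on 64-bit glibc):

```python
import struct

# A mock 64-bit malloc chunk header: prev_size and size+flags, little endian.
# chunk2 has size 0x110 with PREV_INUSE set, so its size field holds 0x111.
header = struct.pack("<QQ", 0, 0x111)

# The stray NUL from bug 2 lands on the lowest byte of the size field
# (offset 8 in this layout), turning 0x111 into 0x100.
poisoned = header[:8] + b"\x00" + header[9:]

size = struct.unpack_from("<Q", poisoned, 8)[0]
print(hex(size))    # 0x100: tcache now files this chunk under a different bin
```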
For this stage, we will perform the following steps:
Use the vulnerability to insert an address of our choice as a free chunk into a bin/fastbin.
Allocate a chunk which fits the size of this bin/fastbin. The allocation will return the inserted address (+ 0x10).
Write data of our choice into the allocated chunk. This will end up at the inserted address (+ 0x10).
We simply want to overwrite some function pointer and do not need to write a huge amount of data, so the size of a fastbin will suffice. Also, it is important to realize that fastbins only use the forward pointer.
Steps:
Let’s begin by determining how we can insert an address to a fastbin. A fastbin is a singly linked list, whose head pointer is stored in the main arena. This head pointer references the first free chunk. The address of the second free chunk is stored in the FD of the first free chunk and so forth. If we manage to overwrite the FD of a free chunk, we effectively add a free fakechunk to the fastbin:
On the next allocation the free chunk at the head of the fastbin is returned:
The fakechunk is now the new head of the fastbin and is thus returned on the following allocation:
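A minimal model of this FD overwrite (addresses are made up; the free list is a plain dict rather than real chunk memory):

```python
# Toy singly linked free list (fastbin/tcache style): the head points to
# the first free chunk, and each chunk's fd field points to the next.
# Overwriting a free chunk's fd splices a fake chunk into the list.

fd = {}                      # chunk address -> forward pointer ("fd" field)
head = 0x1000                # head of the free list, kept in the arena
fd[0x1000] = 0               # one free chunk, end of list

FAKE = 0xdeadbeef            # hypothetical target, e.g. near __free_hook
fd[0x1000] = FAKE            # attacker overwrites the free chunk's fd

def alloc():
    """Pop the head of the list and follow its fd, like a fastbin malloc."""
    global head
    chunk, head = head, fd.get(head, 0)
    return chunk

print(hex(alloc()))          # first allocation: the real chunk (0x1000)
print(hex(alloc()))          # second allocation: our fake chunk
```

The second allocation is the payoff: malloc hands back a "chunk" at an address we chose, with no validation of where it actually lives.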
Before we overwrite an area of memory, we need to figure out which area to overwrite. Since the program intentionally leaks a libc address, we probably have to overwrite something there. The pointer __free_hook, stored in a writable area of libc, simply redirects the actions of free() to whatever function __free_hook happens to point to, so rewriting __free_hook will make free() call an arbitrary function. Let's overwrite __free_hook with the win() function mentioned at the beginning.
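A toy model of what overwriting __free_hook buys us (win() here is a stand-in Python function, not the binary's, and the hook dispatch is a simplification of glibc's):

```python
# Toy model of __free_hook: if the hook is set, free() calls it instead
# of doing its normal work. Overwriting the hook with win()'s address
# therefore redirects every subsequent free() to win().

hooks = {"__free_hook": None}

def win():
    return "flag{...}"        # stands in for the binary's win() at 0x400a02

def libc_free(ptr):
    if hooks["__free_hook"] is not None:
        return hooks["__free_hook"]()    # hijacked control flow
    # ... normal chunk recycling would happen here ...

hooks["__free_hook"] = win    # what our arbitrary write achieves
print(libc_free(0x1000))      # any free() now runs win()
```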
We can use the double free from "Stage 1" to overwrite the FD of an unallocated fastbin chunk. We malloc a chunk of size 0x100 and overwrite the forward address to __free_hook. This removes chunk B from the 0x100 list, so we cannot free it again from that list to access our overwritten forward pointer.
However, since we double-freed block B, it still exists in the 0x110 list. We create a chunk of the same size as chunk B, thus returning to us chunk B from tcache. Now the head pointer (which always points to the first block) of the 0x110 list points to the forward address of block B, which we overwrote to __free_hook. The block in the 0x100 list refers to the same memory location as the block in the 0x110 list. More info about the fastbin double free vulnerability at heap-exploitation.dhavalkapil.com (Archive)
Now that the next block points to our overwritten memory location, we malloc it. malloc sees our request for a block of memory with size 0x110 and returns the pointer to the next block, which we overwrote to __free_hook. malloc then asks what value we want to store in this "block of memory" (which is actually __free_hook). We tell it to store the address of win(). We have successfully overwritten an address of our choosing with an arbitrary value. malloc believes we simply asked for a block of memory, it gave us one, and we stored our data in it. You can think of this as malloc being oblivious to our attack.
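Putting both stages together, here is an end-to-end toy simulation of the attack (pure Python, not the actual exploit script; the addresses and the bin model are simplified stand-ins):

```python
# End-to-end toy model of the attack: the null-byte size overwrite
# bypasses the tcache double-free check, then tcache poisoning makes
# malloc hand us __free_hook to write into.

FREE_HOOK = 0x7f0000  # hypothetical __free_hook address from the libc leak
WIN = 0x400a02        # address of win() from Ghidra

bins = {}             # size -> list of free chunk addresses (head at end)
fd = {}               # chunk address -> forward pointer

def free(addr, size):
    bin_ = bins.setdefault(size, [])
    assert addr not in bin_, "free(): double free detected in tcache 2"
    fd[addr] = bin_[-1] if bin_ else 0
    bin_.append(addr)

def malloc(size):
    bin_ = bins.get(size)
    if bin_:
        addr = bin_.pop()
        nxt = fd.get(addr, 0)
        if nxt:
            bin_.append(nxt)   # the popped chunk's fd becomes the new head
        return addr
    return 0x2000              # otherwise: a fresh chunk from the heap

B = 0x1110
free(B, 0x110)        # chunk B goes into the 0x110 bin
# ...the null byte flips B's size field from 0x111 to 0x100...
free(B, 0x100)        # same pointer, different bin: the check is bypassed

chunk = malloc(0x100) # tcache hands B straight back
fd[chunk] = FREE_HOOK # we write FREE_HOOK into B's fd field

malloc(0x110)         # returns B again; FREE_HOOK becomes the 0x110 head
target = malloc(0x110)
print(hex(target))    # writing WIN here is the arbitrary write
```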
Run the script with python2 script.py and get the flag:
picoCTF{i_th0ught_2.29_f1x3d_d0ubl3_fr33?_jmvqjcus}