
By: 0xACAB on Jan 1st, 2013
SELF-PROPAGATING [HEAP MEMORY] CRAWLER in x86-64 Linux Assembly

Over the past couple of days I've been playing with x86-64 Linux assembly. After the obligatory "Hello World!" program and some C compilation and disassembly, I wanted to do something a bit more challenging and fun. In the spirit of "you don't own a system until you control it", and with possible future usability in mind, I decided on a little program that copies itself to (heap) memory, directs execution there, and then loops from there: it writes itself below itself at higher addresses, directs execution to the new copy, and so on.

I found lots of different things on the internet, but not many x86-64 examples, and even fewer that weren't either really simple stuff or too massive for a newbie to follow. So maybe this helps someone else looking for a more hacky exercise. ;)

*** DISCLAIMER ***
This is just a little training/learning exercise, a proof of concept to see what I could do and whether I could abuse/hack the heap this way, which I thought would be of interest to others. You should only run this in the debugger. If you just let this run, especially if you remove the debug interrupt (INT 3), you probably shouldn't be doing this exercise… I also realize this code is probably neither smart nor optimal to anyone who really knows assembly. I sometimes had to research and try out different instructions, and I am happy enough that it works! Finally, if you learn assembly in a proper school, your teacher would probably consider this an example of what NOT to do.


WHAT YOU NEED

Everything you need should already be on your Linux x86-64 system (especially if you write code):

- nasm
- ld
- objdump
- gdb

OK. So, I didn't want to use the stack; instead I wanted a memory section that is read/write, and then to copy to and run from there. That means I need a variable in the .bss section. (Strictly speaking, .bss is static data that normally sits just below where the heap begins, but I'll keep calling this region "the heap".) Mostly to help me find my way through memory, I first write the contents of a string there that is defined in the .data section (which is writable too; it is .text that is read-only). In hindsight, that's probably not even necessary in the finished code. The address of the .bss variable is where we will start copying.
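
As a quick check of which sections are writable, here is a minimal sketch that is not part of the crawler. The label `flag` and the file name are made up for illustration. It stores a byte into .data, reads it back, and returns it as the exit status, which would not work if .data were read-only.

```nasm
; writable_data.asm - hypothetical file name, illustration only
section .data
    flag    db 0                    ; hypothetical one-byte variable in .data

section .text
    global _start

_start:
    mov     byte [rel flag], 42     ; store into .data (the section is writable)
    movzx   edi, byte [rel flag]    ; read the byte back as the exit status
    mov     eax, 60                 ; sys_exit
    syscall
```

Assembled and linked the same way as the crawler (nasm -f elf64, then ld), this should exit with status 42 (check with `echo $?`). Aiming the same store at a label in .text would normally end in a segmentation fault instead.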

The registers are used as follows:

- RBX: "instruction copy pointer", i.e. keeps track of where to copy from
- RDI: "heap instruction pointer", i.e. keeps track of where the copied instructions start
- RDX: "heap copy pointer", i.e. keeps track of where the next byte needs to be copied to
- AL: holds each byte while it is being copied

RCX is used in two different ways (just a feature, no particular reason): first in _start to copy the string into the heap, then within the loop to count down the number of bytes left to copy.
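
To see these register roles in a harmless setting, here is a small sketch that uses the same byte-by-byte loop to copy a string instead of code, then prints the copy. The `buffer` and `STR_LEN` names are hypothetical, the string drops the trailing 0 byte so that only printable bytes are written, and `jnz` stands in for the cmp/jg pair because DEC already sets the zero flag.

```nasm
; copy_roles.asm - hypothetical file name, illustration only
section .data
    string1 db 'ACABACAB',10         ; same marker string, without the trailing 0
    STR_LEN equ $ - string1          ; hypothetical name: 9 bytes

section .bss
    buffer  resb 16                  ; hypothetical destination buffer

section .text
    global _start

_start:
    lea     rbx, [rel string1]       ; RBX: where to copy from
    lea     rdx, [rel buffer]        ; RDX: where the next byte goes
    mov     rdi, rdx                 ; RDI: remembers where the copy starts
    mov     ecx, STR_LEN             ; RCX: bytes left to copy
.copy:
    mov     al, [rbx]                ; AL carries one byte at a time
    mov     [rdx], al
    inc     rbx
    inc     rdx
    dec     rcx
    jnz     .copy                    ; DEC sets ZF, so no separate cmp is needed

    mov     rsi, rdi                 ; write(1, start of copy, STR_LEN)
    mov     edi, 1
    mov     edx, STR_LEN
    mov     eax, 1                   ; sys_write
    syscall

    xor     edi, edi
    mov     eax, 60                  ; sys_exit
    syscall
```

When run, it prints ACABACAB followed by a newline. The crawler does exactly the same thing, except that the source bytes are its own instructions.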

After taking the address of the current instruction plus 8 ($+8 in NASM; the lea itself is 8 bytes long, so this is the address of mov rdi, rdx, where we want to start copying), we copy the bytes one by one into the heap until the loop counter (RCX) reaches 0. Right before the jump, the instruction copy pointer (RBX) is set to the start of the new code, and the code then jumps to the first copied instruction on the heap (RDI).
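
The hard-coded 0x2e depends on how the assembler encodes each instruction. As far as I know, recent NASM releases optimize operand sizes by default and may pick shorter encodings (for cmp rcx,0x0, for example), in which case the count would no longer match. A possible way around this, sketched below, is to let the assembler measure the block with labels. The names `copy_start`, `copy_end` and `COPY_LEN` are hypothetical, and the block is only measured here, never executed.

```nasm
; copy_len.asm - hypothetical file name, illustration only
section .text
    global _start

_start:
    mov     edi, COPY_LEN           ; exit status = size of the block in bytes
    mov     eax, 60                 ; sys_exit
    syscall

; Everything below is never executed in this sketch; it is only measured.
copy_start:                         ; hypothetical label: first byte to copy
    mov     rdi, rdx
    mov     rcx, COPY_LEN           ; counter derived from the labels, not hard-coded
looper:
    mov     al, [rbx]
    mov     [rdx], al
    inc     rbx
    inc     rdx
    dec     rcx
    cmp     rcx, 0x0
    jg      looper
    mov     rbx, rdi
    int     3
    jmp     rdi
    nop
    nop
    nop
    nop
copy_end:                           ; hypothetical label: one past the last byte to copy

COPY_LEN equ copy_end - copy_start
```

The exit status (`echo $?`) is the number of bytes between the two labels. It should come out as 46 when NASM uses the long encodings seen in the listing (for example with -O0), and it will likely be smaller with an optimizing NASM.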

Here is the .asm code:

****************************

section .data
    string1 db  'ACABACAB',10,0

section .bss
        memry_block resq 1

section .text
    global _start

_start:

        int 3
        int 3                           ; pause for debugger
        lea rdi, [rel string1]          ; load address of our 'ACABACAB' string into RDI
        lea rdx, [rel memry_block]      ; load address of our heap variable into RDX
        mov rcx, [rdi]                  ; load the 8 bytes 'ACABACAB' into RCX
        mov [rdx], rcx                  ; store them in the heap
                                        ; the string is just a marker - the code will overwrite it... ;)

        xor rcx,rcx                     ; blank out RCX


_writeheap:
        lea rbx, [$+8]                  ; RBX = absolute address of the next instruction (this lea is 8 bytes)
        mov rdi, rdx                    ; copy address of heap into heap instruction pointer
        mov rcx,0x2e                    ; countdown: 46 bytes (this block through the last nop, as encoded in the objdump below)
_looper:
        mov al, [rbx]                   ; read one byte of our own code into AL
        mov [rdx], al                   ; write that byte to the heap
        inc rbx                         ; advance the source pointer
        inc rdx                         ; advance the heap pointer
        dec rcx                         ; decrease counter by 1
        cmp rcx,0x0                     ; check whether counter is 0 yet
        jg _looper                      ; bytes left: jump back to copier loop
        mov rbx, rdi                    ; the next round copies from the copy we just wrote
        int 3                           ; debug halt
        jmp rdi                         ; jump to the newly written code on the heap
        nop
        nop
        nop
        nop

        int 3
        ; exit from the application here (never reached)
        xor     rdi,rdi
        push    0x3c
        pop     rax
        syscall


***********************************

The exit syscall is never reached. The NOPs are just padding; they are copied along with the rest but never executed. There is an INT 3 right before the jump to the copied code, so the debugger halts and you can inspect registers, inspect the bytes written, and print a list of instructions. There are also a couple at the start, so you can do a 'disas' in gdb and see the initial state. Obviously, don't remove the INT 3 before the jump...

I had the hardest time figuring out how to do a "lea rbx,[$]" from the heap to get the instruction pointer once I was there, because it kept referring back to the original read-only code in the .text section, where we start before the loop. (Without 'rel', NASM encodes $ as an absolute address fixed at link time, which is what the objdump shows: lea 0x4000d3,%rbx.) The rest of the code still worked: execution went to the heap, copied the code into the next block, redirected execution there, and so on, but it would copy the _original_ code again and again, not the last copy.

I then realized that I didn't actually have to repeat the 'lea' instruction at all, as long as I set RBX to RDI after the copy loop ends but before the jump. That is why there is a +8 byte offset (lea rbx, [$+8]): the copying starts at mov rdi, rdx, so the 'lea' runs only once, in .text. Now execution literally jumps from the .text segment to the top of the heap, copies itself below itself, jumps there, again and again, and therefore crawls all the way to the end of memory or until some sort of memory fault is encountered. (I never got one, as I quit out of gdb before that.) That's pretty cool… Well, at least I think so. :)
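
For comparison, here is a sketch of the difference between the absolute form and a RIP-relative lea. It is not what the crawler does. It copies a tiny routine into a freshly mapped page and runs the copy there, using mmap with PROT_EXEC so that it does not rely on .bss being executable, which may not hold on every system. The names `probe`, `probe_end` and `PROBE_LEN` are hypothetical, and a plain non-PIE link with ld is assumed.

```nasm
; rel_vs_abs.asm - hypothetical file name, illustration only
section .text
    global _start

probe:                              ; hypothetical routine that gets copied
    lea     rax, [probe]            ; absolute: address fixed at link time
    lea     rdx, [rel probe]        ; RIP-relative: computed from where the code runs
    ret
probe_end:

PROBE_LEN equ probe_end - probe

_start:
    ; mmap(NULL, 4096, PROT_READ|PROT_WRITE|PROT_EXEC,
    ;      MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
    xor     edi, edi
    mov     esi, 4096
    mov     edx, 7
    mov     r10d, 0x22
    mov     r8, -1
    xor     r9d, r9d
    mov     eax, 9                  ; sys_mmap
    syscall
    test    rax, rax
    js      .fail                   ; negative return value: mmap failed
    mov     r12, rax                ; r12 = start of the new page

    lea     rsi, [rel probe]        ; copy the routine into the page
    mov     rdi, r12
    mov     ecx, PROBE_LEN
    rep movsb

    call    r12                     ; run the copy

    lea     rcx, [rel probe]
    cmp     rax, rcx                ; absolute lea still names the original in .text
    jne     .fail
    cmp     rdx, r12                ; RIP-relative lea names the copy
    jne     .fail
    xor     edi, edi                ; exit status 0: both checks held
    jmp     .exit
.fail:
    mov     edi, 1
.exit:
    mov     eax, 60                 ; sys_exit
    syscall
```

An exit status of 0 means the copy's absolute lea still produced the .text address while the RIP-relative one produced the address of the copy. A status of 1 means a check failed or the executable mapping was refused, which some hardened systems may do.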


You really have to run this to see what it does. Run the following commands (assuming you saved the above as crawler.asm). objdump gives us the opcodes and instructions (AT&T):

-----

root@:/mydev/assembly# nasm -f elf64 -o crawler.o crawler.asm
root@:/mydev/assembly# ld -o crawler crawler.o
root@:/mydev/assembly# objdump -D crawler

crawler:     file format elf64-x86-64


Disassembly of section .text:

00000000004000b0 <_start>:
  4000b0:       cd 03                   int    $0x3
  4000b2:       cd 03                   int    $0x3
  4000b4:       48 8d 3d 55 00 20 00    lea    0x200055(%rip),%rdi        # 600110 <string1>
  4000bb:       48 8d 15 5a 00 20 00    lea    0x20005a(%rip),%rdx        # 60011c <memry_block>
  4000c2:       48 8b 0f                mov    (%rdi),%rcx
  4000c5:       48 89 0a                mov    %rcx,(%rdx)
  4000c8:       48 31 c9                xor    %rcx,%rcx

00000000004000cb <_writeheap>:
  4000cb:       48 8d 1c 25 d3 00 40    lea    0x4000d3,%rbx
  4000d2:       00
  4000d3:       48 89 d7                mov    %rdx,%rdi
  4000d6:       48 b9 2e 00 00 00 00    mov    $0x2e,%rcx
  4000dd:       00 00 00

00000000004000e0 <_looper>:
  4000e0:       8a 03                   mov    (%rbx),%al
  4000e2:       88 02                   mov    %al,(%rdx)
  4000e4:       48 ff c3                inc    %rbx
  4000e7:       48 ff c2                inc    %rdx
  4000ea:       48 ff c9                dec    %rcx
  4000ed:       48 81 f9 00 00 00 00    cmp    $0x0,%rcx
  4000f4:       7f ea                   jg     4000e0 <_looper>
  4000f6:       48 89 fb                mov    %rdi,%rbx
  4000f9:       cd 03                   int    $0x3
  4000fb:       ff e7                   jmpq   *%rdi
  4000fd:       90                      nop
  4000fe:       90                      nop
  4000ff:       90                      nop
  400100:       90                      nop
  400101:       cd 03                   int    $0x3
  400103:       48 31 ff                xor    %rdi,%rdi
  400106:       68 3c 00 00 00          pushq  $0x3c
  40010b:       58                      pop    %rax
  40010c:       0f 05                   syscall

Disassembly of section .data:

0000000000600110 <string1>:
  600110:       41                      rex.B
  600111:       43                      rex.XB
  600112:       41                      rex.B
  600113:       42                      rex.X
  600114:       41                      rex.B
  600115:       43                      rex.XB
  600116:       41                      rex.B
  600117:       42 0a 00                rex.X or     (%rax),%al

Disassembly of section .bss:

000000000060011c <memry_block>:

------

Now run it in gdb. I set the disassembly-flavor to intel to match the format used in the assembly code; that's just a little easier to read. I then run the program and continue past the first INT 3 (for some reason I needed two of them for the halt to actually work at the start). If you want, you can of course check memory at the start of the program. Here, we'll continue to the INT 3 in the first round, while we're still in the original code in .text.

--------

root@:/mydev/assembly# gdb -q crawler
Reading symbols from /mydev/assembly/crawler...(no debugging symbols found)...done.
(gdb) run
Starting program: /mydev/assembly/crawler

Program received signal SIGTRAP, Trace/breakpoint trap.
0x00000000004000b4 in _start ()
(gdb) set disassembly-flavor intel
(gdb) c
Continuing.

Program received signal SIGTRAP, Trace/breakpoint trap.
0x00000000004000fb in _looper ()
(gdb) disas
Dump of assembler code for function _looper:
   0x00000000004000e0 <+0>:     mov    al,BYTE PTR [rbx]
   0x00000000004000e2 <+2>:     mov    BYTE PTR [rdx],al
   0x00000000004000e4 <+4>:     inc    rbx
   0x00000000004000e7 <+7>:     inc    rdx
   0x00000000004000ea <+10>:    dec    rcx
   0x00000000004000ed <+13>:    cmp    rcx,0x0
   0x00000000004000f4 <+20>:    jg     0x4000e0 <_looper>
   0x00000000004000f6 <+22>:    mov    rbx,rdi
   0x00000000004000f9 <+25>:    int    0x3
=> 0x00000000004000fb <+27>:    jmp    rdi
   0x00000000004000fd <+29>:    nop
   0x00000000004000fe <+30>:    nop
   0x00000000004000ff <+31>:    nop
   0x0000000000400100 <+32>:    nop
   0x0000000000400101 <+33>:    int    0x3
   0x0000000000400103 <+35>:    xor    rdi,rdi
   0x0000000000400106 <+38>:    push   0x3c
   0x000000000040010b <+43>:    pop    rax
   0x000000000040010c <+44>:    syscall
End of assembler dump.

----------

We're just about to jump to the heap (jmp rdi). All of our code should now be written there, and we can check with x/48xb $rdi (RDI should point to the start of the copied code):

---------

(gdb) x/48xb $rdi
0x60011c <memry_block>: 0x48    0x89    0xd7    0x48    0xb9    0x2e    0x00    0x00
0x600124 <memry_block+8>:       0x00    0x00    0x00    0x00    0x00    0x8a    0x03    0x88
0x60012c:       0x02    0x48    0xff    0xc3    0x48    0xff    0xc2    0x48
0x600134:       0xff    0xc9    0x48    0x81    0xf9    0x00    0x00    0x00
0x60013c:       0x00    0x7f    0xea    0x48    0x89    0xfb    0xcd    0x03
0x600144:       0xff    0xe7    0x90    0x90    0x90    0x90    0x00    0x00

----------

Those are our opcode bytes (the last two 0x00 bytes lie past the 46 we copied). Here's a fragment of the objdump for easier comparison:

---------

  4000d3:       48 89 d7                mov    %rdx,%rdi
  4000d6:       48 b9 2e 00 00 00 00    mov    $0x2e,%rcx
…..
  4000f4:       7f ea                   jg     4000e0 <_looper>
  4000f6:       48 89 fb                mov    %rdi,%rbx
  4000f9:       cd 03                   int    $0x3
  4000fb:       ff e7                   jmpq   *%rdi
  4000fd:       90                      nop
……

---------

If this is your first time running it, you'll want to step with 'nexti' (or 'ni') through the next couple of instructions to see execution jump to the heap, check the registers, see what changes, and watch the first instructions of the copy get executed. Here we'll just continue, so we're right before the second jump. (Note that gdb no longer has any idea what the function might be called ;) ):

--------

(gdb) c
Continuing.

Program received signal SIGTRAP, Trace/breakpoint trap.
0x0000000000600144 in ?? ()
(gdb) x/16i $rip-24
   0x60012c:    add    cl,BYTE PTR [rax-0x1]
   0x60012f:    ret
   0x600130:    inc    rdx
   0x600133:    dec    rcx
   0x600136:    cmp    rcx,0x0
   0x60013d:    jg     0x600129
   0x60013f:    mov    rbx,rdi
   0x600142:    int    0x3
=> 0x600144:    jmp    rdi
   0x600146:    nop
   0x600147:    nop
   0x600148:    nop
   0x600149:    nop
   0x60014a:    mov    rdi,rdx
   0x60014d:    movabs rcx,0x2e
   0x600157:    mov    al,BYTE PTR [rbx]
(gdb) i r
rax            0x90     144
rbx            0x60014a 6291786
rcx            0x0      0
rdx            0x600178 6291832
rsi            0x0      0
rdi            0x60014a 6291786
rbp            0x0      0x0
rsp            0x7fffffffe6e0   0x7fffffffe6e0
rip            0x600144 0x600144

-----------------

You should be able to see from the memory addresses that we're now executing on the heap. You can also see the new copy starting at 0x60014a. RAX (AL) still contains the last byte copied, 0x90 (NOP). The previous instruction set RBX to the start of the new copy, RCX (the loop counter) is 0, and RDX (0x600178, i.e. 0x60014a + 0x2e) points to where the next copy will be written. The jump lands on the instruction that sets RDI to RDX, so the process repeats.

If we press 'c' a couple of times, we see that we're moving down in memory, toward higher addresses (note the snip and the jump in memory addresses):

---------------
...
(gdb) c
Continuing.

Program received signal SIGTRAP, Trace/breakpoint trap.
0x00000000006002b4 in ?? ()
(gdb) x/128i 0x60011c
   0x60011c <memry_block>:      mov    rdi,rdx
   0x60011f <memry_block+3>:    movabs rcx,0x2e
   0x600129:    mov    al,BYTE PTR [rbx]
   0x60012b:    mov    BYTE PTR [rdx],al
   0x60012d:    inc    rbx
   0x600130:    inc    rdx
   0x600133:    dec    rcx
   0x600136:    cmp    rcx,0x0
   0x60013d:    jg     0x600129
   0x60013f:    mov    rbx,rdi
   0x600142:    int    0x3
   0x600144:    jmp    rdi
   0x600146:    nop
   0x600147:    nop
   0x600148:    nop
   0x600149:    nop
   0x60014a:    mov    rdi,rdx
   0x60014d:    movabs rcx,0x2e
   0x600157:    mov    al,BYTE PTR [rbx]
   0x600159:    mov    BYTE PTR [rdx],al
   0x60015b:    inc    rbx
   0x60015e:    inc    rdx
   0x600161:    dec    rcx
   0x600164:    cmp    rcx,0x0
   0x60016b:    jg     0x600157
   0x60016d:    mov    rbx,rdi
   0x600170:    int    0x3
   0x600172:    jmp    rdi
   0x600174:    nop
   0x600175:    nop
   0x600176:    nop
….
   0x60028a:    nop
   0x60028b:    nop
   0x60028c:    mov    rdi,rdx
   0x60028f:    movabs rcx,0x2e
   0x600299:    mov    al,BYTE PTR [rbx]
   0x60029b:    mov    BYTE PTR [rdx],al
   0x60029d:    inc    rbx
   0x6002a0:    inc    rdx
   0x6002a3:    dec    rcx
   0x6002a6:    cmp    rcx,0x0
   0x6002ad:    jg     0x600299
   0x6002af:    mov    rbx,rdi
   0x6002b2:    int    0x3
=> 0x6002b4:    jmp    rdi
   0x6002b6:    nop
   0x6002b7:    nop
   0x6002b8:    nop
   0x6002b9:    nop
   0x6002ba:    mov    rdi,rdx
   0x6002bd:    movabs rcx,0x2e
   0x6002c7:    mov    al,BYTE PTR [rbx]
   0x6002c9:    mov    BYTE PTR [rdx],al
   0x6002cb:    inc    rbx
   0x6002ce:    inc    rdx

--------------

By now it should be clear that if you remove the INT 3 before the jump, this will just keep copying itself all the way through memory until….??? (not tested).
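
A rough, untested estimate of where that might end: assuming 4 KiB pages and that nothing is mapped right after the page holding .bss, the first write at 0x601000 would presumably fault. The sketch below only does the arithmetic at assembly time, using the addresses from the objdump above. The constant names and the page-boundary assumption are mine, not something verified on the original system.

```nasm
; copies_estimate.asm - hypothetical file name, illustration only
BSS_START   equ 0x60011c            ; address of memry_block in the objdump
PAGE_END    equ 0x601000            ; assumption: next 4 KiB boundary, nothing mapped beyond it
COPY_LEN    equ 0x2e                ; 46 bytes per copy

COPIES      equ (PAGE_END - BSS_START) / COPY_LEN   ; whole copies that fit

section .text
    global _start

_start:
    mov     edi, COPIES             ; exit status = number of whole copies
    mov     eax, 60                 ; sys_exit
    syscall
```

The exit status is 82: there are 3812 bytes between memry_block and the assumed boundary, enough for 82 whole copies with 40 bytes to spare. Under these assumptions the 83rd copy would hit the boundary partway through, most likely ending in a SIGSEGV on the mov [rdx], al.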

I enjoyed the hell out of this, and it was a lot more fun and instructive than reproducing a program that calculates some numbers from input, so I hope this helps someone else starting out.