- Falkervisor Q&A
- brownie - Windows snapshotting user-mode based hypervisor fuzzer thingy (intel vt-x)
- - Debugger for Windows for snapshotting register and memory state of an application
- - Load up the snapshot in an Intel VT-x VM.
- - Modify memory and then execute
- brownie v2 - Windows ..... fuzzer thingy (amd svm)
- falkervisor - Cross-OS os-level fuzzer
- - Code coverage
- - Memory coverage (absolute and relative)
- - Debugging/single stepping
- - Networking
- - Snapshots
- - PXE boot falkervisor in snapshot mode
- - Load up boot sector on disk and boot under falkervisor
- - Monitor hardware breakpoints
- - If 0x133713371337 is present in a debug register upon a hardware breakpoint
- take a snapshot of register state and memory, and ship over the network.
- - Once the snapshot is done, the hypervisor resumes execution
- - Feel free to take more snapshots
- - Execution
- - PXE boot falkervisor in fuzz mode
- - falkervisor will request snapshot images over the network
- - falkervisor fuzz module will also pull whatever needed
- 10:
- - fuzzer must change memory to change input
- - the vm is launched!
- - vm ends execution on one of three conditions
- - timeout
- - i/o access (disk, display, context switch)
- - fault (memory violation, div by zero, etc)
- - #UD undefined opcode
- - #PF page fault
- - page faults occur A LOT under normal conditions
- NtPageFault - page fault handler
- - path for only 'true' page faults
- - #GP general protection
- - on 64-bit systems, you get a #GP on non-canonical memory accesses
- - 4141414141414141 - non-canonical
- mov rax, 0x4141414141414141
- mov rdx, [rax] <- #gp
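A minimal sketch of the non-canonical check, assuming 48-bit virtual addresses (4-level paging). The function name `is_canonical` and the `va_bits` parameter are made up for illustration; they are not from falkervisor.

```python
def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """True if bit (va_bits - 1) and every bit above it are all equal.

    Assumption: 48-bit virtual addresses, so bits 63..47 must match.
    """
    top = (addr & 0xFFFFFFFFFFFFFFFF) >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones


# 'AAAAAAAA' read as a pointer is non-canonical, so a load through it
# raises #GP rather than #PF.
print(is_canonical(0x4141414141414141))
print(is_canonical(0x00007FFFFFFFFFFF))
```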
- - if the vm ended on a fault, the register state and input file are reported
- over the network
- - DIFFERENTIAL RESTORE!!! :)
- - Only restore pages in the VM with the dirty flag set.
- - Walk the page tables top-down, checking the flag at each level:
- [512GB pages] -> [1GB pages] -> [2MB pages] -> [4KB pages]
- ^ A (accessed)   ^ A            ^ A            ^ D (dirty)
- - 10-100x speedup
- - GOTO 10
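A toy model of the differential restore idea, not the real page-table code. It assumes each upper level carries an accessed flag and each leaf page a dirty flag; `Table`, `Page`, `guest_write` and `restore` are hypothetical names. Untouched subtrees are skipped entirely, which is where the speedup would come from.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    data: bytearray
    dirty: bool = False


@dataclass
class Table:
    entries: dict = field(default_factory=dict)  # index -> Table or Page
    accessed: bool = False


def build(levels: int, fanout: int, page_size: int = 16):
    """Build a small tree of tables with pages at the leaves."""
    if levels == 0:
        return Page(bytearray(page_size))
    return Table({i: build(levels - 1, fanout, page_size) for i in range(fanout)})


def snapshot_of(node, path=()):
    """Flatten the tree into {path: bytes} as the reference snapshot."""
    if isinstance(node, Page):
        return {path: bytes(node.data)}
    snap = {}
    for idx, child in node.entries.items():
        snap.update(snapshot_of(child, path + (idx,)))
    return snap


def guest_write(root, path, offset, value):
    """Simulate a guest store: set A on every level walked, D on the page."""
    node = root
    for idx in path:
        node.accessed = True
        node = node.entries[idx]
    node.data[offset] = value
    node.dirty = True


def restore(node, snapshot, path=()):
    """Copy back only dirty pages; return how many pages were restored."""
    if isinstance(node, Page):
        if not node.dirty:
            return 0
        node.data[:] = snapshot[path]
        node.dirty = False
        return 1
    if not node.accessed:
        return 0  # nothing below this entry was touched, skip the subtree
    node.accessed = False
    return sum(restore(child, snapshot, path + (idx,))
               for idx, child in node.entries.items())
```

With 64 pages and a single write, `restore` copies one page and never visits the other subtrees.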
- - Code coverage
- - falkervisor uses interrupt/timer based code coverage
- - LBR is an Intel and AMD feature that basically records the last branches
- taken. Hence 'last branch recording'
- - LBR on intel stores the last 8(?) branches taken
- - LBR on AMD only stores the last branch taken
- br_from and br_to. [0x1000] -> [0x2000]
- - IBS: instruction-based sampling. AMD only
- - Give it a number of 'ticks'. It counts down these ticks, and then fires
- an interrupt after this counter hits zero.
- - Mainly for performance monitoring. Gives information on stalls, cache
- hits and misses, branch mispredictions, etc.
- - IBS for free tells you the physical and virtual addresses for RIP
- - IBS also tells you whether the instruction was a load or a store (or neither)
- as well as the physical AND virtual address of the load/store if it was one
- - Storage of code coverage
- - Initial falkervisor used bswap(br_to) ^ br_from
- 00001337, 00002335
- 37132335 <- code coverage 'hash'
- - Later falkervisor uses falkhash to properly hash (br_to, br_from)
- falkhash is a 128-bit AES based hashing algorithm which is super duper fast
- https://github.com/gamozolabs/falkhash
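The early bswap/xor coverage hash can be reproduced in a few lines. This sketch assumes 32-bit values, which is inferred from the 8-digit example above and not stated outright; `bswap32` and `edge_hash` are illustrative names. It does not attempt to reproduce falkhash.

```python
def bswap32(x: int) -> int:
    """Reverse the byte order of a 32-bit value."""
    return int.from_bytes((x & 0xFFFFFFFF).to_bytes(4, "little"), "big")


def edge_hash(br_from: int, br_to: int) -> int:
    """Early-style edge hash: bswap(br_to) ^ br_from (32-bit assumption)."""
    return bswap32(br_to) ^ (br_from & 0xFFFFFFFF)


# Matches the example in the notes: 00001337, 00002335 -> 37132335
print(hex(edge_hash(br_from=0x2335, br_to=0x1337)))  # 0x37132335
```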
- - Use of code coverage
- - Each basic block has a counter associated with the number of times we've seen it
- - Early falkervisor
- - Sorted table of basic blocks based on frequency
- - Pick one of the least common 64 inputs, and use it as the base for the
- next generation of mutation
- - Later falkervisor, sorted database was ditched!
- - Randomly select n (~16) inputs from the code coverage database.
- - Out of the 16 inputs, pick the least common one
- - Use this input as the base of the next fuzz case
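The later selection scheme can be sketched as a small random tournament. This assumes the coverage database can be viewed as a mapping from input id to how often its coverage has been seen; that layout and the name `pick_base` are assumptions for illustration.

```python
import random


def pick_base(db: dict, n: int = 16, rng=None):
    """Sample up to n inputs at random and return the least common one.

    db maps input id -> frequency count (hypothetical layout).
    """
    rng = rng or random.Random()
    candidates = rng.sample(list(db), min(n, len(db)))
    return min(candidates, key=db.__getitem__)
```

No sorted table is needed: each pick costs n lookups, and rare inputs still win often because they beat whatever they are sampled against.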
- - Crash coverage
- - Store each unique crash that I get
- - Initial falkervisor used unique PC
- - Later falkervisor used unique (PC, faulting address)
- - Later falkervisor used up to 10 unique faulting addresses for each PC
- - Bug shows up as null deref [0, 16KB) but then later in time shows up
- as a non-null deref.
- - 0x0, 0x10, 0x20, ... 0x3414141414141
- - Current falkervisor stores 10 of each of the 5 groups of crashes
- Classify the bug as one of five types of crashes.
- - Null deref - #PF [0, 16KB)
- - Negative deref - #PF [-16KB, 0)
- - Normal deref - #PF any other address
- - 'ascii' deref - #GP (non-canon memory access)
- - None of the above - !(#GP || #PF)
- - Similar to code coverage, randomly pick a crashing input to mutate with.
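A sketch of the five-way classification with a cap of 10 per group. It assumes the cap applies per (PC, group) pair and that "negative" means the top 16KB of the 64-bit address space; `classify` and `CrashDb` are hypothetical names.

```python
from collections import defaultdict

KB16 = 16 * 1024
MAX_PER_GROUP = 10


def classify(vector: str, addr=None) -> str:
    """Map a fault to one of the five crash groups described above."""
    if vector == "#GP":
        return "ascii"          # non-canonical access
    if vector == "#PF":
        if 0 <= addr < KB16:
            return "null"
        if addr >= 2**64 - KB16:
            return "negative"   # [-16KB, 0) as an unsigned 64-bit value
        return "normal"
    return "other"


class CrashDb:
    def __init__(self):
        self.buckets = defaultdict(set)  # (pc, group) -> faulting addresses

    def record(self, pc: int, vector: str, addr=None) -> bool:
        """Store the crash if it is new and its bucket is not full."""
        bucket = self.buckets[(pc, classify(vector, addr))]
        if addr in bucket or len(bucket) >= MAX_PER_GROUP:
            return False
        bucket.add(addr)
        return True
```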
- - Picking how to pick input base
- - 5% Original input file
- - 5% Corpus of input files (thousands if not millions of input files)
- - 80% Code coverage inputs
- - 10% Crash coverage inputs
- - Weights!
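The 5/5/80/10 split is a plain weighted choice. A minimal sketch using the standard library; the source labels are made-up strings.

```python
import random

SOURCES = ["original", "corpus", "coverage", "crash"]
WEIGHTS = [5, 5, 80, 10]  # percentages from the list above


def pick_source(rng: random.Random) -> str:
    """Choose where the next base input comes from."""
    return rng.choices(SOURCES, weights=WEIGHTS, k=1)[0]
```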
- - How do I usually mutate
- - Corpus of inputs. Randomly pick data from the corpus, and inject it randomly
- in the base input file. Splicing inputs.
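Splicing could look roughly like this. The sketch assumes the chunk is inserted (not overwritten) at a random position and that every corpus entry is non-empty; `splice` is an illustrative name.

```python
import random


def splice(base: bytes, corpus: list, rng: random.Random) -> bytes:
    """Copy a random slice of a random corpus entry into the base input."""
    donor = rng.choice(corpus)
    start = rng.randrange(len(donor))
    length = rng.randint(1, len(donor) - start)
    chunk = donor[start:start + length]
    pos = rng.randint(0, len(base))      # insertion point, may be the end
    return base[:pos] + chunk + base[pos:]
```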
- - Minimization
- - Randomly delete parts/move parts/merge parts/change size of input file.
- - If it crashes in the same way as before, store this as the new 'minimal' input
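A deletion-only sketch of the minimization loop (the move/merge/resize steps are left out). `crashes_same` is a hypothetical callback standing in for "run the input and compare the crash to the original".

```python
import random


def minimize(data: bytes, crashes_same, rounds: int = 1000, rng=None) -> bytes:
    """Randomly delete ranges, keeping any candidate that still crashes the same way."""
    rng = rng or random.Random()
    best = data
    for _ in range(rounds):
        if not best:
            break
        start = rng.randrange(len(best))
        end = rng.randint(start + 1, len(best))
        candidate = best[:start] + best[end:]
        if crashes_same(candidate):
            best = candidate  # new 'minimal' input
    return best
```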
- - branch 'solving'
- - Look for compare instructions that have input file data present in a register.
cmp rax, rbx - rbx = 0x4141414141414141 <-- present in the input file
^ equal, not equal, off by one (above and below), etc
cmovcc
jcc
- cmp al, bl - al 0x41
- - memcmp solver
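One way to picture the compare solver: find the bytes of the input-derived operand in the file and patch in the value it was compared against, plus the off-by-one variants. This assumes the operand appears verbatim and little-endian in the input; `solve_compare` and its parameters are hypothetical.

```python
def solve_compare(data: bytes, seen: int, wanted: int, size: int) -> list:
    """Return candidate inputs where `seen` is replaced by wanted, wanted+1, wanted-1."""
    needle = seen.to_bytes(size, "little")
    mask = (1 << (8 * size)) - 1
    out = []
    for delta in (0, 1, -1):
        repl = ((wanted + delta) & mask).to_bytes(size, "little")
        pos = data.find(needle)
        while pos != -1:
            out.append(data[:pos] + repl + data[pos + size:])
            pos = data.find(needle, pos + 1)
    return out
```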
- - Comparing results from different snapshots
- - I have 8 NUMA nodes which each get their own snapshot
- - Stack corruption bug that would show up in thousands of ways.
- - crashing input hash '1337'
- - broadcast to all other nodes to run '1337' through.
- node 0 - lib+32
- node 1 - lib+32
- node 2 - lib+24
- node 3 - ~~~
- - '3333'
- node 0 - lib+64
- node 1 - lib+32
- node 2 - ~~~
- node 3 - lib+8
- 3333 and 1337 have lib+32 in common!
- [lib+32, lib+24, lib+64, lib+8, ~~~] -> [1337, 3333]
- - Faults of this concept
- - What if lib+32 is __stack_check(), DebugBreak(), RaiseException(), memcpy()
- - Workarounds to this fault, blacklist
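The cross-node grouping can be sketched as merging inputs that share any crash site, with a blacklist for sites like memcpy(). Here `None` stands in for the "~~~" entries (no usable crash site), which is an interpretation; `group_crashes` is an illustrative name.

```python
def group_crashes(results: dict, blacklist=frozenset()) -> list:
    """Group input hashes whose per-node crash sites overlap.

    results maps input hash -> list of crash sites, one per node.
    Returns a list of (sites, inputs) pairs with pairwise disjoint sites.
    """
    groups = []
    for input_hash, per_node in results.items():
        sites = {s for s in per_node if s is not None and s not in blacklist}
        inputs = {input_hash}
        rest = []
        for g_sites, g_inputs in groups:
            if g_sites & sites:
                sites |= g_sites      # shared site: fold the group in
                inputs |= g_inputs
            else:
                rest.append((g_sites, g_inputs))
        groups = rest + [(sites, inputs)]
    return groups
```

Blacklisting a shared site such as lib+32 splits '1337' and '3333' back into separate groups.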
- - Function flow [theory]
- - Lets say I have 2 crashes
- - minimize down the 2 crashes
- - Run each input through with full single stepping
- - What branches are taken, and what calls are made.
- - crash 1 - A() -> B() -> C() (crashes)
- - crash 2 - A() -> C() (crashes)
- - Make a set of all unique functions, and compare the sets.
- - 90% of functions match, assume the bugs are the same
- - A -> B -> C -> memcpy
- - A -> D -> F -> memcpy 2 different bugs
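The function-set comparison, sketched with set overlap. Measuring "90% match" as intersection over union is an assumption, since the notes do not say how the percentage is computed; `same_bug` is a hypothetical name.

```python
def similarity(funcs_a: set, funcs_b: set) -> float:
    """Fraction of functions shared between two traces (intersection / union)."""
    union = funcs_a | funcs_b
    if not union:
        return 1.0
    return len(funcs_a & funcs_b) / len(union)


def same_bug(funcs_a: set, funcs_b: set, threshold: float = 0.9) -> bool:
    return similarity(funcs_a, funcs_b) >= threshold


# Both traces end in memcpy, but the paths differ: treated as 2 different bugs.
print(same_bug({"A", "B", "C", "memcpy"}, {"A", "D", "F", "memcpy"}))  # False
```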
- - Logic bugs
- - Special tracing/breakpoints. LoadLibrary()
- - Put a breakpoint on LoadLibrary(), and see if user input is present in
- the file name.
- - MmProbeAndLockPages(), the address is a user address, but access mode is KernelMode
- - Little things like this. Usually going to have to be manually implemented as
- plugins/modules.
- - Memory coverage (relative and absolute)
- - IBS I get free load/store decodes
- - I can track what memory is being written and read from
- - I can track what blocks are making these accesses
- - relative memory coverage by using stack/heap awareness.
- - mov rax, [0x100+0x100] <- heap address is 0x100
- - mov rax, [0x200+0] <- heap address is 0x200
- - 0x100 belongs to allocation @ 0x100 of 0x20 length.
- - so this access is 0x0 bytes relative to the allocation
- - what if this address faults, and is out of bounds of our heap info, how can we tell?
- - page heap. make it so it would fault, and now we have the crash we wanted
- - how does this evolve?
- - mov rax, [rbx] (100 times rbx = +0x10, 10 times = +0x20, 1 time = +0x1000)
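Relative memory coverage boils down to mapping an absolute load/store address to (allocation, offset). A sketch assuming the heap info is available as a sorted list of (base, length) pairs; `relative_access` is an illustrative name.

```python
import bisect


def relative_access(allocs: list, addr: int):
    """Return (allocation base, offset) for addr, or None if out of bounds.

    allocs is a list of (base, length) tuples sorted by base (assumed layout).
    """
    bases = [base for base, _ in allocs]
    i = bisect.bisect_right(bases, addr) - 1
    if i < 0:
        return None
    base, length = allocs[i]
    offset = addr - base
    return (base, offset) if offset < length else None
```

Recording the offset per accessing block gives histograms like the rbx example above, where a rare +0x1000 stands out against the usual +0x10 and +0x20.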
- - Stack walking
- - 'kb' or 'bt' in windbg/gdb
- - With code coverage, store the stack walk that caused us to get here