DTORS SECURITY RESEARCH (DSR)
Author: mercy
Title: Reconstructing binaries to C for beginners.
Date: 20/01/2003

An example of reverse engineering code:

By the end of this small disassembly dump, we will hopefully have analyzed
what is happening in the assembly and constructed roughly accurate C code to
go with it.

mercy@hewey:~/examples/gdb > gdb 1 -q
(gdb) disass main
Dump of assembler code for function main:
0x8048410 <main>:       push   %ebp
0x8048411 <main+1>:     mov    %esp,%ebp
0x8048413 <main+3>:     sub    $0x18,%esp
0x8048416 <main+6>:     nop
0x8048417 <main+7>:     movl   $0x0,0xfffffffc(%ebp)
0x804841e <main+14>:    mov    %esi,%esi
0x8048420 <main+16>:    cmpl   $0x9,0xfffffffc(%ebp)
0x8048424 <main+20>:    jle    0x8048428 <main+24>
0x8048426 <main+22>:    jmp    0x8048440 <main+48>
0x8048428 <main+24>:    add    $0xfffffff4,%esp
0x804842b <main+27>:    push   $0x80484b4
0x8048430 <main+32>:    call   0x8048300 <printf>
0x8048435 <main+37>:    add    $0x10,%esp
0x8048438 <main+40>:    incl   0xfffffffc(%ebp)
0x804843b <main+43>:    jmp    0x8048420 <main+16>
0x804843d <main+45>:    lea    0x0(%esi),%esi
0x8048440 <main+48>:    xor    %eax,%eax
0x8048442 <main+50>:    jmp    0x8048444 <main+52>
0x8048444 <main+52>:    mov    %ebp,%esp
0x8048446 <main+54>:    pop    %ebp
0x8048447 <main+55>:    ret
End of assembler dump.
(gdb)

0x8048410 <main>:       push   %ebp
    Push ebp (the extended base pointer) onto the stack.

0x8048411 <main+1>:     mov    %esp,%ebp
    Copy the current esp (extended stack pointer) into ebp, in turn setting
    up a new base pointer for this function.

0x8048413 <main+3>:     sub    $0x18,%esp
    Set aside 24 bytes for local variables.

0x8048416 <main+6>:     nop
    Do nothing (no operation).

0x8048417 <main+7>:     movl   $0x0,0xfffffffc(%ebp)
    Put 0 at ebp-4, i.e. four bytes below ebp.

0x804841e <main+14>:    mov    %esi,%esi
    Two-byte padding; the next instruction is at main+14+2.

0x8048420 <main+16>:    cmpl   $0x9,0xfffffffc(%ebp)
    Compare the long at ebp-4 with the value 9.

0x8048424 <main+20>:    jle    0x8048428 <main+24>
    If it is less than or equal to 9, jump to address 0x8048428.

0x8048426 <main+22>:    jmp    0x8048440 <main+48>
    Otherwise (it is greater than 9), jump to 0x8048440.

0x8048428 <main+24>:    add    $0xfffffff4,%esp
    If it was less than or equal to 9, subtract 12 from esp (stack setup).

0x804842b <main+27>:    push   $0x80484b4
    Push the address of the string %s\n\000 onto the stack.

0x8048430 <main+32>:    call   0x8048300 <printf>
    Make a call to printf.

0x8048435 <main+37>:    add    $0x10,%esp
    Add 16 to the stack pointer, undoing the 12 bytes subtracted above plus
    the 4-byte push.

0x8048438 <main+40>:    incl   0xfffffffc(%ebp)
    Increment the long at ebp-4 (+1 to the value at ebp-4).

0x804843b <main+43>:    jmp    0x8048420 <main+16>
    Go back to main+16.

0x804843d <main+45>:    lea    0x0(%esi),%esi
    Load the effective address of esi into esi, which changes nothing.
    (It is padding again, three bytes this time: the next instruction is at
    main+48.)

0x8048440 <main+48>:    xor    %eax,%eax
    Xor the value in eax with itself, in turn putting 0 in eax; this will be
    the exit status.

0x8048442 <main+50>:    jmp    0x8048444 <main+52>
    Jump to address 0x8048444.

0x8048444 <main+52>:    mov    %ebp,%esp
    Move the base pointer into the stack pointer.

0x8048446 <main+54>:    pop    %ebp
    Pop the saved ebp value (the one pushed at 0x8048410).

0x8048447 <main+55>:    ret
    Return.

From that very basic assembly dump, we can construct a fairly accurate image
of what the program's source may have been. Let's start by putting the info
together.

put 0 at ebp-4, four bytes below ebp.
<== that tells me one int-sized variable has been set aside.

compare the long at ebp-4 with the value 9.
<== compare the value in the int with 9; if it is greater, jump out of the
    loop, otherwise continue into the loop body.

push %s\n\000 onto the stack
<== shows me it is going to print a string; since nothing is pushed for the
    %s, what it prints is whatever happens to be on the stack.

make a call to printf
<== obviously, printf("%s\n"); (from the above)

increment long, ebp-4 (+1 to the value at ebp-4)
<== if the value is not greater than 9, print the line above, then add 1 to
    the value.

go back to main+16
<== go back to compare long (compare the value at ebp-4 with $0x9).

xor the value in eax with itself, in turn putting 0 in eax, this will be the
exit status.
<== tells me to exit with status 0 (success).

The above is the basis of our reconstruction. There are a few other values
in the dump, but this is the info we really need to reconstruct the binary in
C. Here is my attempt:

    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int a;                      // the int at ebp-4
        for(a = 0; a <= 9; a++)     // compare long loop, increment a
            printf("%s\n");         // print with the pushed format %s\n\000
        return(0);                  // return
    }

Above is my reconstructed C code, below is the actual program:

    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int i;
        for(i = 0; i < 10; i++)
            printf("%s\n");
        return(0);
    }

As you can see, there is basically no difference besides the name of the
local variable and the way the loop test is written (a <= 9 and i < 10 mean
the same thing for an int). Reconstruction of binaries can be very complex
for large programs; that's why I suggest that, if you think you have found a
hole, you reconstruct the vuln function rather than the whole binary.

And that is just about a wrap. Any questions, feel free to e-mail me at:
mercy@dtors.net

/*
Some people may be wondering where we got our pushed value, %s\n\000. You
simply do this on the push value:

(gdb) x/5bc 0x80484b4
0x80484b4 <_IO_stdin_used+4>:   37 '%'  115 's' 10 '\n' 0 '\000'
Cannot access memory at address 0x80484b8
(gdb)

and that is how I got what I needed. Have fun, and until next time, peace.
*/

Paper Two: Reverse engineering code, a function? a function? a function? sheeeeesh.