;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;                                                                       ;;
;;                    BrandoVX's Win32 Assembly Guide                    ;;
;;                                                                       ;;
;;                                                                       ;;

Hello. This article is an assembly guide for Windows 32-bit platforms.
There are far too many DOS assembly guides compared to the number of good
Win32 assembly tutorials, so I am writing this to give you the ability to
follow my function hijacking article.

This article assumes knowledge of computers, memory addressing, data sizes
(word, dword, etc.), and hexadecimal notation. Prior knowledge of DOS
assembly would be helpful but is not necessary (this guide moves fast).
Finally, all code examples are written in MASM/TASM style. Enough of that,
let's begin.

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; Registers ;;
;;                                                                       ;;
;;                                                                       ;;

Registers are the most basic storage units in your processor. For many
operations and variables, you can simply use the registers to hold your
data. In 32-bit processors, each general purpose register is 32 bits in
size (hence the name Win32). The 32-bit x86 processor has 16 basic
registers. Here, we will focus on the most used.

-- General Purpose --

First off, there are the general purpose registers. These four registers
are for general use: EAX, EBX, ECX, and EDX (we will include EBP for the
purposes of the next tutorial).

Each of these registers is 32 bits in size. Those familiar with 16-bit
programming are probably asking, "What about AX?" AX still exists. Each of
the four registers can be used in 32-bit, 16-bit, and 8-bit divisions.
Here is an illustration (highest byte on the left):

   EAX---------------------------------------  32 bits (4 bytes)
                       AX--------------------  16 bits (2 bytes)
   ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
   ;;  byte  ;;  byte  ;;  byte  ;;  byte  ;;
   ;;        ;;        ;;    8   ;;    8   ;;
   ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
                       AH-----8  AL-----8      (1 byte each)

As you can see, EAX is the full-blown 32-bit, 4-byte, double word
register. AX is half of the register.
That is 16 bits, 2 bytes, or one word: the lower half of EAX. AH is the
upper byte of AX and AL is the lower byte. The same rules apply to EBX,
ECX, and EDX. (EBP, ESI, EDI, and ESP also have 16-bit forms, BP, SI, DI,
and SP, but no 8-bit halves.) Now let's look at a realistic example.

        EAX = 10203040h
        AX  = 3040h
        AH  = 30h
        AL  = 40h

The above is fairly self-explanatory. It just shows what each register
would hold if used.

-- Index Registers --

This family of registers is very important in coding, optimizing, and
string operations. The two registers in this class are ESI and EDI. They
are generally used to read from and write to memory addresses, and they
work with special optimized instructions built into the CPU. They are
often used in conjunction with the [ ] brackets. The brackets mean we want
the contents of the memory address the register holds, not the register's
own value. So, for example, let's say ESI = 0040100Fh.

        ESI   = the number 0040100Fh (obviously)
        [ESI] = the contents of the memory address 0040100Fh

Note: The brackets work with any of the 32-bit general purpose registers,
not just ESI and EDI.

There are several optimizations that work with ESI, but I won't cover
those until later.

-- Stack Pointer --

ESP is the stack pointer. It is controlled by the processor automatically
(sort of, you'll see later), but it can also be changed by the programmer.
Without getting too far into the concept of the stack (I'll explain that
later), I will just say the following. The stack is a temporary area that
data can be moved onto (pushed) and moved off (popped). As data is pushed
onto the stack, the ESP register moves. The same happens when data is
popped off the stack. ESP always points to the data on the top of the
stack.

-- Instruction Pointer --

The instruction pointer register (EIP) contains the address of the next
instruction to be executed. Every time an instruction is executed, EIP
advances to the instruction that follows it. You cannot write to this
register directly (there is no MOV into EIP). Instead, there are several
instructions that control the value of EIP.

-- Flags Register --

This register holds many flags that are triggered on different events.
The flags are as follows:

        Bit     Label   Description
        ---------------------------------------------
        0       CF      Carry flag
        2       PF      Parity flag
        4       AF      Auxiliary carry flag
        6       ZF      Zero flag
        7       SF      Sign flag
        8       TF      Trap flag
        9       IF      Interrupt enable flag
        10      DF      Direction flag
        11      OF      Overflow flag
        12-13   IOPL    I/O Privilege level
        14      NT      Nested task flag
        16      RF      Resume flag
        17      VM      Virtual 8086 mode flag
        18      AC      Alignment check flag (486+)
        19      VIF     Virtual interrupt flag
        20      VIP     Virtual interrupt pending flag
        21      ID      ID flag

Those are all of the flags, but there are only a few we will concern
ourselves with. The first is the Zero Flag (ZF). This flag is set when the
result of an arithmetic, logical, or compare instruction is zero (for
example, when two compared values are equal). Simply moving a zero with
MOV does not change the flags. The carry flag (CF) is also used, but not
nearly as often as the ZF. The CF is set when an unsigned addition or
subtraction carries out of (or borrows into) the highest bit of the
destination. Some functions also set it to signal that an error has
occurred. The others don't get tripped often, or at all in the case of my
function hijacker.

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; Instructions ;;
;;                                                                       ;;
;;                                                                       ;;

Now that you know what the registers are, you are probably wondering,
"Well, what do I do with these?" Your answer is instructions. Instructions
are the base of the CPU. In a sentence, instructions change the registers
and move data around. I won't show all of the instructions here; I will
only show the ones used the most and the ones I use in the next tutorial.

Before I tell you what the instructions are, I will briefly explain how
your computer interprets them. First off, your assembler (the "compiler"
for assembly language) reads the instruction mnemonics and turns them into
bytes. Every instruction has a corresponding byte sequence, or 'op code',
that it is turned into at assembly time. Op codes can be anywhere from one
to several bytes long, depending on the instruction. After your assembler
has converted all the instructions to op codes, they are saved to an
executable file. When you run the executable, the op codes are loaded into
memory and executed by your CPU.
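
To make the mnemonic-to-op-code idea concrete, here is a small sketch of a
few instructions with the bytes a 32-bit assembler emits for them listed to
the right. These are the standard x86 encodings for these exact register
forms; other operands (and sometimes other assemblers) give different
bytes for the same mnemonic, so treat this as an illustration rather than
a reference.

        PUSH eax                ; 50                (1 byte)
        POP  eax                ; 58                (1 byte)
        RET                     ; C3                (1 byte)
        MOV  eax, 10h           ; B8 10 00 00 00    (5 bytes)

Note how the 32-bit value 00000010h in the MOV is stored low byte first.
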
Also, a ';' in MASM/TASM assembly indicates that a comment follows; the
rest of the line is ignored by the assembler. Use comments to help
yourself and others understand your code. Now to the instructions.

-- ADD Destination Register, Register or Value

This instruction is very simple to understand: it does what it says. It
adds two registers, or a register and a value, together. The result of the
addition is placed in the destination register. For example:

        ADD eax, 10h            ; add 10h to whatever was in eax, storing
                                ; the value in eax.
        ADD eax, ebx            ; add ebx to eax, storing the value in eax.

-- MOV Destination Register or Memory Address,
       Register, Memory Address or Value

This instruction copies data from one location to another. This can be
from register to register, from a numeric value into a register, or from a
memory address into a register and vice versa. (It cannot copy directly
from one memory address to another.) Here are some examples:

        MOV eax, ebx            ; copy ebx into eax, overwriting it.
        MOV eax, 10h            ; make eax = 10h, overwriting it.
        MOV dword ptr [eax], 1000A000h
                                ; place the DWORD 1000A000h at the memory
                                ; address held in eax ('dword ptr' tells
                                ; the assembler how many bytes to write).
        MOV eax, [esi]          ; take whatever is at memory address esi
                                ; and put it into eax.

-- CMP Register, Register or Value

This instruction compares two registers, or a register and a value,
without changing the data of either. Internally it subtracts the second
operand from the first, throws the result away, and keeps only the flags.
When used in conjunction with a conditional jump, it is similar to an
if( ) statement in a high level language. The flag you will check most
often is the ZF, which is set when the two operands are equal and cleared
when they are not. Consider these examples:

        CMP eax, 10h            ; If eax = 10h, the ZF will be set
        CMP eax, eax            ; ZF will be set because the two are equal
        CMP eax, ebx            ; If eax = ebx, the ZF will be set

Note that the first operand cannot be a plain number; something like
'CMP 10h, 10h' will not assemble. Although other flags are also changed by
the CMP instruction, I will not detail them here.

-- XOR Register, Register or Value

This instruction performs an 'exclusive or' on the first register using
the register or value you provide, and stores the result in the first
register. It is most often used to clear registers, but it is also used in
simple encryption. XORing works bit by bit, as shown by the following
table (from MSDN).

        XOR EAX, EBX

        If bit in EAX is   And bit in EBX is   Then result is (EAX set to)
        ------------------------------------------------------------------
               0                  0                     0
               0                  1                     1
               1                  0                     1
               1                  1                     0

        XOR eax, eax            ; clear the register (set eax = 00000000h)
        XOR eax, 41h            ; XOR eax with 41h; eax becomes the
                                ; result of the XORing

-- JMP Destination

This instruction and its family cause EIP to be set to the address
specified, causing program control to 'jump' to another area in memory. A
jump can be either relative to EIP or absolute. Most assemblers (TASM
included) make jumps relative to EIP automatically, which is perfect for
function hijacking. Consider the following code:

        ADD eax, 29Ah
        MOV ebx, eax
        JMP exit                ; Jump over the MOV edi, edi to the ADD
        MOV edi, edi            ; This won't get run
exit:
        ADD eax, 40h            ; This will be run directly after the jump

There is one very important thing here that I didn't explain. Labels can
be used in assembly language instead of working out an address for the
jump by hand. In this example we are jumping to the 'exit' label.

The jump family contains many conditional jumps as well. When a
conditional jump is encountered, a flag is checked, and the jump is
executed only if that flag is (or isn't) set. An example will prove
easier:

        JNZ exit                ; Jump if Zero Flag is NOT set
        JZ  exit                ; Jump if Zero Flag is set

One thing to note is the JE (jump if equal) and JNE (jump if not equal)
instructions. When the instruction mnemonics were designed, several
mnemonics were given to the same instruction just to increase readability.
JNE and JNZ are the same. JZ and JE are the same. Each pair can be used
interchangeably. I will not detail the entire jump family here.

-- CALL Destination/RET

The CALL instruction is very similar to the JMP instruction. The only
difference is that CALL does one extra thing before jumping: it places the
address of the next instruction (the one following the CALL) onto the
stack (see the Stack section).
The CALL instruction is used in conjunction with the RET instruction. When
the RET instruction is encountered, the data on the top of the stack is
popped off and jumped to. You must be careful to keep the stack clean (in
the state it was in directly after the CALL instruction) before executing
a RET. Here's an example:

        CALL function           ; JMP to function, push address of the
                                ; 'XOR eax, eax' instruction to the stack
        XOR eax, eax
        ...
function:
        MOV eax, 22h
        RET                     ; Jump to whatever is on the stack and pop
                                ; it off (in this case the address of 'XOR
                                ; eax, eax')

The CALL/RET instructions are used to create 'functions' that can be used
over and over again instead of rewriting the code many times (just like
C).

-- PUSH Register or Value

The PUSH instruction puts data onto the stack and moves ESP (the stack
pointer) to the data you put on: ESP is lowered by 4 and the value is
stored at the new address. There isn't much to this.

        PUSH eax                ; Put contents of eax onto stack
        PUSH 10h                ; Put 10h onto stack

-- POP Register

The POP instruction pulls data off the stack and into the register
specified. The stack pointer is then moved back one value (ESP is raised
by 4).

        POP eax                 ; Pull data off stack into eax

-- LODSD/LODSW/LODSB

These are the ESI optimizations I was talking about above. All of these
instructions have one thing in common: they read a certain amount of data
from memory address ESI and put it into EAX (or into AX or AL for the
smaller sizes). The amount read depends on the instruction. They are as
follows:

        MOV esi, 00401000h      ; This will be the memory address
        LODSD                   ; Read one DWORD from [ESI] into EAX
        LODSW                   ; Read one WORD from [ESI] into AX
        LODSB                   ; Read one BYTE from [ESI] into AL

Very important: When a LODSD/W/B is executed, the ESI register is
incremented by the amount that was read (4, 2, or 1). This assumes the
direction flag (DF) is clear, which is the normal state; if DF is set, ESI
is decremented instead. This is excellent for optimizing code.

Those are all the instructions that I will cover here. Although there are
many more, these are just the basics to get you by. You can find lists of
x86 assembly instructions on the internet.
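
As a recap, here is a minimal sketch that combines several of the
instructions above to add up a list of DWORDs. It is a fragment, not a
complete program. The label names (numbers, next, done) and the data
values are made up for this example, the list is assumed to end with a
zero DWORD, and the direction flag is assumed to be clear so that LODSD
moves ESI forward.

.data
numbers dd 10h, 20h, 30h, 0             ; hypothetical list, 0 marks the end

.code
        MOV esi, offset numbers         ; ESI = address of the first DWORD
        XOR ebx, ebx                    ; EBX = running total, starts at 0
next:
        LODSD                           ; EAX = [ESI], then ESI = ESI + 4
        CMP eax, 0                      ; was it the terminating zero?
        JZ  done                        ; yes: leave the loop
        ADD ebx, eax                    ; no: add it to the total
        JMP next                        ; and fetch the next DWORD
done:                                   ; EBX = 60h (10h + 20h + 30h)

The CMP/JZ pair plays the role of the if( ) statement, and the JMP back to
'next' turns the whole thing into a loop.
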
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; Stack ;;
;;                                                                       ;;
;;                                                                       ;;

The stack is an area where data is temporarily stored. The ESP register
always points to the data on the top of the stack. POP and PUSH are the
most basic instructions (as shown above) that manipulate the stack. PUSH
"pushes" a value onto the stack and POP retrieves that value from the
stack. You're probably saying, "You just told me that," but I will
illustrate:

        ------------------  High Memory Address
        |Bottom of stack |
        |Data            |
        |...             |
  ESP-->|Top of the stack|
        |/\/\/\/\/\/\/\/\|  Lower Memory Address

The stack starts high and grows downwards in memory, so the "top" of the
stack is actually its lowest address. Try not to concern yourself with
that. Just know that when you PUSH something onto the stack, ESP points to
it. When you POP something off the stack, the top value is pulled off. The
stack is said to use a LIFO system (Last In, First Out), meaning that when
you place multiple things onto the stack, the last one you push on will be
the first one you pop off.

        ------------------  High Memory Address
        |Bottom of stack |
        |Data            |
        |...             |
  ESP-->|Top of the stack|
        |/\/\/\/\/\/\/\/\|  Lower Memory Address

        PUSH 100h
        PUSH 29Ah

        ------------------  High Memory Address
        |Bottom of stack |
        |Data            |
        |...             |
        |Old top of stack|
        |100h            |
  ESP-->|29Ah            |
        |/\/\/\/\/\/\/\/\|  Lower Memory Address

        POP EAX

        ------------------  High Memory Address
        |Bottom of stack |
        |Data            |
        |...             |
        |Old top of stack|
  ESP-->|100h            |
        |/\/\/\/\/\/\/\/\|  Lower Memory Address

EAX now holds the value 29Ah.

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; Windows32 Calls ;;
;;                                                                       ;;
;;                                                                       ;;

Now you're probably wondering, "How do I do something useful?!?!" In the
old DOS world, we would use interrupts (INT 21h) to do things (display
text, write files, etc.). While you CAN still use DOS interrupts, they are
obsolete: whatever you do with interrupts is emulated in Windows. That's
no good for hacking. Nowadays we use Windows API calls to do "useful
things."
With this, I will demonstrate the Win32 "Hello world" program. Remember
that I use TASM, so this example is only guaranteed to work with MASM/TASM
style assemblers. Arguments are pushed in reverse order, last argument
first.

;---------------------------- CUT HERE ------------------------------------
.386                                    ; 386 processor
.model flat                             ; Win32 memory model

extern MessageBoxA:proc                 ; In order to use an API we have to
extern ExitProcess:proc                 ; define it (import it) up here

.data
ourtext db 'y0',0                       ; Null term string for text
caption db 'XXX',0                      ; Null term string for caption

.code
start:                                  ; entry point
        push 0                          ; Type of msgbox (0 = MB_OK)
        push offset caption             ; Address of caption string
        push offset ourtext             ; Address of string to display
        push 0                          ; hWnd owner (NULL, none)
        call MessageBoxA                ; Call MessageBoxA

        push 0                          ; Exit code (return value)
        call ExitProcess                ; C ya
end start
;---------------------------- STOP CUTTING --------------------------------

Now, how to assemble it:

        tasm32 /m3 /ml program.asm
        tlink32 /Tpe /aa program.obj,program.exe,,import32.lib
        pewrsec program.exe

If all goes well, you'll get a message box with the text "y0" and a
caption of "XXX". You may not have the pewrsec program. I suggest you find
it on the internet. When we get into relocatable assembly, we will not be
able to use the data section, so everything (including data strings) will
go into the code section. If we ever need to write to that area, we have
to use this program to let us do that.

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; Conclusion ;;
;;                                                                       ;;
;;                                                                       ;;

This brief introduction to Win32 assembly should be enough to get you
started in writing your own assembly apps. There are plenty of resources
on the internet that discuss DOS assembly, most of which can be applied to
Win32 assembly coding. Some recommended texts are the "Billy Belcebú Virus
Writing Guide 1.00 for Win32" for intermediate readers and "The
Overwriting Virus by Horny Toad" for beginners. Good luck and have fun.

BrandoVX