Monday, January 2, 2012

Software Attack Vectors: Stack-Based Buffer Overflow

Buffer overflows are a commonly used attack vector for hijacking the target machine's execution path. Before we can discuss how buffer overflow attacks work, we must understand how the function call stack works. I will assume we are using an x86 machine without mitigations such as DEP (Data Execution Prevention), ASLR (Address Space Layout Randomization), PaX, etc. The general stack manipulation algorithm when calling a function is as follows:
  1. Push all function arguments
  2. Push the return address: the address of the first instruction after this function call (the next instruction to be executed once the function returns)
  3. Push the current value of the x86 EBP register
  4. Allocate space for any local variables
Figure: Normal call stack operation
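
To make the four steps concrete, here is a minimal sketch that models the stack as an array of 32-bit words. It is a toy simulation, not real x86 code: the names SimStack, sim_init, sim_push and sim_call are hypothetical, the stack size is arbitrary, and pushing the arguments right to left is an assumption based on the common cdecl convention.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_WORDS 64  /* arbitrary size chosen for this simulation */

/* Toy model of the x86 stack. esp and ebp are indexes into words[];
 * a push moves esp toward index 0, mirroring a stack that grows
 * from higher addresses to lower addresses. */
typedef struct {
    uint32_t words[STACK_WORDS];
    size_t esp;  /* index of the current top of the stack */
    size_t ebp;  /* index of the current frame base */
} SimStack;

void sim_init(SimStack *s) {
    memset(s->words, 0, sizeof s->words);
    s->esp = STACK_WORDS;  /* empty stack: esp is just past the highest slot */
    s->ebp = STACK_WORDS;
}

void sim_push(SimStack *s, uint32_t value) {
    assert(s->esp > 0);  /* keep the model itself inside its array */
    s->esp -= 1;         /* every push decrements esp */
    s->words[s->esp] = value;
}

/* Performs steps 1 to 4 for a call with two arguments.
 * Returns the index of the slot that holds the return address. */
size_t sim_call(SimStack *s, uint32_t arg1, uint32_t arg2,
                uint32_t return_address, size_t local_words) {
    sim_push(s, arg2);              /* step 1: arguments (right to left, assumed) */
    sim_push(s, arg1);
    sim_push(s, return_address);    /* step 2: return address */
    size_t ret_slot = s->esp;
    sim_push(s, (uint32_t)s->ebp);  /* step 3: save the caller's EBP */
    s->ebp = s->esp;                /* the new frame base points at the saved EBP */
    assert(s->esp >= local_words);
    s->esp -= local_words;          /* step 4: reserve space for local variables */
    return ret_slot;
}
```

Calling sim_call on a freshly initialized SimStack leaves the local variables at the lowest indexes, the saved EBP just above them, and the return address one slot higher still. That ordering is what matters for the overflow discussed next.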


Since the stack grows down from higher memory to lower memory, the top of the stack has a lower address than the bottom of the stack. Any push or space allocation on the stack decrements the value of the ESP register. Also note that functions like strcpy() copy data starting at the destination address and moving toward higher addresses. This poses a risk if the input data is too large: it will overwrite the part of the call stack that sits above the destination buffer in memory, which includes the saved EBP and the return address. For example, suppose we have the following code inside a function:

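char destination[8];  /* room for only 8 bytes on the stack */

/* 15 non-zero bytes followed by a null terminator: 16 bytes in total */
char source[16] = {0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, 0x00};

/* copies all 16 bytes, 8 more than destination can hold */
strcpy(destination, source);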
 
Since strcpy() copies data from source to destination until it sees a null terminator character, the danger is that an attacker can supply a source buffer with more bytes between the first character and the first null terminator (inclusive) than we allocated for the destination buffer on the call stack. In the above example, the destination buffer has space for only 8 bytes, whereas the source buffer holds 16 bytes from the first character through the null terminator. strcpy() is one of many unsafe functions due to its lack of bounds checking (http://msdn.microsoft.com/en-us/library/bb288454.aspx).
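
For contrast, here is a rough sketch of the bounds check that strcpy() does not perform. The helper name copy_if_fits is hypothetical, and the sketch assumes that src really is null-terminated; it is one possible way to refuse an oversized input, not a recommendation of a specific library function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dest only when the whole string,
 * including its null terminator, fits in dest_size bytes.
 * Returns false and leaves dest untouched when it does not fit. */
bool copy_if_fits(char *dest, size_t dest_size, const char *src) {
    assert(dest != NULL && src != NULL);
    size_t len = strlen(src);   /* bytes before the null terminator */
    if (len >= dest_size) {
        return false;           /* len + 1 bytes would not fit */
    }
    memcpy(dest, src, len + 1); /* copy the string and its terminator */
    return true;
}
```

With the destination and source buffers from the example above, copy_if_fits(destination, sizeof destination, source) returns false instead of writing past the end of destination.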




 
After the copy, the "Return address" value on the stack has been overwritten with "0d0e0f00". Assuming destination sits directly below the saved EBP with no padding, the first 8 bytes of source fill destination, the next 4 overwrite the saved EBP, and the final 4 overwrite the return address. When the function returns, the value at the stack location where "Return address" was stored is loaded into the CPU's EIP (instruction pointer) register. Since we are on an x86 machine, this address is read in little-endian order, resulting in the EIP register containing the value "000f0e0d". This means that the next instruction the CPU attempts to execute is at the address "0x000f0e0d" in memory. Remember, the attacker controls the values in the source buffer, so EIP ends up holding an address of their choosing. The execution path has officially been hijacked by the attacker: GAME OVER.
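
The byte arithmetic can be checked with a small simulation that never touches a real stack. The sketch below lays out a 16-byte array as destination (8 bytes), saved EBP (4 bytes) and return address (4 bytes), which assumes no padding between them. The function names and the original return address 0x08048abc are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEST_SIZE   8  /* char destination[8] */
#define FRAME_SIZE 16  /* destination + saved EBP + return address */

/* Copies bytes up to and including the first null terminator, as strcpy()
 * does. The assert only keeps this simulation inside its own array. */
void unchecked_copy(uint8_t *frame, size_t frame_size, const uint8_t *src) {
    size_t i = 0;
    for (;;) {
        assert(i < frame_size);
        frame[i] = src[i];
        if (src[i] == 0x00) {
            break;
        }
        i++;
    }
}

/* Reads four bytes as a little-endian 32-bit value, the byte order x86 uses. */
uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Runs the overflow on the simulated frame. Stores the return address seen
 * before the copy in *before and returns the value seen after the copy. */
uint32_t simulate_overflow(uint32_t *before) {
    uint8_t frame[FRAME_SIZE] = {0};
    uint8_t *return_slot = frame + DEST_SIZE + 4;  /* skip the saved EBP */

    /* Hypothetical legitimate return address 0x08048abc, little-endian. */
    return_slot[0] = 0xbc;
    return_slot[1] = 0x8a;
    return_slot[2] = 0x04;
    return_slot[3] = 0x08;
    *before = read_le32(return_slot);

    const uint8_t source[16] = {0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08,
                                0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, 0x00};
    unchecked_copy(frame, sizeof frame, source);

    return read_le32(return_slot);  /* the value that would be loaded into EIP */
}
```

Under these assumptions simulate_overflow returns 0x000f0e0d, matching the EIP value derived above, while the value stored through before is the original 0x08048abc.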