Saturday, June 30, 2018

Destroying ROP gadgets with Inline code

Prerequisite Reading:
Previous ROP (Return Oriented Programming) article
Traditionally in computer science, software developers using higher-level languages and abstractions should not need to think about how the lower levels of the system work. For example, when writing a network application, one should ideally not need to worry about how TCP sequence numbers work. Two possible exceptions to this rule are security and performance. For security specifically, learning about the instruction sequences emitted by compilers can help developers avoid writing higher-level (C/C++) code that could be used in ROP exploits.
Normal non-inline functions have a binary code layout in which multiple callers execute x86 "call" instructions to redirect execution to the address of the single instance of the function's code in memory. An inline function in C/C++, however, is a function whose emitted code is inserted by the compiler directly into each call site of that function throughout the program. An example follows:
#include <Windows.h>

LPVOID notInlined()
{
       return VirtualAlloc(NULL, 4096, MEM_COMMIT, PAGE_EXECUTE_READWRITE);
}

__forceinline LPVOID inlined()
{
       return VirtualAlloc(NULL, 4096, MEM_COMMIT, PAGE_EXECUTE_READWRITE);
}
 
int main()
{
       notInlined();//a call instruction will be placed here
       inlined();//the function’s code itself will be placed here
       //some additional code here
       return 0;
}
VirtualAlloc is a function that ROP exploits can abuse to allocate readable, writable, and executable memory. As shown in the C code above, the functions notInlined and inlined both call VirtualAlloc. Apart from the __forceinline keyword on inlined, the two functions are identical in the C code. However, their binary code layouts look very different.

notInlined disassembly:
push    ebp
mov     ebp, esp
push    40h
push    1000h
push    1000h
push    0
call    dword ptr[inlined!_imp__VirtualAlloc(0121b000)]
pop     ebp
ret
main function disassembly:   
push    ebp
mov     ebp, esp
call    inlined!ILT + 0(_notInlined)(011f1005)
push    40h  //this
push    1000h  //is
push    1000h  //code  
push    0  //of
call    dword ptr[inlined!_imp__VirtualAlloc(0121b000)]  //inlined
//some additional code here
xor     eax, eax
pop     ebp
ret
In the above disassembly, the code for notInlined sits in its own function, as we would expect, and can be reached by an x86 “call” instruction from anywhere in the program and then returned from. The code for the inlined function (the lines annotated with comments in the main function disassembly), however, is placed directly in the main function (the call site).
The significance of the differing in-binary layouts of the two functions is that notInlined contains a very useful ROP gadget that ROP exploits can use, whereas the code for inlined does not contain the same ROP gadget. This difference is due to the fact that there is no x86 “ret” instruction in the code of inlined. If a ROP chain tried to execute inlined, execution would continue into the rest of main after the VirtualAlloc call, so it would be much more difficult to return from inlined back to the ROP chain.
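The difference can be illustrated with a minimal sketch in C that scans the two instruction sequences for a near "ret" opcode (0xC3). The byte values are hand-assembled encodings of the listings above and are an assumption, since the article shows only mnemonics; the names NOT_INLINED_BYTES, INLINED_BYTES and find_first_ret are hypothetical. Scanning for a single byte is a simplification of what real gadget finders do: they also consider other return forms and 0xC3 bytes that happen to fall inside operands.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed encodings of the notInlined listing (the assembler may choose others). */
const unsigned char NOT_INLINED_BYTES[] = {
    0x55,                               /* push ebp */
    0x8B, 0xEC,                         /* mov  ebp, esp */
    0x6A, 0x40,                         /* push 40h */
    0x68, 0x00, 0x10, 0x00, 0x00,       /* push 1000h */
    0x68, 0x00, 0x10, 0x00, 0x00,       /* push 1000h */
    0x6A, 0x00,                         /* push 0 */
    0xFF, 0x15, 0x00, 0xB0, 0x21, 0x01, /* call dword ptr [0121b000h] */
    0x5D,                               /* pop  ebp */
    0xC3                                /* ret */
};
const size_t NOT_INLINED_LEN = sizeof NOT_INLINED_BYTES;

/* Assumed encodings of the inlined sequence as it appears inside main. */
const unsigned char INLINED_BYTES[] = {
    0x6A, 0x40,                         /* push 40h */
    0x68, 0x00, 0x10, 0x00, 0x00,       /* push 1000h */
    0x68, 0x00, 0x10, 0x00, 0x00,       /* push 1000h */
    0x6A, 0x00,                         /* push 0 */
    0xFF, 0x15, 0x00, 0xB0, 0x21, 0x01  /* call dword ptr [0121b000h] */
};
const size_t INLINED_LEN = sizeof INLINED_BYTES;

/* Returns the offset of the first 0xC3 byte in code, or -1 if there is none. */
long find_first_ret(const unsigned char *code, size_t len)
{
    assert(code != NULL);
    for (size_t i = 0; i < len; i++) {
        if (code[i] == 0xC3) {
            return (long)i;
        }
    }
    return -1;
}
```

Under these assumed encodings, find_first_ret locates the ret that ends notInlined but finds nothing in the inlined sequence, which matches the argument above: the inlined code has no return of its own for a ROP chain to rely on.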
In summary, the inline keyword can be used as an architecture, compiler, and OS portable way to destroy ROP gadgets in code where often-abused APIs are called, keeping in mind that plain inline is only a hint to the compiler, which is why the example above uses __forceinline. The cost of inlining code, however, is that it increases the code size of the binary: if inlined were called from a large number of places in the program, the full code of inlined would be inserted into the binary that many times. As with all exploit mitigation schemes, there might still be ways to bypass this technique, such as using jmp instructions rather than ret instructions to chain gadgets together.
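One way to apply the idea across compilers is a small wrapper macro. The following is a minimal sketch, not a definitive implementation: FORCE_INLINE, sensitive_api, call_sensitive_api, caller_a and caller_b are hypothetical names, and sensitive_api is only a stand-in for an often-abused API such as VirtualAlloc. The macro uses __forceinline on MSVC, the always_inline attribute on GCC and Clang, and falls back to plain inline elsewhere.

```c
#include <stddef.h>

/* Hypothetical portability macro: request forced inlining where the compiler supports it. */
#if defined(_MSC_VER)
#  define FORCE_INLINE static __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#  define FORCE_INLINE static inline __attribute__((always_inline))
#else
#  define FORCE_INLINE static inline /* only a hint on other compilers */
#endif

/* Hypothetical stand-in for an often-abused API such as VirtualAlloc. */
int sensitive_api(size_t size, unsigned flags)
{
    return (size != 0 && flags != 0) ? 1 : 0;
}

/* The wrapper body is copied into each caller, so the wrapper itself
   has no standalone "call the API, then ret" sequence. */
FORCE_INLINE int call_sensitive_api(void)
{
    return sensitive_api(4096u, 0x40u);
}

int caller_a(void) { return call_sensitive_api(); }
int caller_b(void) { return call_sensitive_api() + 1; }
```

Whether the compiler actually inlines the wrapper, even with these keywords, is worth confirming by inspecting the disassembly, as was done for the example above.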
