Sunday, September 28, 2014

ShellShock: Vulnerability, Patch and Exploit Analysis

On September 24th, 2014, the world became aware of a very serious vulnerability (CVE-2014-6271, also known as the ShellShock vulnerability) in the widely used Bash shell. Bash is by far the most popular shell and is the default shell distributed with Linux and Mac OS X. In addition, it is used by many computers and servers that are exposed to the Internet. The patch was released on September 24th along with the announcement of the vulnerability, but attackers relied on the fact that not everyone would promptly patch their computers, and immediately started attacking vulnerable machines on the Internet.

The ShellShock vulnerability itself was a simple logic flaw in the Bash code rather than a memory corruption, making it very easy for a large audience to understand and exploit. To make matters worse, this vulnerability is “wormable” in that it does not require user interaction, allows untrusted/unauthenticated users to execute code on a remote machine and its exploit can be embedded into a malicious script to automate its proliferation around the Internet at an exponential rate. Some currently known attack vectors include (1) CGI-based web servers, SSH servers and DHCP clients, all of which are widely used services around the Internet.

The Vulnerability:
The vulnerability itself was the simple fact that Bash allowed the user to define functions and include multiple commands inside environment variables that are passed to Bash. This is dangerous in situations when the value of the environment variable passed to the new Bash instance is supplied by an unauthenticated and untrusted source.

The Fix:
The fix for the ShellShock vulnerability was implemented in the patch “bash43-025”, released on the GNU FTP website (2). Below is the relevant annotated code change inside Bash 4.3, which is the newest supported version of Bash.

// CVE-2014-6271: executed with testbug='() { :;}; echo VULNERABLE'
void initialize_shell_variables (env, privmode)
  char **env;
  int privmode;
{
  ...
  /*
    Now, name = env variable name,
    string = env variable value
    char_index == strlen (name)

    temp_string = "testbug () { :;}; echo VULNERABLE"
    name = "testbug"
    SEVAL_FUNCDEF is OR'ed in below indicating that this was a
    function definition
    SEVAL_ONECMD is OR'ed in below to indicate that we only want
    a single command
  */
  if (legal_identifier (name))
    parse_and_execute (temp_string,
    name,
    SEVAL_NONINT|SEVAL_NOHIST|SEVAL_FUNCDEF|SEVAL_ONECMD);
  ...
}


int parse_and_execute (string, from_file, flags)
  char *string;
  const char *from_file;
  int flags;
{
  ...
  while (*(bash_input.location.string))
  {
    ...
    /*
      1) SEVAL_FUNCDEF was OR'ed into line that called this function
      2) command->type == cm_connection, which is != cm_function_def
      Therefore we execute inside the if statement, which produces an error.
      EX_BADUSAGE is eventually returned from this function.
    */
    if ((flags & SEVAL_FUNCDEF) && command->type != cm_function_def)
    {
      internal_warning ("%s: ignoring function definition attempt", from_file);
      should_jump_to_top_level = 0;
      last_result = last_command_exit_value = EX_BADUSAGE;
      break;
    }
    ...
    /*
      If we didn't go into the above if statement, and we only want one
      command, break out of the while loop now...
    */
    if (flags & SEVAL_ONECMD)
      break;
  }
  ...
}

The Exploit:
The exploit code (3) was also very simple. The below Bash script code triggers the vulnerability and prints “VULNERABLE” to the screen on vulnerable versions of Bash.

export testbug='() { :;}; echo VULNERABLE'
bash

The first line defines a new environment variable called testbug. This environment variable contains just a function definition (no actual call to the function) followed by an additional command. The function definition is “() { :;};”, an unnamed Bash function whose body contains only the ':' builtin, which does nothing and returns a successful (true) exit status (4). The function definition is followed by another command, which in this case just prints “VULNERABLE” to the screen. The second line executes a new instance of Bash, which will consume this environment variable. This explicit execution of bash is analogous to a web server spawning a new instance of bash.

Other Events:
After the original patch was released for the original ShellShock vulnerability, the community started to find new variants of bugs in Bash and has since assigned multiple new CVEs to these bugs. As of the time of this writing, the discoveries are ongoing. Considering how long ago Bash was written, the old coding style, and the amount of string manipulation that is happening in C, it is very likely that new vulnerabilities (especially memory corruptions) will continue to be found.

Workarounds:
One way to avoid executing the vulnerable codepaths is to avoid setting environment variables from untrusted, user-supplied input. This advice takes into consideration that new and currently unknown attack vectors and variants are likely to be discovered in the upcoming days and weeks.

References

Monday, June 30, 2014

Destroying ROP gadgets with Inline code

Prerequisite Reading:
Previous ROP (Return Oriented Programming) article
Traditionally in computer science, software developers using higher level languages and abstractions should not need to think about how the lower levels of the system work. For example, when writing a network application, one should ideally not need to worry about how the sequence numbers of the TCP protocol work. Two possible exceptions to this rule are security and performance. For security specifically, learning about the instruction sequences emitted by compilers can help one avoid writing higher level (C/C++) code that could be used in ROP exploits.
Normal, non-inline functions have a binary code layout where multiple callers execute x86 "call" instructions to redirect execution to the address of the single instance of the non-inline function's code in memory. However, an inline function in C/C++ is a function whose emitted code is inserted by the compiler directly into the (possibly multiple) call sites of that function throughout the program. An example follows:
#include <Windows.h>

LPVOID notInlined()
{
       return VirtualAlloc(NULL, 4096, MEM_COMMIT, PAGE_EXECUTE_READWRITE);
}

__forceinline LPVOID inlined()
{
       return VirtualAlloc(NULL, 4096, MEM_COMMIT, PAGE_EXECUTE_READWRITE);
}
 
void main()
{
       notInlined();//a call instruction will be placed here
       inlined();//the function’s code itself will be placed here
       //some additional code here
}
VirtualAlloc is a function that can be abused by ROP exploits to allocate Readable, Writeable and Executable memory. As shown in the C code above, the functions notInlined and inlined both call VirtualAlloc. Except for the __forceinline keyword in inlined, both notInlined and inlined are exactly identical in the C code. However, the binary code layout of each function looks very different.

notInlined disassembly:
push    ebp
mov     ebp, esp
push    40h
push    1000h
push    1000h
push    0
call    dword ptr[inlined!_imp__VirtualAlloc(0121b000)]
pop     ebp
ret
main function disassembly:   
push    ebp
mov    ebp, esp
call    inlined!ILT + 0(_notInlined)(011f1005)
push    40h  //this
push    1000h  //is
push    1000h  //code  
push    0  //of
call    dword ptr[inlined!_imp__VirtualAlloc(0121b000)]  //inlined
//some additional code here
xor     eax, eax
pop     ebp
ret
In the above disassembly, the code for notInlined is in its own function as we would expect, and can be executed and returned from, via an x86 “call” instruction, from anywhere in the program. However, the code for the inlined function (the instructions marked with trailing comments) is placed inline in the main function (the call site).
The significance of the differing in-binary layouts of the two functions is that notInlined contains a very useful ROP gadget that ROP exploits can use, whereas the code for inlined does not contain the same ROP gadget. This difference is due to the fact that there is no x86 “ret” instruction in the code of inlined. If a ROP chain tried to execute inlined, it would be much more difficult to return from inlined back to the ROP chain.
In summary, the inline keyword can be used as an architecture-, compiler-, and OS-portable way to destroy ROP gadgets in code where often-abused APIs are called. The cost of inlining code, however, is that it increases the code size in the binary: if inlined were called from a large number of places in the program, the full code of inlined would be inserted into the binary that many times. As with all exploit mitigation schemes, there still might be ways to bypass this technique, such as using jmp instructions rather than ret instructions to chain together gadgets.

References:

Sunday, April 27, 2014

VTGuard

Prerequisite Reading:
Previous “Attacking V-Table Pointers” article
 
The web browser is a war zone. We continue to see the latest and most cutting edge research, mitigation technologies, and exploitation techniques in popular web browsers such as Internet Explorer. One advanced mitigation technology in particular is VTGuard, a run-time security check introduced in Internet Explorer 10. VTGuard verifies VTable pointers before calling into them in an effort to mitigate Use-After-Free Exploitation.
VTGuard relies on a secret cookie which should not be known by the attacker who is redirecting the VTable call. This secret cookie varies with the ASLR load address of the DLL in which the object’s VTable is implemented. Although the actual check occurs dynamically at runtime, the checking code is emitted by the compiler at compile-time, implying that the original source code needs to be modified to take advantage of this mitigation.
Below is the disassembly of a VTGuard cookie check before a virtual function call:
mshtml!CElement::fireEvent+0x43:
mov     eax,dword ptr [ebx] //ebx is a pointer to our CElement object. Now eax has a pointer to the VTable.
cmp dword ptr [eax+308h],offset MSHTML!__vtguard (728d76ee) //we check a cookie at the end of the VTable before we trust it to be a true VTable
jne     MSHTML!CElement::fireEvent+0x189 (7284c30f) //if it's not, bail
mov     ecx,ebx //else, store this * into ECX as per C++ thiscall convention
call    dword ptr [eax+150h] //call into the VTable pointer
 
Assuming object d is an instance of a class that inherits from class B1, below is a depiction of how object d would be laid out in memory, with VTGuard in place.
VTable with VTGuard in place

Due to the difficulty of finding and removing all Use-After-Free vulnerabilities, VTGuard has its place as a strategic mitigation for increasing the difficulty of exploiting such vulnerabilities. Even if the attacker is able to reallocate the heap hole in a Use-After-Free vulnerability and craft a fake VTable (see prerequisite reading), VTGuard would still need to be bypassed.

Sunday, February 2, 2014

Page Heap

A large class of vulnerabilities in software is memory corruptions due to buffer overflows and underflows. An example of this type of vulnerability is an attempt to write more data into a buffer than the size of the buffer itself (buffer overflow). Page Heap is a class of Heap Allocation Policies that can be used when diagnosing or fuzzing for new buffer mismanagement vulnerabilities.

The difficulty of dealing with buffer mismanagement vulnerabilities is that the corruption is often not observable at the time it happens. In the best case, memory gets corrupted badly enough that the program crashes immediately. In the worst case, the corruption is not severe enough to crash the program right away (a “silent” corruption); instead, its effect is observable only through some indirect misbehavior of the program later in execution. For example, an unexpected variable might have its value changed, leading to a different and unexpected code execution path later on. Page Heap is useful for forcing a “fail-fast” policy for memory safety: it aims to trigger a crash immediately when the memory corruption occurs.

How it Works

In the x86 architecture, Virtual Memory gives each process the appearance that it has 4 GB of address space, whether the memory pages are actually committed or not. When an address on a page that is reserved but not committed is dereferenced, a page fault is generated by the hardware. Page Heap exploits this fact by placing a reserved guard page immediately before or after each allocation, so that out-of-bounds reads or writes to a buffer cross a page boundary onto a page that is not committed, leading to an immediate page fault.

Guard pages are Reserved rather than Committed. Running the !address command on one returns MEM_RESERVE for the State and PAGE_NOACCESS for the Protection, disallowing reading, writing or executing on the page without taking up any physical memory or space in the page file. Because guard pages are inaccessible yet still consume Virtual Address space, that address space is wasted. For this reason, page heap is useful for debugging, but could lead to problems in a production environment.

For the 3 main types of Heap Policies below, the following code (compiled to PageHeap.exe) will be used as an example, and the “vulnerable line of code here” comment will be replaced by one of the multiple lines of code in each section to trigger the crash.


#include "stdafx.h"

class Buffer
{
public:
       Buffer()
       {
              //initialize our buffers
              memcpy(buffer1, "AAAAAAA", 8);
              memcpy(buffer2, "BBBBBBB", 8);
       }

       char buffer1[8];
       char buffer2[8];
};

void main(int argc, char* argv[])
{
       Buffer * myBuffer = new Buffer(); 
       //vulnerable line of code here

       delete myBuffer;
       while (true){}
}


Full Page Heap

Command: “gflags /p /enable PageHeap.exe /full”

Description:

This Heap Allocation Policy allocates buffers at the END of the memory page. A buffer OVERflow would attempt to access memory PAST the END of the page into the guard page, triggering an immediate page fault.

Vulnerable line of code:


memcpy(myBuffer->buffer2 - 9, myBuffer->buffer1, 8); //Buffer Underflow (write)
memcpy(myBuffer->buffer2 + 1, myBuffer->buffer1, 8); //Buffer Overflow (write)
memcpy(myBuffer->buffer2, myBuffer->buffer1 + 9, 8); //Buffer Overflow (read)


Allocation:
Full Page Heap-allocation at end of page with guard page following

Reverse Page Heap

Command: “gflags /p /enable PageHeap.exe /full /backwards”

Description:

This Heap Allocation Policy allocates buffers at the BEGINNING of the memory page. A buffer UNDERflow would attempt to access memory BEFORE the BEGINNING of the page into the guard page, triggering an immediate page fault. Given that Full Page Heap also catches Buffer Underflow writes, one might expect Reverse Page Heap to symmetrically catch Buffer Overflow writes, but it does not.

Vulnerable line of code:


memcpy(myBuffer->buffer2, myBuffer->buffer1 - 1, 8); //Buffer Underflow (read)
memcpy(myBuffer->buffer2 - 9, myBuffer->buffer1, 8); //Buffer Underflow (write)


Allocation:
Reverse Page Heap-allocation at beginning of page with guard page preceding

Standard Page Heap

Command: “gflags /p /enable PageHeap.exe”

Description:

This Heap Allocation Policy tries to save Virtual Address space by avoiding allocating an extra guard page for each allocation. Rather, it adds a special pattern before and after each allocation. This allows it to detect Buffer Overflow writes and Buffer Underflow writes, but not over/underflow reads of any kind. One noteworthy point is that the integrity of the surrounding patterns is not checked until the buffer is freed. The significance of this fact is that it does not provide a perfect “fail-fast” scenario, because a corruption only causes a failure on free, not at the time of corruption.

Vulnerable line of code:


memcpy(myBuffer->buffer2 - 9, myBuffer->buffer1, 8); //Buffer Underflow (write)
memcpy(myBuffer->buffer2 + 1, myBuffer->buffer1, 8); //Buffer Overflow (write)


Allocation:
Standard Page Heap-allocation with pattern preceding and following


References:

http://msdn.microsoft.com/en-us/library/ms220938(v=vs.90).aspx

Saturday, November 16, 2013

Visual Heap Spray

Prerequisite Reading:
Previous “Low Fragmentation Heap ReAllocation for Use After Free Exploitation” article
Previous “Attacking V-Table Pointers” article
Heap Sprays are a common method attackers use to introduce determinism in a program’s address space. They aim to control a program’s memory layout in such a way that an attacker can reliably predict what will be in memory at a certain address (Address of Interest) at a certain point in execution.

For example, if there is a Use-After-Free bug on an object with a V-Table, the object can be reallocated so that the V-Table pointer field inside the object points to an address that the attacker knows will contain the spray (the Address of Interest). This knowledge often comes from trial and error when writing the exploit.

The Address of Interest makes a big difference in the quality of the exploit. For example, a very popular Address of Interest is 0x0c0c0c0c. The reasoning behind this address is that it must be low in the process’s address space (the highest nibble is 0x0), yet higher in memory than the heap being sprayed (the second highest nibble is 0xc), so that when the heap grows under the memory pressure of the spray, it will grow into this address. Using a high address such as 0xc0c0c0c0 (the highest nibble is 0xc) would require the application to freeze for a longer period of time before the heap spray completes. A targeted victim might get bored and close the process (the web browser in this case) because it appears frozen during the long spray, thereby precluding any possibility of successful exploitation.
Visualizations:
Below are some memory usage visualizations taken with the vmmap tool from Sysinternals before and after the heap spray. The orange color represents the Backend Heap in the process’s address space. Two things to notice are the large growth in the orange segment of the graphs below and the difference in the “Committed” usage before and after the spray (it grows from about 136 MB to about 698 MB).
Before Spray:
 
Memory Usage before Heap Spray
 After Spray:
Memory Usage after Heap Spray
Below are graphical representations of the memory layout before and after the spray. The “After Spray” visualization has the approximate address of 0x0c0c0c0c marked for the reader’s convenience. One might make the argument that since 0x0c0c0c0c is relatively early in the heap spray, the heap spray could have been reduced to minimize the time the victim has to wait for the spray to finish.

Before Spray:
 
Memory Layout before Heap Spray

After Spray:
Memory Layout after Heap Spray

How to Heap Spray in Internet Explorer:

In IE, heap sprays are often done by allocating and assigning large strings from JavaScript. Sprays are often done on the Backend Heap (rather than the Low Fragmentation Heap). In order to get strings allocated on the Backend Heap, the strings must be larger than 16KB. Example JavaScript follows:

for (var z = primeAmount; z < numObjects; z++)
    objectArray[z].title = pattern;

The Heap Spraying technique does not come without some drawbacks, leading to some researchers referring to heap sprays as “For the 99%”. In some cases, exploitation can be made more reliable by finding multiple "good" bugs rather than heap spraying:
  • It might take a long time to spray (user might get impatient and terminate the program).
  • Depending on preexisting memory layout due to external factors (loaded plugins, other webpages visited prior to this one, etc), spraying can be unreliable.
  • Too much spraying might cause the Operating System to swap memory pages out to disk (depending on how much physical memory the victim’s machine has) or cause JavaScript out-of-memory exceptions.
  • New IE mitigations might prevent hijacking virtual function calls.
  • There is no guarantee that the Address of Interest will contain the spray: an executable image or something else might be mapped at the Address of Interest, depending on the address space and system configuration unique to the victim. 
Libraries:
The community has done some great work to reduce the barrier to entry into this space. Multiple open source libraries have been written by researchers to abstract away the details of heap mechanics. In the example presented in this article, the heap reallocation/spray was done manually, but libraries such as HeapLib by Alex Sotirov and HeapLib2 by Chris Valasek allow users to simply call into them to perform reallocations/sprays. Code review of HeapLib2 shows that this article, the prerequisite readings, and HeapLib2 all use the same technique to reallocate and spray the heap.

Tuesday, October 8, 2013

Low Fragmentation Heap ReAllocation for Use After Free Exploitation

Use After Free (UAF) vulnerabilities occur when the application frees an object on the heap, and then tries to erroneously use it again (usually due to a stale pointer reference to the freed object). A common exploitation technique to target a UAF vulnerability is to try to reallocate heap memory between the time when the application frees the memory and when it erroneously dereferences the stale pointer to the freed memory. This reallocation would fill the “hole” left by the application when it initially freed the object. The typical timeline of this type of attack is as follows:

1) Application frees object on the heap
2) Attacker reallocates objects on the heap
3) Application erroneously dereferences freed pointer

Note that under normal circumstances, a UAF would crash the program due to an Access Violation. However, when the application is being exploited, the attacker’s reallocation serves 2 main purposes: 1) make sure that there is data at the location pointed to by the stale pointer so the application doesn’t crash on the erroneous dereference, and 2) craft the data in the reallocation in such a way that the dereference helps the attacker gain control over the Instruction Pointer (EIP on x86).
In Internet Explorer, exploits often use JavaScript to reallocate freed objects on a heap with the Windows Low Fragmentation Heap (LFH) policy activated and groom the backend heap in order to eventually gain control over the Instruction Pointer. The connection between JavaScript and freed objects is as follows:

"When string attributes of HTML objects are assigned in JavaScript, the strings are often stored in the same Front End LFH that the freed objects were stored in."

The significance of this statement is that attackers can craft maliciously formatted strings in JavaScript and have the strings fill the holes left by the freed objects. In order to reallocate the holes, the attacker must make sure the strings are the same size as the freed objects, so that the strings will get allocated in the same LFH bucket. Once the object of interest has been freed, the attacker can assign those strings as attributes to an array of HTML nodes, and those strings are likely to be allocated on the same LFH as the freed object. If this process is repeated enough, a reallocation of the “hole” left by the freed object will eventually occur. This means that the next time the application dereferences the stale pointer (e.g., for a virtual function call), it will get the attacker’s string rather than the object it expects to be there. This is how the attacker initially takes control of the Instruction Pointer. Below is an example of what this process may look like in JavaScript:

//create an array of HTML elements whose string attributes we will assign later
for (var i = 0 ; i < numObjects; i++)
    objectArray[i] = document.createElement('div');

/*
"prime" LFH by allocating a few strings of the same size as the object of interest to enable the LFH bucket for this allocation size
*/
for (var x = 0; x < primeAmount; x++)
    objectArray[x].title = pattern;

//application allocates object here

//application frees object here

//fill the hole left by the freed object assuming primeAmount < numObjects
for (var z = primeAmount; z < numObjects; z++)
    objectArray[z].title = pattern; //attributes should be allocated on LFH

//application erroneously uses object here

The JavaScript “pattern” string above was carefully crafted to meet the following criteria:

1) It must be the same size as the erroneously freed object, so that it will be reallocated in the same LFH Bucket.
2) Its value must be specifically crafted to help gain code execution. For example, if the freed object was a C++ object with a V-Table pointer, one of the first few DWORDs must point to a location in memory which the attacker controls (usually via a Heap Spray of the backend heap) that contains a malicious V-Table. For more information, see prior article about V-Tables.


For more information about the Low Fragmentation Heap, see:

Sunday, June 30, 2013

Heap Debugging Tricks

On Windows, the LFH (Low Fragmentation Heap) is commonly used to dynamically allocate small chunks of memory (<16KB in size). Many programs use the LFH as it is intended for high performance allocation/free of small objects, even in a multithreaded environment. Debugging memory corruptions on the heap can often be complex, but the following Windbg tricks may help:
  • !heap -p -a MEMORY_ADDRESS
This command is extremely useful when tracking down Use-After-Free vulnerabilities in applications. When run on a memory address, there are 2 relevant scenarios:

1) The memory address is inside a current allocation: Windbg displays the call stack that led to the allocation.


2) The memory address is inside an allocation that was freed: Windbg displays the call stack that led to the free.



Using this command requires the page heap to be enabled. Page heap can be enabled per application using the gflags.exe utility that is distributed in the Windbg package.


  • Caller Based Conditional Breakpoint

This Caller Based Conditional Breakpoint is very useful when one is interested in the place where an object gets allocated/freed, especially when the allocation/free codepath is very “hot” (meaning it is executed very often). The idea behind this breakpoint is to only break when a certain function (MemberFunction in the example below) is on the call stack (i.e., only break if a certain function called this function directly or indirectly). If a breakpoint is placed on the constructor/destructor of the object’s class, and the object is allocated/freed very often, the breakpoint could get hit many more times than a human can reasonably look at. Placing a Caller Based Conditional Breakpoint in the constructor/destructor usually greatly reduces the number of hits, allowing a human to reasonably investigate each one. The Caller Based Conditional Breakpoint is of the following format:
bp Module!MyFunctionWithConditionalBreakpoint "r $t0 = 0;.foreach (v { k }) { .if ($spat(\"v\", \"*Module!ClassA::MemberFunction*\")) { r $t0 = 1;.break } }; .if($t0 == 0) { gc }"


  • Register Watching Breakpoint

The Register Watching Breakpoint is a slight variation on the Caller Based Conditional Breakpoint. Instead of actually breaking when the appropriate caller is on the call stack, the command is modified to just print the CPU registers (using the ‘r’ command). Very often in destructors, the address of the object being destroyed is in one of the CPU registers. The end result is that each time the Register Watching Breakpoint is hit, the CPU registers are printed rather than execution breaking. Since computers are deterministic, if no entropy is introduced into the program’s execution, one can count the number of times the breakpoint was hit before the object of interest was freed, predict it the next time the program is run, and eventually get a live debugging session that is watching the object as it is being freed. The Register Watching Breakpoint is of the following format:

bp Module!MyFunctionWithConditionalBreakpoint "r $t0 = 0;.foreach (v { k }) { .if ($spat(\"v\", \"*Module!ClassA::MemberFunction*\")) { r; r $t0 = 1; gc; } }; .if($t0 == 0) { gc }"


References: