Format string vulnerabilities are
a class of vulnerabilities in which the programmer takes a string supplied
by the user and passes it as the format string to one of the many C Run-Time (CRT)
functions that accept one, such as printf and snprintf. Windows includes several
mitigations for this class of vulnerabilities, as we will see shortly.
When we call a function in 32-bit x86 code using the default C calling
convention, the parameters are pushed onto the stack from right to left, so the
last parameter is pushed first. For example, if we call:
function(x, y, z);
The expected order of the push
operations in the code emitted by the compiler would be semantically equivalent
to the following:
push z
push y
push x
A test program written in C calls
_snprintf and receives the format string “%x %x %x” as its only command-line
argument, which the program reads through argv[1]. The call to
_snprintf in our C code generates the following assembly code:
_snprintf(buffer, BUFFER_SIZE-1, formatString); //vulnerable function call
004116F9  mov   eax,dword ptr [formatString]
004116FC  push  eax
004116FD  push  7Fh
004116FF  mov   ecx,dword ptr [buffer]
00411702  push  ecx
00411703  call  dword ptr [__imp___snprintf (417244h)]
Using dynamic analysis, the
values of the eax and ecx registers have been confirmed to be pointers to
argv[1] and buffer, respectively. The pattern in the above assembly code is to
push the third argument (formatString), followed by the second argument (the
constant 7Fh, which is BUFFER_SIZE-1), followed by the first argument (buffer).
This matches the general function call mechanics described above.
The information disclosure issue
occurs when the attacker-controlled format string (in this case, “%x %x %x”)
instructs _snprintf to look for more arguments than we have actually passed to
it. As you can see above, we don’t pass any arguments after the format
string. _snprintf does not verify that we have passed as many arguments as the
format string expects. Any additional arguments passed after the formatString
parameter would have been pushed onto the call stack before the format string
pointer was pushed. Since the arguments are pushed in the reverse of the order
in which the function call lists them, _snprintf looks for the first value to
print 1 DWORD “under” the format string pointer on the call stack, that is, at
an address 1 DWORD higher in memory than the slot holding the format string
pointer (remember that the x86 call stack grows downwards in memory). With no
such arguments present, _snprintf prints whatever data happens to be there.
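
To make the mismatch concrete, here is a minimal sketch in C that counts how many variadic arguments a format string asks for. The helper name count_format_args is hypothetical and not part of the test program or the CRT, and the parsing is deliberately simplified (it knows only the common flags, width, precision and length modifiers). The point is that the count comes entirely from the string, which is why an attacker-supplied string decides how far the formatter reads.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a CRT function): returns the number of variadic
   arguments a printf-style format string would consume. Simplified parser. */
size_t count_format_args(const char *fmt)
{
    size_t count = 0;

    while (*fmt != '\0') {
        if (*fmt++ != '%')
            continue;               /* ordinary character, nothing to fetch */

        if (*fmt == '%') {          /* "%%" prints a literal percent sign */
            fmt++;
            continue;
        }

        /* Skip flags, width, precision and length modifiers.
           A '*' width or precision consumes an argument of its own. */
        while (*fmt != '\0' && strchr("-+ #0123456789.*hlLjzt", *fmt) != NULL) {
            if (*fmt == '*')
                count++;
            fmt++;
        }

        if (*fmt != '\0') {         /* conversion character: x, d, s, n, ... */
            count++;
            fmt++;
        }
    }
    return count;
}
```

For the string used in this article, count_format_args("%x %x %x") returns 3, while the call in the test program supplies 0 arguments after the format string. Those three missing values are the stack slots that end up being read.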
The call stack at the point that we enter _snprintf looks like this:
The next line after the call to _snprintf in the
test program was:
printf("buffer: %s\n", buffer);
and it printed:
buffer: abababab cdcdcdcd efefefef
This output means that we are
able to “spy” on the thread’s call stack. Using this technique, we can see
other data that might have been stored on the call stack.
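
The disclosure depends on the user's text being used as the format argument itself. As a point of comparison, here is a minimal sketch in which the same text is passed as data to a constant "%s" format instead. The helper name format_untrusted is hypothetical, and the sketch assumes the standard C99 snprintf rather than the _snprintf used by the test program.

```c
#include <stdio.h>

/* Hypothetical helper: copies untrusted text into dst without letting the
   text act as a format string. Returns what snprintf returns, i.e. the
   length of the untrusted text (excluding the terminating NUL). */
int format_untrusted(char *dst, size_t dst_size, const char *untrusted)
{
    /* Vulnerable pattern from the test program, shown only for contrast:
           _snprintf(dst, dst_size - 1, untrusted);
       Here the format string is a constant, so any "%x" inside the
       untrusted text is copied literally and never interpreted. */
    return snprintf(dst, dst_size, "%s", untrusted);
}
```

Called with the article's input “%x %x %x”, this sketch leaves those eight characters in the buffer unchanged, and no stack values are read.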
A more advanced attack of this nature
has been documented which would allow the attacker to change the program state
rather than just spy on it. This attack involves using the %n sequence in the
format string. In the context of format strings, %n means: write the number
of characters written so far to the integer whose address is given by the
argument corresponding to the %n sequence. Since the attacker
controls how many characters are written, they control the value that is written. A
carefully crafted malicious format string can also control where this value is
written, depending on the number of %x sequences in the format string before
the %n sequence. This ability can be leveraged to overwrite function pointers,
return addresses, or other sensitive data.
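
The effect of %n can be seen in a benign setting. The sketch below passes a constant format string and a legitimate pointer, so nothing is being exploited. It only shows that the stored value follows the number of characters produced, which the field width controls. The function name value_written_by_percent_n is hypothetical, and the sketch assumes a C library that honors %n with the standard snprintf. The Microsoft CRT rejects %n unless _set_printf_count_output(1) has been called first, as discussed in the list of mitigations.

```c
#include <stdio.h>

/* Hypothetical demo: formats one hex value padded to `width` characters,
   then uses %n to store the running character count in `written`.
   Returns that count, or -1 if width is outside the range handled here. */
int value_written_by_percent_n(int width)
{
    char out[128];
    int written = -1;

    /* 0x41 prints as "41", so widths below 2 would not pad at all;
       the upper bound keeps the output inside the local buffer. */
    if (width < 2 || width >= (int)sizeof out)
        return -1;

    /* "%*x" takes the field width from the argument list; "%n" then
       writes the number of characters produced so far into `written`. */
    snprintf(out, sizeof out, "%*x%n", width, 0x41u, &written);
    return written;
}
```

With a width of 10 the function returns 10, and with a width of 100 it returns 100. An attacker who controls the format string chooses the width in the same way, and so chooses the value that %n stores.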
In order to make this highly
contrived example work, many mitigations had to be disabled. The steps taken
were as follows:
1. Call the CRT function _set_printf_count_output() with a parameter of 1, which enables %n support (it is disabled by default)
2. Disable ASLR in the Visual Studio project settings (the /DYNAMICBASE linker option)
3. Disable Buffer Security Check (/GS) in the Visual Studio compiler
4. Disable Run-Time Error Checks (/RTC) in the Visual Studio compiler