Format string vulnerabilities are a class of vulnerabilities in which a program takes a format string supplied by the user and passes it to one of the many C Run-Time (CRT) formatting functions, such as printf and snprintf. Windows includes several mitigations for this class of vulnerabilities, as we will see shortly.
When we call a function, the parameters are pushed onto the stack in reverse order, so the last parameter is pushed first. For example, if we call:
function(x, y, z);
The expected order of the push operations in the code emitted by the compiler would be semantically equivalent to the following:
push z
push y
push x
A test program written in C receives the format string “%x %x %x” as its only command-line argument, which the program reaches through argv, and passes it to _snprintf. The call to _snprintf in our C code generates the following assembly code:
_snprintf(buffer, BUFFER_SIZE-1, formatString); //vulnerable function call
004116F9 mov eax,dword ptr [formatString]
004116FC push eax
004116FD push 7Fh
004116FF mov ecx,dword ptr [buffer]
00411702 push ecx
00411703 call dword ptr [__imp___snprintf (417244h)]
Using dynamic analysis, the values of the eax and ecx registers have been confirmed to be pointers to the format string taken from argv and to buffer, respectively. The pattern in the above assembly code is to push the third argument (formatString), followed by the second argument (the constant 7Fh), followed by the first argument (buffer). This matches the general function call mechanics described above.
The information disclosure issue occurs when the attacker-controlled format string (in this case, “%x %x %x”) instructs _snprintf to look for more arguments than we have actually passed to it. As you can see above, we don’t pass any arguments after the format string, and _snprintf does not verify that the number of arguments matches what the format string expects. Any additional arguments passed after the formatString parameter would have been pushed onto the call stack before formatString itself. Since the arguments are pushed in the reverse of the order in which the function call lists them, _snprintf looks for the first value to print one DWORD “under” the format string pointer on the call stack, which is one DWORD higher in memory (remember that the x86 call stack grows downwards in memory).
My call stack at the point that we enter _snprintf looks like this:
The next line after the call to _snprintf in the test program was:
printf("buffer: %s\n", buffer);
and it printed:
buffer: abababab cdcdcdcd efefefef
This output means that we are able to “spy” on the thread’s call stack. Using this technique, we can see other data that might have been stored on the call stack.
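The reason _snprintf cannot detect the missing arguments is easier to see in a small variadic function. The sketch below is an illustration only: sum_by_spec is a hypothetical helper, not part of the CRT or of the test program. It mimics how a printf-style function walks its arguments based purely on what the format string claims.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical printf-like helper (an assumption for illustration):
 * adds up one int argument for every 'd' character found in spec.
 * Like the printf family, it trusts spec to describe the arguments. */
int sum_by_spec(const char *spec, ...)
{
    va_list args;
    int total = 0;

    assert(spec != NULL);

    va_start(args, spec);
    for (const char *p = spec; *p != '\0'; ++p) {
        if (*p == 'd') {
            /* No check is possible here. If the caller passed fewer ints
             * than spec claims, va_arg simply reads whatever comes next,
             * which is undefined behavior. */
            total += va_arg(args, int);
        }
    }
    va_end(args);

    return total;
}
```

A well-formed call such as sum_by_spec("ddd", 1, 2, 3) reads exactly the three values that were passed. A call with the same spec but no further arguments would compile as well, and the function would read three values the caller never supplied, which is the same situation as “%x %x %x” with no arguments.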
A more advanced attack of this nature has been documented that allows the attacker to change the program state rather than just spy on it. This attack uses the %n sequence in the format string. In the context of format strings, %n means: write the number of characters written so far to the address supplied by the argument that corresponds to the %n sequence. Since the attacker controls how many characters are written, they control the value that is written. A carefully crafted malicious format string can also control where this value is written, depending on the number of %x sequences that precede the %n sequence. This ability can be leveraged to overwrite function pointers, return addresses or other sensitive data.
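The way %n turns output length into a stored value can be shown with a benign, well-formed call. The sketch below is an assumption-laden illustration, not the attack itself: count_written_by_n is a hypothetical helper, the format string is a constant, and a valid pointer is passed for %n. It uses standard snprintf; with the Microsoft CRT, %n is rejected by default unless _set_printf_count_output(1) has been called first.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (an assumption for illustration): returns the value
 * that %n stores after one hex conversion padded to `width` characters. */
int count_written_by_n(int width)
{
    char scratch[64];
    int written = -1;

    assert(width >= 1 && width < (int)sizeof scratch);

    /* %*x pads the value 1 to `width` characters. %n then stores the
     * number of characters produced so far into `written`. */
    snprintf(scratch, sizeof scratch, "%*x%n", width, 1u, &written);

    return written;
}
```

Changing the field width changes the stored value, which is how an attacker who controls the format string also controls the number that %n writes.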
In order to make this highly contrived example work, several mitigations had to be bypassed. The steps taken to bypass them were as follows:
1. Call the CRT function _set_printf_count_output() with a parameter of 1, which enables support for %n
2. Disable ASLR in the Visual Studio project settings (the /DYNAMICBASE:NO linker option)
3. Disable the Buffer Security Check (/GS-) in the Visual Studio compiler
4. Disable Run-Time Error Checks (/RTC) in the Visual Studio compiler
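Independent of the platform mitigations listed above, the vulnerable call can be written so that user input is never parsed as a format string. The following is a minimal sketch under stated assumptions: copy_untrusted is a hypothetical helper name, and it uses standard C99 snprintf rather than the _snprintf call from the test program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not from the original test program): copies
 * untrusted text into dst without interpreting it as a format string.
 * Returns the length of the untrusted text, as snprintf reports it. */
int copy_untrusted(char *dst, size_t dst_size, const char *untrusted)
{
    assert(dst != NULL && untrusted != NULL && dst_size > 0);

    /* Vulnerable pattern, shown only as a comment:
     *     snprintf(dst, dst_size, untrusted);
     * There the input itself would be parsed for %x, %n and so on. */

    /* Safer pattern: the format string is a constant, and the untrusted
     * text is passed as plain data for the %s conversion. */
    return snprintf(dst, dst_size, "%s", untrusted);
}
```

With this pattern, an input of “%x %x %x” ends up in the buffer as those literal eight characters, and no stack contents are read.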