x86 : Function Calling Conventions
What are calling conventions?
Calling conventions are a standardized protocol for invoking subroutines (also called functions). Each platform expects compilers to follow certain rules when generating code: how arguments are passed to a subroutine, how the stack is set up, where execution returns once the subroutine completes, who cleans up the stack, and so on. These rules make up a calling convention. In short, a calling convention specifies how a compiler sets up access to a subroutine. Different compilers can choose to generate code in different ways, and in theory code from any compiler can be interfaced together, as long as the functions all use the same calling convention.
While debugging, this knowledge sometimes becomes crucial for identifying certain defects. For example, diagnosing a stack overflow may require knowing which calling convention was used, and therefore how the overflow happened. I would encourage the reader to follow this article and its code examples thoroughly, since we will use this knowledge in many future crash dump analysis scenarios.
Calling conventions specify how arguments are passed to a function, how return values are passed back out of a function, how the function is called, and how the function manages the stack and its stack frame. We will discuss all of these in greater detail shortly. In short, the calling convention specifies how a function call in C or C++ is converted into assembly language. 
There are many ways for this translation to occur, and each compiler can choose to do it slightly differently, which is why it is so important to specify standard methods. If these standard conventions did not exist, it would be nearly impossible for programs created using different compilers to communicate and interact with one another.
Major calling conventions that are used with the C language are: 
- STDCALL
- CDECL
- FASTCALL
In addition, there is another calling convention typically used with C++: 
- THISCALL
Older programming languages used other conventions, for example:
- PASCAL
- FORTRAN
These are not widely used any more, so I will limit the discussion to the C and C++ conventions.
Argument passing
A function can take more than one argument, and the order of those arguments matters for the execution of the function. For example:
int foo(int a, int b);
A function with this signature takes two arguments, and the generated assembly must pass them one after the other in the right order. If the arguments are passed left to right, the assembly would look like:
push a
push b
call foo
However, if they are passed right to left, the assembly would look like:
push b
push a
call foo
This is crucial. Suppose foo is part of a library compiled on the assumption that arguments are passed left to right, and the library is then used with a compiler and linker pair that assumes right to left. The function foo will then receive its arguments in the wrong order, which could be catastrophic.
Return Value
Some functions return a value, and that value must be received reliably by the function's caller. The called function places its return value in a place where the calling function can get it when execution returns. The called function stores the return value before executing the assembly ret instruction.
Caller - The Calling function
The "parent" function that calls the subroutine. Execution resumes in the calling function directly after the subroutine call, unless the program terminates inside the subroutine.
Callee - Called function
The "child" function that gets called by the "parent."
Stack Cleanup
Arguments are passed to a function by putting them on the stack. This is called 'pushing'. Arguments that have been pushed onto the stack must eventually be removed from it, an operation called 'popping'. Whichever function is responsible for cleaning the stack, the caller or the callee, must also reset the stack pointer to eliminate the passed arguments.
Name Decoration 
When C code is translated to assembly code, the compiler will often "decorate" the function name by adding extra information that the linker uses to find and link to the correct functions. For most calling conventions the decoration is very simple (often only an extra symbol or two to denote the calling convention), but in some extreme cases (notably C++ member functions using the THISCALL convention) the names are "mangled" severely. This is done because C++ has function overloading, so two functions with the same name might have different implementations. Each compiler has its own mangling rules, and it is possible to identify the compiler used to generate the assembly by looking at the mangled names. To understand the different decoration methods, please refer to this article here.
Prologues - Entry sequence
The prologue is the few instructions at the beginning of a function, which prepare the stack and registers for use within the function.
Epilogues - Exit sequence
A set of a few instructions at the end of a function, which restore the stack and registers to the state expected by the caller, and return to the caller. Some calling conventions clean the stack in the exit sequence.
Call sequence
A few instructions in the middle of a function (the caller) which pass the arguments and call the called function. After the called function has returned, some calling conventions have one more instruction in the call sequence to clean the stack.
Let us now examine how the assembly looks when a function is defined using these different conventions. As mentioned earlier, we will only look into the C and C++ conventions here.
Standard C Calling Conventions
Note: All code examples and assembly listings assume 32-bit x86, which means that pointers are 4 bytes.
CDECL
This is the default calling convention for the C programming language. Other calling conventions are also used, but those are not part of standard ANSI C.
Assume this simple function below: 
int __cdecl foo(int a, int b)
{
    return (a+b);
}
Also let's assume that it is used in another function (the caller) as follows:
nRet = foo(1, 2);
This will produce the following assembly listings (more or less).
_foo:
    push ebp
    mov ebp, esp
    mov eax, [ebp + 8]
    mov edx, [ebp + 12]
    add eax, edx 
    pop ebp
    ret
For the caller it will look like :
push 2
push 1
call _foo
add esp, 8
When translated to assembly code, CDECL functions are almost always prepended with an underscore (that's why all previous examples have used "_" in the assembly code). To learn more about decorations please refer to this article here.
So we see that in the CDECL calling convention the following holds:
- Arguments are passed by pushing them onto the stack in right-to-left order.
- The return value is passed in eax.
- The calling function cleans the stack. This allows CDECL functions to have variable-length argument lists (variadic functions). For this reason the compiler does not append the number of arguments to the function name, and the assembler and linker are therefore unable to determine whether an incorrect number of arguments is used. Variadic functions such as printf, which use va_start() and va_arg(), use this convention.
We see that the caller adjusts the stack by adding 8 to ESP. This is because two 4-byte arguments were pushed onto the stack. The following might make it easier to understand:
Once the function prologue has run, the following offsets from EBP hold:
EBP + 0  : saved EBP of the caller
EBP + 4  : return address
EBP + 8  : first parameter
EBP + 12 : second parameter
EBP - 4  : first local variable (if any)
That is why we see that in the above assembly, the values pushed into the stack are accessed by :
    mov eax, [ebp + 8]
    mov edx, [ebp + 12]
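EBP is the frame pointer (frame-based model). It marks the base of the current stack frame: the function's locals sit below it at negative offsets, while the saved EBP, the return address, and the arguments sit at and above it.
Another register used for stack operations is ESP, the stack pointer. It always points to the top of the stack, that is, the most recently pushed value. A stack overflow shows up as a fault (typically a page fault) when the stack grows past the memory reserved for it.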
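For the above function, which takes two arguments, the stack looks like this once the prologue has run (higher addresses at the top):
         2           [EBP + 12]
         1           [EBP + 8]
         RET ADDR    [EBP + 4]
EBP  ->  saved EBP   [EBP]
ESP  ->  (same slot as EBP, since foo has no locals)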
That brings us to the question: why do we not see a push for the return address when we examine a disassembly? This is because the CALL instruction itself pushes the return address onto the stack before jumping to the function. The caller's EBP is saved separately, by the push ebp instruction in the callee's prologue.
Now let's have a look at the same function and its call when we use STDCALL instead.
STDCALL
STDCALL, also known as "WINAPI" (and a few other names, depending on where you are reading about it), is used almost exclusively by Microsoft as the standard calling convention for the Win32 API. Since STDCALL is strictly defined by Microsoft, all compilers that implement it do so the same way.
Consider the function:
int __stdcall foo(int a, int b)
{
    return (a+b);
}
...and it's called as:
nRet = foo(1, 2);
The corresponding assembly code might look similar to this:
_foo@8:
    push ebp
    mov ebp, esp
    mov eax, [ebp + 8]
    mov edx, [ebp + 12]
    add eax, edx
    pop ebp
    ret 8
and the caller looks like:
push 2
push 1
call _foo@8
So we see that in the STDCALL calling convention the following holds:
- The called function cleans the stack. In the function body, the ret instruction takes an (optional) operand that indicates how many bytes to pop off the stack when the function returns, which is why the caller has no add esp, 8 after the call.
- As with CDECL, the arguments are passed by pushing them onto the stack in right-to-left order, and the return value is passed in eax.
- STDCALL functions are name-decorated with a leading underscore, followed by an @ and then the number of bytes of arguments passed on the stack. This number is always a multiple of 4 on a 32-bit machine. More about name decorations can be found here.
FASTCALL
The FASTCALL calling convention is not completely standard across all compilers, so it should be used with caution. In FASTCALL, the first 2 or 3 arguments of 32 bits or smaller are passed in registers. Microsoft compilers use ecx and edx; other compilers have used eax, edx, and ecx. Additional arguments, and arguments larger than 4 bytes, are passed on the stack, often in right-to-left order (similar to CDECL). Which side cleans the stack also varies between compilers; in Microsoft's implementation the called function does.
Because of the ambiguities, it is recommended that FASTCALL be used only for functions with 1, 2, or 3 32-bit arguments, where speed is essential. Consider the function:
int __fastcall foo(int a, int b)
{
    return(a + b);
}
and the caller looks like :
nRet = foo(1, 2);
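The assembly for this pair might look similar to the following, assuming the Microsoft register assignment (ecx, then edx), which goes with the @foo@8 decoration:
@foo@8:
    push ebp
    mov ebp, esp ;many compilers create a stack frame even if it isn't used
    mov eax, ecx ;a is in ecx
    add eax, edx ;b is in edx, the sum is returned in eax
    pop ebp
    ret
and the caller's code would look like:
mov ecx, 1
mov edx, 2
call @foo@8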
Many compilers still produce a stack frame for FASTCALL functions, especially in situations where the FASTCALL function itself calls another subroutine. However, if a FASTCALL function doesn't need a stack frame, optimizing compilers are free to omit it.
So we see that in the FASTCALL calling convention the following holds:
- In the function body, the ret instruction has an (optional) operand that indicates how many bytes to pop off the stack when the function returns. It covers only the arguments that were passed on the stack, so it is absent in the example above.
- Unlike CDECL or STDCALL, the first arguments are passed through the registers ecx and edx. This means they do not have to be loaded from the stack inside the function before being used, which is what we saw with CDECL and STDCALL. But what happens when there are more arguments than registers? In that case, once the registers run out, the remaining arguments are in fact pushed onto the stack as in the other calling conventions.
- The name decoration for FASTCALL prepends an @ to the function name and follows the function name with @x, where x is the number of bytes of arguments passed to the function. Please refer to more details here.
Standard C++ Calling Conventions
THISCALL
C++ requires that non-static methods of a class be called on an instance of the class. Thus there must be a mechanism in place to ensure that a pointer to the object is passed to the function. In THISCALL, the pointer to the class object is passed in ecx, the arguments are passed right-to-left on the stack, and the return value is passed in eax.
Assuming we have a class bar with the non-static member function foo(), a call to foo might look like:
barObj.foo(a, b);
Ignoring name mangling for the moment (which C++ applies by default), the function call would look like:
lea ecx, barObj ;ecx = address of barObj (the this pointer)
push b
push a
call _foo
To understand C++ name mangling please refer to this.
Here is an example of a C++ class declaring the member function foo, and how its name might get mangled.
class bar {
public:
    int foo(int a, int b);
};
bar barObj;
barObj.foo(1, 2);
The resultant mangled name (with Microsoft's compiler) might look like:
?foo@bar@@QAEHHH@Z
As mentioned in this article here, when actually debugging a binary without symbols, we get no help from a function's name in determining its calling convention. Such symbols are stripped from the binary unless public symbols are available. In such cases we need to use the other hints shown above to understand the true nature of a call. We will cover such symbol-less debugging in another article.
 