At the heart of MWR InfoSecurity lies our research and development platform - MWR Labs. Here we dissect industry news and trends, publish research, and share our tools with the security community
This article is the first of two providing a basic overview of a number of dynamic hooking techniques. Part 1 covers techniques that can be used in user mode and part 2 will cover techniques that work in kernel mode.
Hooking is the process of intercepting a program’s execution at a specific point in order to take another action. This may be simply to trace execution at interesting points, or to redirect and modify execution. At the point at which you wish to intercept, you place your hook.
A hook can be placed by modifying a program’s code on disk or even building it in at compile time, but this article focuses on dynamic hooking, where hooks are placed at runtime in memory. This allows hooks to be applied to any software whether we have the source code or not and whether we are able to modify the program’s files or not.
This article assumes that you have sufficient access to modify a program’s memory to place a hook. Practically speaking, this could be through attaching a debugger, injecting code into a process’s memory space or by having compromised an application by exploiting a vulnerability.
Hooking can be useful for a wide range of purposes, including:
The simplest way to visualise a hook is to consider a simple example. If you wish to hook a specific call of a function in a program, the target address of that call instruction can be changed to point to your own code which you have somehow injected into the program’s memory space. You can then choose whether to handle the call completely yourself, possibly providing a fake result, or you may choose to redirect execution to the original function, perhaps also changing the arguments or altering the return value.
So, while originally the call would look like this:
    ...
    ...
    call func1
    ...
    ...

func1
    ...
    return
It is altered to look like this:
    ...
    ...
    call hook
    ...
    ...

hook
    ...
    optional call to original
    ...
    return

func1
    ...
    return
If you wanted to use the same hook function for multiple hooks, you could create a lookup table of return addresses and check this at the start of the hook to work out where the hook call originated, and then take a different action depending on this information. Of course, this assumes fixed addresses: on a platform with technologies like ASLR enabled, load addresses may be randomised, so a table of hard-coded addresses will not work as-is. Nevertheless, this simple example demonstrates what a hook is.
The Import Address Table (IAT) is part of a PE executable and is loaded into memory with it. It holds the memory addresses of functions imported from DLLs, and the program's calls to those functions go through it. By locating the IAT in memory, you can patch it to redirect certain API function calls to hooking code.
IAT hooking happens in user mode and is a relatively easy way to hook every call to a specific API function or set of functions. Any time the program makes that API call through the IAT, your hook code will run. However, DLLs can also be loaded dynamically at runtime, and when this happens there will be no IAT entry for the functions in that DLL. Furthermore, while the fact that this is a user mode technique makes it easier to implement, it also makes it easier to detect. As such, IAT hooking is not without its limitations.
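The idea of patching an address slot can be sketched in portable C. This is a simplified model, not real PE parsing: the table layout, the names ApiDouble and ApiNegate, and every function below are hypothetical stand-ins. A real IAT holds only addresses (the names live in separate import structures), and on Windows the page holding it is typically read-only, so a real patcher would first have to change its protection.

```c
#include <stddef.h>
#include <string.h>

typedef int (*api_fn)(int);

/* Hypothetical stand-in for one import: a name plus the address slot
 * that the program calls through. */
struct import_entry {
    const char *name;
    api_fn      address;
};

/* Stand-ins for functions exported by a DLL. */
static int api_double(int x) { return x * 2; }
static int api_negate(int x) { return -x; }

enum { SLOT_DOUBLE, SLOT_NEGATE, SLOT_COUNT };

static struct import_entry import_table[SLOT_COUNT] = {
    [SLOT_DOUBLE] = { "ApiDouble", api_double },
    [SLOT_NEGATE] = { "ApiNegate", api_negate },
};

/* Program code never calls api_double directly; it always goes through
 * the table slot, which is what makes the call hookable. */
int program_calls_double(int x)
{
    return import_table[SLOT_DOUBLE].address(x);
}

/* Replace the slot for `name` with `hook`. Returns the address that was
 * there before, or NULL if no such import exists. */
api_fn patch_import(const char *name, api_fn hook)
{
    for (size_t i = 0; i < SLOT_COUNT; i++) {
        if (strcmp(import_table[i].name, name) == 0) {
            api_fn previous = import_table[i].address;
            import_table[i].address = hook;
            return previous;
        }
    }
    return NULL;
}

static api_fn original_double;  /* saved so the hook can forward the call */
int hook_call_count;            /* how many times the hook has run */

/* The hook: count the call, forward to the original, alter the result. */
static int hook_double(int x)
{
    hook_call_count++;
    return original_double(x) + 1;
}

/* Returns 1 if the hook was installed, 0 if the import was not found. */
int install_double_hook(void)
{
    api_fn previous = patch_import("ApiDouble", hook_double);
    if (previous == NULL)
        return 0;
    original_double = previous;
    return 1;
}
```

After install_double_hook() succeeds, every call made through the ApiDouble slot reaches hook_double first. A function whose address was obtained some other way never passes through the table and so is not hooked, which mirrors the limitation with dynamically loaded DLLs.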
Under Unix-like operating systems (including Linux), calls to shared libraries can also be hooked by making use of the LD_PRELOAD environment variable. You write your own shared library that defines functions with the same names as the ones you wish to hook, and load that library at runtime with the target program. The LD_PRELOAD environment variable specifies a list of libraries to load first when a program is executed, so if you put the path to your library in this variable, your own function will run instead.
If you also wish to redirect execution to the original function you are hooking, you can do so with the dlsym() function call, which resolves a function name in a module to a memory address. Passing the RTLD_NEXT handle makes dlsym() search the libraries loaded after yours, so it finds the original function rather than your hook. Using this, the address of the original function can be located and used to make a call to it.
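As a minimal sketch of such a preload library, the code below hooks the C library's rand(). The choice of rand(), the clamping to a single digit, and the helper rand_hook_calls() are assumptions made purely for illustration. It assumes glibc, where RTLD_NEXT is only visible when _GNU_SOURCE is defined before any header is included.

```c
#define _GNU_SOURCE   /* exposes RTLD_NEXT on glibc; must precede all includes */
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

static int hook_calls;   /* number of times the hook has run */

/* Same name and signature as the C library's rand(). Because this
 * definition is found first, callers bind to it instead of the original. */
int rand(void)
{
    static int (*real_rand)(void);

    if (real_rand == NULL) {
        /* RTLD_NEXT: look for "rand" in the objects that come after the
         * one containing this code, which is where the original lives. */
        real_rand = (int (*)(void))dlsym(RTLD_NEXT, "rand");
        if (real_rand == NULL) {
            const char *err = dlerror();
            fprintf(stderr, "hook: cannot resolve rand: %s\n",
                    err ? err : "symbol not found");
            abort();
        }
    }

    hook_calls++;
    /* Forward to the original, then alter its return value. */
    return real_rand() % 10;
}

/* Hypothetical helper so a caller can see how often the hook ran. */
int rand_hook_calls(void)
{
    return hook_calls;
}
```

Assuming the source is saved as randhook.c (a hypothetical name), it could be built with something like `cc -shared -fPIC -o librandhook.so randhook.c -ldl` and applied with `LD_PRELOAD=./librandhook.so ./target`. The target's calls to rand() then return only values from 0 to 9.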
An inline hook overwrites the start of the function you want to hook in order to redirect execution. This makes it easy to catch every call to that function, no matter where or when the call happens. Inserting an inline hook destroys the first few instructions of the original function, so if you wish to call the original function as well, the hook code must compensate for the instructions that were overwritten. An inline hooker should therefore save those instructions when placing a hook. This may not be trivial on architectures like x86, where instructions are variable length: the overwritten bytes may end part-way through an instruction, so saving just those bytes may not preserve the meaning of the code. In such cases, automated inline hookers may need to disassemble the start of the function properly to preserve the original function’s meaning.
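The byte-level mechanics can be sketched on a plain buffer standing in for code memory. This is an illustration only, assuming 32-bit x86 and the five-byte `jmp rel32` encoding (opcode E9); the struct and function names are hypothetical. A real hooker would also have to make the code page writable and deal with instruction boundaries.

```c
#include <stdint.h>
#include <string.h>

#define JMP_REL32_LEN 5   /* opcode E9 followed by a 32-bit displacement */

struct inline_hook {
    uint32_t target;                /* address of the hooked function */
    uint8_t  saved[JMP_REL32_LEN];  /* original bytes that were overwritten */
};

/* Encode a `jmp rel32` that sits at address `from` and lands on `to`.
 * The displacement is measured from the end of the jump instruction. */
void encode_jmp_rel32(uint8_t out[JMP_REL32_LEN], uint32_t from, uint32_t to)
{
    uint32_t rel = to - (from + JMP_REL32_LEN);  /* wraps modulo 2^32 */

    out[0] = 0xE9;
    out[1] = (uint8_t)(rel);
    out[2] = (uint8_t)(rel >> 8);
    out[3] = (uint8_t)(rel >> 16);
    out[4] = (uint8_t)(rel >> 24);
}

/* `code` models the bytes found at address `func_addr`. Save the bytes
 * about to be destroyed, then overwrite them with a jump to the hook. */
void place_inline_hook(struct inline_hook *h, uint8_t *code,
                       uint32_t func_addr, uint32_t hook_addr)
{
    h->target = func_addr;
    memcpy(h->saved, code, JMP_REL32_LEN);
    encode_jmp_rel32(code, func_addr, hook_addr);
}

/* Put the saved bytes back, removing the hook. */
void remove_inline_hook(const struct inline_hook *h, uint8_t *code)
{
    memcpy(code, h->saved, JMP_REL32_LEN);
}
```

If the function begins with `55 8B EC 83 EC 10` (push ebp; mov ebp, esp; sub esp, 0x10), the five saved bytes end in the middle of the three-byte `sub` instruction. Replaying only those five bytes elsewhere would therefore not reproduce the original code, which is why a hooker may need a disassembler to find a safe boundary.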
Inline hooking can be used in user mode, although similar techniques can also work in kernel mode.
Microsoft released a framework called Detours to help in placing inline hooks on Win32 API functions. Microsoft actually places a two-byte dummy instruction that does nothing at the start of functions (the instruction is “MOV EDI, EDI”) to allow space to overwrite with a jump instruction harmlessly. The Detours package provides an API to enable custom hooks to be placed on API functions in this way.
In fact, the dummy instruction is only big enough to replace with a two-byte short jump. A long jump (to code further away) takes five bytes, so it would not fit and would still overwrite part of the function. The intention is that the short jump targets the five bytes immediately before the function, which are set aside as spare padding space. In these five bytes, the long jump can be placed to redirect to your hook code.
The advantage of using the short jump first is that whether in its “MOV EDI, EDI” form or its short jump form, those two bytes are always a single instruction. This means that, if multiple threads are executing the function as you are hooking it, a thread will never end up executing from the middle of an instruction you just inserted for your hook; rather it will either hit your new short jump, or already be beyond it, safely executing the rest of the function.
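The two-step patch described above can be sketched on a byte buffer. The encodings are those of 32-bit x86: “MOV EDI, EDI” is 8B FF, a short jump is EB followed by an 8-bit displacement, and a long jump is E9 followed by a 32-bit displacement. The function name, the buffer model and the addresses are hypothetical, and the sketch assumes the five bytes before the function really are unused padding.

```c
#include <stddef.h>
#include <stdint.h>

#define PAD_LEN 5   /* spare bytes set aside before the function */

/* `image` models memory starting at address `image_base`; the function
 * to hook begins at image[func_off]. Returns 0 on success, or -1 if the
 * function does not have the expected patchable layout. */
int hotpatch(uint8_t *image, size_t func_off,
             uint32_t image_base, uint32_t hook_addr)
{
    if (func_off < PAD_LEN)
        return -1;                      /* no room for the padding */

    uint8_t *func = image + func_off;
    uint8_t *pad  = func - PAD_LEN;

    /* Only patch functions that start with MOV EDI, EDI (8B FF). */
    if (func[0] != 0x8B || func[1] != 0xFF)
        return -1;

    /* Step 1: write the long jump into the padding. Nothing executes
     * these bytes yet. The jump ends exactly at the function's first
     * byte, so its displacement is relative to the function address. */
    uint32_t func_addr = image_base + (uint32_t)func_off;
    uint32_t rel = hook_addr - func_addr;   /* wraps modulo 2^32 */

    pad[0] = 0xE9;
    pad[1] = (uint8_t)(rel);
    pad[2] = (uint8_t)(rel >> 8);
    pad[3] = (uint8_t)(rel >> 16);
    pad[4] = (uint8_t)(rel >> 24);

    /* Step 2: swap the two-byte no-op for a short jump back to the
     * padding. The displacement is counted from the end of the short
     * jump (func + 2) to func - 5, which is -7, or 0xF9 as a byte.
     * In a live process both bytes would need to be written with a
     * single atomic store; this model writes them one after another. */
    func[0] = 0xEB;
    func[1] = 0xF9;

    return 0;
}
```

Writing the long jump first means the function is never left in a half-patched state: until the final two-byte write it still begins with the harmless “MOV EDI, EDI”, and afterwards it begins with a complete short jump.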