Anyone interested in code injection eventually encounters Process Hollowing, a technique that consists of creating a remote process in a suspended state, writing a payload into the remote process memory, and overwriting the entry point address with the address of the payload. Many articles online explain well how the technique works and how to implement it in C/C++ using a PE as the payload.
However, most articles about this technique leave out one specific thing: handling the import table of the injected PE. When Local Reflective Execution is performed, it is enough to iterate over the IAT and the delayed IAT, load the needed libraries, and resolve the required functions to fix the tables. The purpose of this blog post is to demonstrate how to fix the IAT and the delayed IAT remotely when a PE is injected into a remote process.
This article does not present a new evasion technique but an improvement of an old technique used to inject a PE into a remote process.
Basic Process Hollowing
This first section is a reminder of how to implement basic process hollowing with a PE that has no IAT, such as a meterpreter or Havoc payload. The article will not go into deep detail about basic process hollowing, since many articles already explain the technique better. I suggest reading the ired.team article about process hollowing if you want more details about the basics.
If you already know process hollowing, I suggest jumping directly to the chapter Make the remote process load the required libraries.
Definition
Process Hollowing is an injection technique that injects PE payloads into the address space of a remote process. The remote process is often a legitimate process created by the process hollowing implementation.
A typical process hollowing implementation generally creates a suspended process via the CreateProcess WinAPI and then calls NtUnmapViewOfSection to unmap the legitimate process image of the remote process. Once that’s done, NtMapViewOfSection is called to map the PE payload’s binary image instead.
However, in this article we won’t unmap the legitimate process image because:
the legitimate process image is required when we create remote threads
unmapping the main image of a process creates an IOC that is detected by most EDRs
Start a suspended process
The first step is pretty straightforward: create a process in a suspended state. The process needs to have the same architecture as the PE that we want to inject (x64 PE in an x64 process, x86 PE in an x86 process, etc.). For this blog post, the executable used as the legitimate process will be svchost.exe.
To create the process, the WinAPI function CreateProcessA will be used. We will write a small function that takes as arguments the name of the process to execute and a pointer to a process information structure, which CreateProcessA will initialize. This structure, pi, is used to retrieve the process handle and the main thread handle.
LoadPE and Retrieve NT Headers
For this article, a function that reads a PE file from disk and loads it into a byte array is used. Alternatives exist, such as:
embed the PE as a byte array in our code
retrieve the PE remotely from a web server
To perform process hollowing, the injected PE NT Header is needed.
For those who are unfamiliar with the PE format, I suggest reading the really good series of articles by 0xrick.
Here is a simple function to retrieve the NT Header from the injected PE content.
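As an illustration only (this is not the original code of the article), the lookup can be sketched in portable C on a raw byte buffer. The function name find_nt_headers_offset is hypothetical, and the sketch assumes a little-endian host. It relies on two facts of the PE format: the DOS header starts with "MZ" and stores the offset of the NT headers in its e_lfanew field at offset 0x3C, and the NT headers start with the signature "PE\0\0".

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the file offset of the NT headers inside a
 * raw PE buffer, or 0 if the buffer does not look like a PE file. */
size_t find_nt_headers_offset(const uint8_t *pe, size_t size)
{
    uint32_t e_lfanew;

    /* The DOS header is 64 bytes long and starts with "MZ". */
    if (pe == NULL || size < 0x40 || pe[0] != 'M' || pe[1] != 'Z')
        return 0;

    /* e_lfanew sits at offset 0x3C (little-endian host assumed). */
    memcpy(&e_lfanew, pe + 0x3C, sizeof e_lfanew);

    /* The 4-byte signature must fit inside the buffer. */
    if ((size_t)e_lfanew > size - 4)
        return 0;

    /* The NT headers begin with the signature "PE\0\0". */
    if (memcmp(pe + e_lfanew, "PE\0\0", 4) != 0)
        return 0;

    return (size_t)e_lfanew;
}
```

On Windows, the returned offset would typically be added to the buffer address and cast to PIMAGE_NT_HEADERS.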
Allocate Memory
Once the suspended process is created and the NT Header retrieved, we need to allocate memory on the remote process to store the payload.
The size of the injected PE image will be used to allocate memory.
Once the memory has been allocated, we need to compute the offset between the allocation address and the preferred Image Base Address of the PE, which is stored in the Optional Header. This offset will be used to patch the binary during the relocation phase.
In most articles, the allocation is performed at the Image Base Address of the legitimate process after it has been unmapped. Here, we prefer to leave the original memory of the process untouched and let the operating system decide where the allocation is made. The reason is that loading the missing libraries of the injected PE into the remote process requires creating remote threads, and the process crashes when we attempt to create a remote thread while its image is unmapped. The original image must therefore be left untouched.
Copy PE in target process
Once the memory has been allocated, it is possible to copy our PE in the target process.
First, the ImageBase address in the NT Header must be updated with the address of the allocated memory. Once done, the injected PE headers are copied into our newly allocated memory.
Then, by iterating over the section headers, the content of each section is copied into the allocated memory.
During the relocation phase, the .reloc section header will be needed, so the function that copies the injected PE returns this section header.
Finally, the function changes the permissions on the .text section to make it executable.
Image base Relocation
Since the PE was loaded at an address different from the image base address referenced in the NT header, it needs to be patched so that the binary can resolve the addresses of objects such as static variables and other absolute addresses, which would otherwise no longer be valid. The Windows loader knows how to patch images in memory by referring to a relocation table residing in the binary.
The process of the relocation phase is:
finding the relocation table and cycling through the relocation blocks
getting the number of required relocations in each relocation block
reading bytes in the specified relocation addresses
applying delta (between source and destination imageBaseAddress) to the values specified in the relocation addresses
writing the new values at specified relocation addresses
repeating the above until the entire relocation table is traversed
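The steps above can be sketched in portable C on a local buffer. This is a simplified illustration, not the article's code: the article patches the remote process, whereas this sketch patches a local image buffer indexed by RVA, handles only 64-bit (DIR64) relocations, and assumes a little-endian host. The name apply_relocations and its parameters are hypothetical. Each relocation block starts with two 32-bit fields (the page RVA and the block size) followed by 16-bit entries whose top 4 bits are the type and low 12 bits the offset in the page.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REL_BASED_ABSOLUTE 0   /* padding entry, nothing to patch */
#define REL_BASED_DIR64    10  /* 64-bit absolute address */

/* Walks the relocation table found at RVA reloc_rva (reloc_size bytes) in
 * `image` and adds `delta` to every DIR64 target.
 * Returns the number of patched addresses, or -1 on a malformed table. */
int apply_relocations(uint8_t *image, size_t image_size,
                      uint32_t reloc_rva, uint32_t reloc_size, int64_t delta)
{
    int patched = 0;
    size_t pos = reloc_rva;
    size_t end = (size_t)reloc_rva + reloc_size;

    if (end > image_size)
        return -1;

    while (pos + 8 <= end) {
        uint32_t page_rva, block_size;
        memcpy(&page_rva, image + pos, 4);
        memcpy(&block_size, image + pos + 4, 4);
        if (block_size < 8 || pos + block_size > end)
            return -1;

        /* Number of 16-bit entries that follow the 8-byte block header. */
        size_t count = (block_size - 8) / 2;
        for (size_t i = 0; i < count; i++) {
            uint16_t entry;
            memcpy(&entry, image + pos + 8 + i * 2, 2);
            unsigned type = entry >> 12;
            size_t target = (size_t)page_rva + (entry & 0x0FFF);

            if (type == REL_BASED_ABSOLUTE)
                continue;
            if (type != REL_BASED_DIR64 || target + 8 > image_size)
                return -1;

            /* Read the stored address, apply the delta, write it back. */
            uint64_t value;
            memcpy(&value, image + target, 8);
            value += (uint64_t)delta;
            memcpy(image + target, &value, 8);
            patched++;
        }
        pos += block_size;
    }
    return patched;
}
```

Here delta is the difference between the allocation address and the preferred image base computed during the allocation step.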
Changing the entrypoint and resuming the execution
Once the relocation phase is done, the last step is to modify the context of the main thread of the remote process: the RCX register must be set to the address of the entry point of the injected PE. The Image Base Address stored in the PEB must also be updated; the address of the PEB is held in the RDX register, and on x64 the ImageBaseAddress field sits at offset 0x10.
Once the thread is resumed, we obtain our calc.exe. However, if we replace the injected PE with a binary that has an IAT, such as mimikatz, we can observe that the process crashes because it lacks its dependencies.
We now need to resolve the mimikatz IAT to be able to execute it without any crash.
Make the remote process load the required libraries
Load an arbitrary DLL in a remote process
Having established a basic process hollowing code, our objective is to enhance it to be able to load any PE. We will use the binary mimikatz as our injected PE, while maintaining the svchost binary as the remote process into which we intend to inject mimikatz.
The first step is a common technique used to make a remote process load an arbitrary DLL:
Allocate memory in the remote process
Write the name of the DLL inside the remote process in our newly allocated memory
Create a remote thread that starts at the LoadLibraryA function, with the address of our DLL name as argument
We can determine the address of LoadLibraryA because ntdll.dll and kernel32.dll, which are loaded automatically, are generally mapped at the same addresses in every process of the same architecture during a given boot session. Since LoadLibraryA is exported by kernel32.dll, we only need to resolve its address in our own process, and it will be the exact same address in the remote process.
Let’s try our function to make our svchost process load winhttp.dll for example.
When we look at the process launched in a suspended state, we can observe that only ntdll.dll is loaded.
But if we call our function, we observe that winhttp.dll is successfully loaded.
The other loaded DLLs are the libraries needed by the legitimate svchost process.
Resolve injected PE IAT to make the remote process load all the dependencies
Now that we have a method to make the remote process load arbitrary DLLs, we need to parse our injected PE to retrieve all its dependencies.
When a PE is loaded, its layout in memory differs from its layout on disk. For example, when we copy our PE sections, the source of each section is given by the attribute PointerToRawData, but the destination uses the attribute VirtualAddress. When we open our binary in PE Bear, we can easily observe that the section mapping differs between the file on disk and the image loaded in memory.
Since the IAT is located in the .rdata section, we cannot retrieve it the way we would when performing reflective loading, because there is an offset between our PE read from disk and the PE loaded in memory. Therefore, we will first slightly modify our function copyPEinTargetProcess so that it retrieves the .rdata offset between the PointerToRawData and the VirtualAddress.
The function now takes an additional argument, a pointer to a DWORD, through which it returns the offset of the .rdata section.
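The conversion behind this offset can be sketched in portable C. This is an illustration under assumptions, not the article's function: section_info is a hypothetical, trimmed-down view of a section header, and rva_to_file_offset generalises the idea to whichever section contains the RVA instead of hard-coding .rdata.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down view of a section header. */
typedef struct {
    uint32_t virtual_address;     /* RVA of the section once mapped */
    uint32_t virtual_size;        /* size of the section in memory  */
    uint32_t pointer_to_raw_data; /* offset of the section on disk  */
} section_info;

/* Converts an RVA into an offset inside the raw file.
 * Returns 1 on success, 0 if no section contains the RVA. */
int rva_to_file_offset(const section_info *sections, size_t count,
                       uint32_t rva, uint32_t *file_offset)
{
    for (size_t i = 0; i < count; i++) {
        const section_info *s = &sections[i];
        if (rva >= s->virtual_address &&
            rva - s->virtual_address < s->virtual_size) {
            /* Same subtraction as offsetRdata, computed per section. */
            uint32_t delta = s->virtual_address - s->pointer_to_raw_data;
            *file_offset = rva - delta;
            return 1;
        }
    }
    return 0;
}
```

With a .rdata section mapped at RVA 0x2000 and stored at file offset 0xA00, the delta is 0x1600, so RVA 0x2010 is found at file offset 0xA10.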
Now let’s create a little function to test whether we can resolve the mimikatz IAT. To do so, we need a pointer to the first import descriptor (do not forget to apply the .rdata offset when computing the address): PIMAGE_IMPORT_DESCRIPTOR importDescriptor = (PIMAGE_IMPORT_DESCRIPTOR)((PBYTE)pImage + importsDirectory.VirtualAddress - offsetRdata);. We then iterate over the descriptors until we reach an empty structure to retrieve all the libraries referenced in the IAT.
As observed, we can resolve the mimikatz IAT. We can now call our function remoteLoadLibrary from this function to make our remote process load our dependencies.
Now let’s check whether our process has successfully loaded the mimikatz dependencies.
We can observe that our process has loaded all the mimikatz dependencies. Now let’s find a way to retrieve the base addresses of the libraries in our code, so that we can fix the IAT addresses.
Resolve the functions and libraries addresses on the remote process
Retrieve the libraries and function addresses
Now that our remote process has the mimikatz dependencies loaded, we need to retrieve the addresses of the functions referenced in the IAT in order to patch it. Otherwise, the pointers to the DLL imports will point to incorrect addresses.
To retrieve the libraries loaded in the remote process, we need to create a snapshot of it using the function CreateToolhelp32Snapshot. The function returns a HANDLE to the snapshot, on which we can call Module32FirstW and Module32NextW to enumerate the loaded libraries and their base addresses. These functions fill a MODULEENTRY32W structure that describes each loaded library.
Let’s create a little function that enumerates the loaded libraries to check that we can successfully retrieve the corresponding addresses.
We can observe that we can retrieve the correct addresses for the loaded libraries.
Let’s write one function that creates a snapshot of our remote process and another that retrieves a module by name from the HANDLE of that snapshot.
As we would in a reflective loader, we use the import descriptors retrieved previously to load locally the libraries needed by our injected PE. These local copies are used to compute the offsets of our functions. Then we iterate over all the thunks of the import descriptors. These thunks are data structures describing the functions imported from each library.
A thunk can reference its function either by ordinal or by name. The macro IMAGE_SNAP_BY_ORDINAL tells which one applies: IMAGE_SNAP_BY_ORDINAL(thunk->u1.Ordinal).
If the function is referenced by ordinal, we can resolve its address by passing the ordinal (extracted with the IMAGE_ORDINAL macro) to GetProcAddress.
If the function is referenced by its name, we need to calculate the pointer to the name: PIMAGE_IMPORT_BY_NAME functionName = (PIMAGE_IMPORT_BY_NAME)((DWORD_PTR)pImage + thunk->u1.AddressOfData - offsetRdata);. Then, we can call GetProcAddress with that name to resolve the function address.
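What the ordinal test does on a 64-bit thunk can be shown with plain integers. The helpers below are hypothetical names that mirror the behaviour of IMAGE_SNAP_BY_ORDINAL64 and IMAGE_ORDINAL64 from winnt.h: the top bit flags an import by ordinal, the low 16 bits hold the ordinal, and otherwise the low 31 bits hold the RVA of an IMAGE_IMPORT_BY_NAME structure, whose name string follows a 2-byte hint.

```c
#include <stdint.h>

#define ORDINAL_FLAG64 0x8000000000000000ULL

/* Non-zero when the 64-bit thunk imports by ordinal (top bit set). */
int thunk_is_ordinal(uint64_t thunk_value)
{
    return (thunk_value & ORDINAL_FLAG64) != 0;
}

/* Ordinal number stored in the low 16 bits of an ordinal thunk. */
uint16_t thunk_ordinal(uint64_t thunk_value)
{
    return (uint16_t)(thunk_value & 0xFFFF);
}

/* RVA of the IMAGE_IMPORT_BY_NAME structure (low 31 bits) of a name thunk. */
uint32_t thunk_name_rva(uint64_t thunk_value)
{
    return (uint32_t)(thunk_value & 0x7FFFFFFF);
}

/* The name string starts right after the 2-byte Hint field. */
uint32_t thunk_name_string_rva(uint64_t thunk_value)
{
    return thunk_name_rva(thunk_value) + 2;
}
```

When the PE is read from disk as in this article, the .rdata offset still has to be subtracted from the RVA before it is added to pImage.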
Once we have the function address, we can calculate its offset in the corresponding library to be able to calculate its address in the remote process.
Now we need to find the location of the thunk in the remote process in order to write our patched address. We need to:
retrieve the address of the field to patch: &(thunk->u1.Function)
apply the .rdata offset to that address: (PBYTE)(&(thunk->u1.Function)) + offsetRdata
subtract the base address of the locally loaded PE image: (PBYTE)(&(thunk->u1.Function)) + offsetRdata - (PBYTE)pImage
finally, add the address of the memory allocation in the remote process: (PBYTE)(&(thunk->u1.Function)) + offsetRdata - (PBYTE)pImage + (PBYTE)allocAddrOnTarget
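Both address computations of this section come down to integer arithmetic, which the following sketch makes explicit. The function names are hypothetical and the addresses are handled as plain 64-bit integers; the first helper rebases a function address from the local copy of a library to the remote one, and the second reproduces the four steps listed above.

```c
#include <stdint.h>

/* Address of a function in the remote process, given its address in the
 * locally loaded copy of the library and both module base addresses. */
uint64_t rebase_function(uint64_t local_function,
                         uint64_t local_module_base,
                         uint64_t remote_module_base)
{
    return remote_module_base + (local_function - local_module_base);
}

/* Remote location of a thunk read from the on-disk PE buffer:
 * local address + .rdata offset - local image base + remote allocation. */
uint64_t remote_thunk_address(uint64_t local_thunk,
                              uint64_t offset_rdata,
                              uint64_t local_image,
                              uint64_t remote_alloc)
{
    return local_thunk + offset_rdata - local_image + remote_alloc;
}
```

For example, a thunk found 0xA10 bytes into a buffer loaded at 0x10000, with a .rdata offset of 0x1600, corresponds to RVA 0x2010 and therefore to the remote allocation address plus 0x2010.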
Now that we have everything, we can simply call the function WriteProcessMemory to patch the function address.
Now let’s wrap up everything and test if it is working.
If we put a debugger on our remote process, we can observe that when we resume the main thread, the process crashes with an access violation. If we look at the address where the access violation occurs, we can observe that it is related to the function LsaConnectUntrusted from the library Secur32.dll.
Let’s find out what happened.
Let’s write a little C program that resolves the function LsaConnectUntrusted dynamically, in the spirit of D/Invoke.
We can observe that despite using a HANDLE on Secur32.dll, the address of LsaConnectUntrusted is located in the library sspicli.dll.
This is what is called a forwarded function: it is exported by Secur32.dll but forwarded to the library sspicli.dll.
Handle forwarded functions on remote process
Definition of a forwarded function
First, let’s define what a forwarded function is.
In the context of dynamic-link libraries (DLLs), a forwarded function refers to a function that is not directly implemented within the DLL itself but is instead provided by another DLL. When a program calls a forwarded function in a DLL, the control is transferred to the corresponding function in another DLL.
The forwarding information is typically stored in the export table of the DLL. The export table contains a list of functions that the DLL makes available to other programs, and for forwarded functions, it includes a reference to the DLL and the specific function to which the call should be forwarded.
Here is a simplified example to illustrate how a forwarded function might be set up:
Original DLL (A.dll):
Implements some functions.
Has an export table that includes information about the functions it exports.
Forwarded DLL (B.dll):
Implements the forwarded function(s).
When A.dll exports a function that is forwarded to B.dll, the export table of A.dll contains information about the forwarding, specifying that the function is provided by B.dll.
Client Program:
Calls a function from A.dll, including the forwarded function.
When the forwarded function is called, control is transferred to B.dll, where the actual implementation resides.
Custom GetProcAddress
To determine whether a function is forwarded, we need to implement a custom GetProcAddress function that returns the forwarded library name and the forwarded function name when it encounters a forwarded function.
Our GetProcAddress function parses the loaded library passed as argument. First, it needs to retrieve the export directory of the library.
Once we have the export directory, we will retrieve 3 arrays:
an array containing the addresses of the exported functions
an array containing the ordinal of the exported functions
an array containing the names of the exported functions
We can now iterate over the exported functions to find the one we want. Caution: the index of a function in the address array is not the index of its name in the name array. The entry at the same index in the ordinal array gives the index to use in the address array.
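The indexing rule can be illustrated with three plain arrays. This is a simplified sketch: export_view is a hypothetical flattened view of the export directory in which the names have already been resolved to C strings, whereas the real AddressOfNames array contains RVAs.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flattened view of an export directory. */
typedef struct {
    const uint32_t *functions;      /* function RVAs (AddressOfFunctions)        */
    const uint16_t *name_ordinals;  /* parallel to names (AddressOfNameOrdinals) */
    const char *const *names;       /* exported names, resolved to strings       */
    size_t number_of_functions;
    size_t number_of_names;
} export_view;

/* Returns the RVA of the export called `wanted`, or 0 if it is not found. */
uint32_t find_export_rva(const export_view *ev, const char *wanted)
{
    for (size_t i = 0; i < ev->number_of_names; i++) {
        if (strcmp(ev->names[i], wanted) == 0) {
            /* The name index is NOT the function index: go through the
             * ordinal array to get the index into the address array. */
            uint16_t index = ev->name_ordinals[i];
            if (index >= ev->number_of_functions)
                return 0;
            return ev->functions[index];
        }
    }
    return 0;
}
```

In this sketch, a name stored at index 0 whose ordinal entry is 2 resolves to the third entry of the address array.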
Now that we have re-implemented GetProcAddress, we need our function to resolve the function when it is a forwarded one. To determine if the function is a forwarded function or not, we will observe if the function address is in the memory space of the export directory.
Once our condition is met, let’s look at the content of the returned address.
We can observe that mimikatz tries to import the function SystemFunction007 from the library advapi32.dll. The function appears to be forwarded, since it passed our condition. When we look at the address taken from the function addresses array, we can observe that it contains the forwarded library name and the forwarded function name in the format FORWARDED_LIB.FORWARDED_NAME.
From here it is pretty straightforward: we copy the forwarded library name and the forwarded function name into the output arguments forwardedLib and forwardedName of our function.
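The detection and the parsing can be sketched as two small portable helpers. The names are hypothetical and this is not the article's implementation. The first helper checks whether a function RVA falls inside the export directory range. The second splits the FORWARDED_LIB.FORWARDED_NAME string at its last dot, which is a choice made for this sketch; forwarding by ordinal (a name of the form #123) is not handled specially.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A function RVA that points inside the export directory is a forwarder. */
int is_forwarded_export(uint32_t function_rva,
                        uint32_t export_dir_rva, uint32_t export_dir_size)
{
    return function_rva >= export_dir_rva &&
           function_rva - export_dir_rva < export_dir_size;
}

/* Splits "LIB.Function" into its two parts. Returns 1 on success, 0 if the
 * string is malformed or an output buffer is too small. */
int split_forwarder(const char *forwarder,
                    char *lib, size_t lib_size,
                    char *name, size_t name_size)
{
    const char *dot = strrchr(forwarder, '.');
    size_t lib_len, name_len;

    if (dot == NULL || dot == forwarder || dot[1] == '\0')
        return 0;

    lib_len = (size_t)(dot - forwarder);
    name_len = strlen(dot + 1);
    if (lib_len >= lib_size || name_len >= name_size)
        return 0;

    memcpy(lib, forwarder, lib_len);
    lib[lib_len] = '\0';
    memcpy(name, dot + 1, name_len + 1); /* includes the terminator */
    return 1;
}
```

The library part carries no extension; LoadLibraryA appends .dll by default when the name has none.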
And finally we can call LoadLibraryA on the forwarded library and our function recursively.
We can now replace all our calls to GetProcAddress with our own function.
Our new function loadImportTableLibs now looks like this.
And if we test it:
We observe that our forwarded libraries are correctly loaded.
Now let’s adapt our fixImports function.
Now let’s try this code to see if we can load mimikatz.
Our code stopped because it could not find the import api-ms-win-core-com-l1-1-0.dll.
Let’s look at the top of our output to check which forwarded function we attempted to resolve.
We can see that the function we attempt to patch in the IAT is CoInitializeEx which is supposed to be forwarded to the library api-ms-win-core-com-l1-1-0.dll.
Let’s look at it in a standalone code with a debugger.
As we can see, we loaded the api-ms-win-core-com-l1-1-0.dll library, but the debugger shows us that it is in reality the library combase.dll.
This is a mechanism created by Microsoft called API Sets.
Handle API set
Definition of API Sets
API sets, also known as API set namespaces, are a concept introduced in Windows operating systems to help manage the evolution of the Windows API (Application Programming Interface) and provide a layer of abstraction for developers. API sets play a role in versioning and maintaining compatibility between different versions of Windows.
Windows implemented this in order to separate functionalities through virtual names. It is also used to maintain compatibility across different Windows versions.
You can find more details in the Windows documentation on API Sets.
To sum up, API sets are names that are used as proxies for real DLLs. In our example, api-ms-win-core-com-l1-1-0.dll is a proxy name for the DLL combase.dll.
How to resolve API set names
When we look at the PEB structure documented on Geoff Chappell’s website, we can observe that at offset 0x68 (on x64) there is a pointer to an attribute called ApiSetMap. This is where we can find the mapping of the API sets. However, when we look at the structure from winternl.h, we can see that the attribute is not named. By performing several tests and calculations, we can find that ApiSetMap corresponds to the attribute (PPEB)->Reserved9[0].
Once we have retrieved the pointer to the ApiSetMap, we need to cast it to a structure called API_SET_NAMESPACE.
With this structure we can calculate the address of the first namespace entry, which is an API_SET_NAMESPACE_ENTRY.
Each namespace entry can have multiple value entries (yes, a single API set can be a virtual name for multiple DLLs). Each value entry is an API_SET_VALUE_ENTRY, in which we can find the corresponding DLL name.
When we look at the API set map, we find that most API set names have only one corresponding DLL. After performing multiple tests, I realised that the edge case where we need to resolve the second DLL instead of the first is very rare. Therefore, to keep our code light, we will take the first entry of the API set. However, keep in mind that you can encounter this edge case.
Also, the last digit of the API set name can differ from the one we are looking for and still be the correct resolution. For example, mimikatz has a function forwarded to the API set name api-ms-win-core-com-l1-1-0. However, when you enumerate your API Set Map, the only similar API set name is api-ms-win-core-com-l1-1-3, and both resolve to the same DLL name. Therefore, when you resolve an API set, it is advised to compare the names without the last digit.
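The comparison rule can be sketched as follows. This is an illustration under assumptions: the names are handled as narrow C strings, whereas the API Set Map stores wide strings, and the function names are hypothetical. The sketch drops an optional .dll suffix and any trailing digits (a slight generalisation of "the last digit") and then compares the remaining stems case-insensitively.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Length of the name once an optional ".dll" suffix and the trailing
 * digits have been dropped. */
size_t api_set_stem_length(const char *name)
{
    size_t len = strlen(name);

    if (len >= 4 && name[len - 4] == '.' &&
        tolower((unsigned char)name[len - 3]) == 'd' &&
        tolower((unsigned char)name[len - 2]) == 'l' &&
        tolower((unsigned char)name[len - 1]) == 'l')
        len -= 4;

    while (len > 0 && isdigit((unsigned char)name[len - 1]))
        len--;

    return len;
}

/* Returns 1 when both names designate the same API set, ignoring case,
 * an optional ".dll" suffix, and the trailing digits. */
int api_set_names_match(const char *a, const char *b)
{
    size_t la = api_set_stem_length(a);
    size_t lb = api_set_stem_length(b);

    if (la != lb)
        return 0;
    for (size_t i = 0; i < la; i++) {
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
            return 0;
    }
    return 1;
}
```

With this rule, api-ms-win-core-com-l1-1-0.dll matches the map entry api-ms-win-core-com-l1-1-3.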
Now let’s modify our function fixImports:
Let’s find out whether our mimikatz runs successfully. To test it, we will slightly change the function launchSuspendedProcess so that it passes arguments on the command line. We will attempt to create a log file with mimikatz and execute the commands coffee and exit.
We now have fully functional code that allows us to execute any PE through the process hollowing technique. However, we would now like to retrieve the output directly in our program.
Final Touch: Retrieve output of our injected process
Windows provides pipes, a mechanism for interprocess communication. We can therefore redirect stdOut and stdErr to an anonymous pipe that we create, and then read from it.
First, we will modify our launchSuspendedProcess function.
Note that we have created a function that builds the cmdLine by concatenating the process name and the arguments.
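Such a concatenation helper could look like the sketch below. The name build_command_line is hypothetical, and quoting the process name is a choice made here (it protects paths containing spaces), not something stated in the article.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Builds "\"process\" args" into `out`.
 * Returns 1 on success, 0 if the buffer is too small. */
int build_command_line(char *out, size_t out_size,
                       const char *process, const char *args)
{
    int has_args = (args != NULL && args[0] != '\0');

    /* Two quotes around the process name plus the terminator. */
    size_t needed = strlen(process) + 2 + 1;
    if (has_args)
        needed += 1 + strlen(args); /* separating space + arguments */

    if (needed > out_size)
        return 0;

    if (has_args)
        snprintf(out, out_size, "\"%s\" %s", process, args);
    else
        snprintf(out, out_size, "\"%s\"", process);
    return 1;
}
```

The resulting buffer is writable, which suits functions such as CreateProcessW that may modify the command-line string they receive.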
Now our function will create a pipe and redirect the output to it.
Then, to retrieve the output, we will create two functions. One reads from the pipe:
The other reads fragments of the output until the remote thread has finished:
Let’s try it now with this main function:
We finally have a fully working PE runner in a remote process, and we can retrieve its output.
Plot Twist
Plot Twist 1
Recently, maldev academy published an update in which they also perform process hollowing. While reading it, I realized that if we copy our PE at the preferred image base address contained in its NT Header, we need neither relocation nor IAT patching.
Plot Twist 2
After talking with snow (@never_unsealed), I found out, thanks to him, that it is not even necessary to copy our PE at its preferred image base address. It is enough to update the Image Base Address in the PEB structure of the remote process. The remote process must also be created with the flag CREATE_NEW_CONSOLE (CREATE_SUSPENDED|CREATE_NEW_CONSOLE), which spawns a conhost.exe child process. After that, the Windows loader does everything for us.
Let’s put things into perspective
This technique is a good way to learn more about how libraries are loaded in a process (forwarded functions, API Sets, function and library resolution in a remote process).
Also, by using this technique:
you can copy only the PE sections, without the headers (header stomping).
you can avoid memory pages overlap by not forcing the address of the allocation.
you can retrieve the output without the remote process creating a child process “conhost.exe”. (you can create your suspended process with only the flag CREATE_SUSPENDED)
Hope you enjoyed it and learned something in this too long blog post.
References
ired.team that allowed me to learn about basic process hollowing
maldev academy that allowed me to learn about API Set names
Havoc source code that allowed me to learn more about forwarded functions
0xrick blog that allowed me to learn more about PE format
Geoff Chappell website that allowed me to have a better understanding about Windows internal structures