Inline Hooking NtCreateFile in User Mode
Setting up an x64 inline hook on NtCreateFile with a proper trampoline: overwriting bytes, building the jump-back, and thread safety considerations.
Inline hooking overwrites the first few bytes of a target function with a jump to your
detour. A trampoline preserves the original bytes and jumps back, so the real function
still executes. This post focuses on NtCreateFile — the syscall stub in ntdll that
every file-open ultimately reaches.
The Target: NtCreateFile in ntdll
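On a stock Windows 11 x64 system, the first bytes of NtCreateFile look like:
; ntdll!NtCreateFile
4C 8B D1                   mov r10, rcx
B8 55 00 00 00             mov eax, 55h                  ; syscall number (varies by build)
F6 04 25 08 03 FE 7F 01    test byte ptr [7FFE0308h], 1  ; KUSER_SHARED_DATA flag
75 03                      jne +3                        ; flag set: use int 2Eh instead
0F 05                      syscall
C3                         ret
CD 2E                      int 2Eh
C3                         ret
We'll overwrite the start of the function with a 14-byte absolute jump. The familiar mov rax, imm64 / jmp rax pair is shorter (12 bytes) but clobbers rax. That is harmless at function entry, but the trampoline's jump-back runs after the relocated mov eax, 55h and would destroy the syscall number. jmp qword ptr [rip+0] followed by the target address touches no registers:
; 14-byte abs jump: jmp qword ptr [rip+0] ; <addr64>
FF 25 00 00 00 00 ; jmp qword ptr [rip+0]
xx xx xx xx xx xx xx xx ; 64-bit target address, read as data
Offset 14 falls in the middle of the test instruction, so the overwritten span is rounded up to 16 bytes, the next instruction boundary, and the patch is padded with two NOPs. The three relocated instructions are position-independent (the test uses an absolute address, not a RIP-relative one), so they can be copied verbatim.
Building the Trampoline
The trampoline buffer must:
- Execute the 16 bytes we overwrote
- Jump back to NtCreateFile + 16
#include <windows.h>
#include <string.h>
#define JMP_SIZE 14   // FF 25 00 00 00 00 + 8-byte address
#define HOOK_SIZE 16  // JMP_SIZE rounded up to an instruction boundary
typedef struct {
    BYTE saved[HOOK_SIZE];    // copy of original bytes
    BYTE jmp_back[JMP_SIZE];  // abs jump to NtCreateFile+HOOK_SIZE
} TRAMPOLINE;
// Emit jmp qword ptr [rip+0] followed by the 64-bit destination.
static void write_abs_jmp(BYTE* dst, ULONG_PTR dest) {
    static const BYTE opcode[6] = { 0xFF, 0x25, 0x00, 0x00, 0x00, 0x00 };
    memcpy(dst, opcode, sizeof(opcode));
    memcpy(dst + sizeof(opcode), &dest, sizeof(dest));
}
TRAMPOLINE* install_hook(PVOID target, PVOID detour) {
    TRAMPOLINE* tramp = VirtualAlloc(
        NULL, sizeof(TRAMPOLINE),
        MEM_COMMIT | MEM_RESERVE,
        PAGE_EXECUTE_READWRITE
    );
    if (!tramp) return NULL;
    memcpy(tramp->saved, target, HOOK_SIZE);
    // Build jump-back to target+HOOK_SIZE (must leave eax intact)
    write_abs_jmp(tramp->jmp_back, (ULONG_PTR)target + HOOK_SIZE);
    // Build the hook: abs jump to detour, NOP-padded to HOOK_SIZE
    BYTE hook[HOOK_SIZE];
    memset(hook, 0x90, sizeof(hook));
    write_abs_jmp(hook, (ULONG_PTR)detour);
    DWORD old;
    if (!VirtualProtect(target, HOOK_SIZE, PAGE_EXECUTE_READWRITE, &old)) {
        VirtualFree(tramp, 0, MEM_RELEASE);
        return NULL;
    }
    memcpy(target, hook, HOOK_SIZE);
    VirtualProtect(target, HOOK_SIZE, old, &old);
    FlushInstructionCache(GetCurrentProcess(), target, HOOK_SIZE);
    return tramp;
}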
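Whatever jump encoding is used, the bytes copied into the trampoline have to end on an instruction boundary, otherwise the jump-back lands in the middle of an instruction. The sketch below shows the rounding step in isolation. It assumes the instruction lengths are supplied by hand; the `NTCREATEFILE_INSN_LEN` table and the `relocation_span` helper are hypothetical names, and the lengths describe the stub as it appears on current Windows 10/11 x64 builds. A real hooking library would get them from a length disassembler.

```c
#include <assert.h>
#include <stddef.h>

/* Return the smallest instruction boundary >= patch_size, or 0 if the
   listed instructions do not cover patch_size bytes. insn_len[] holds the
   length of each instruction, starting at the function's entry point. */
static size_t relocation_span(const unsigned char *insn_len, size_t count,
                              size_t patch_size)
{
    size_t offset = 0;

    assert(insn_len != NULL);
    for (size_t i = 0; i < count; i++) {
        if (offset >= patch_size)
            break;                  /* already on a boundary past the patch */
        offset += insn_len[i];
    }
    return offset >= patch_size ? offset : 0;
}

/* Hand-decoded lengths (an assumption, not read from ntdll at run time):
   mov r10,rcx (3); mov eax,imm32 (5); test byte ptr [abs32],imm8 (8);
   jne rel8 (2); syscall (2); ret (1). */
static const unsigned char NTCREATEFILE_INSN_LEN[] = { 3, 5, 8, 2, 2, 1 };
```

For a 14-byte patch this yields a span of 16 bytes. A span that reached the jne would also need its relative displacement fixed up after copying, which this sketch does not attempt.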
Thread Safety
A multi-byte patch like this cannot be written atomically. A thread executing NtCreateFile concurrently
can fetch a partially written instruction sequence and crash. In a single-threaded harness you
can ignore this. In a real injected DLL you have two main options:
- Suspend all threads before patching (NtSuspendThread on every thread except your own). Coarse but reliable, as long as no suspended thread is stopped inside the bytes being overwritten.
- Atomic 8-byte write: x86-64 performs a naturally aligned 8-byte store atomically, so a patch that fits in one aligned 8-byte slot can be published with a single store. The absolute jump used here is longer than 8 bytes, and alignment in mapped ntdll memory is not something you control, so this rarely applies directly.
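If the hook is installed while the process is still single-threaded (for example from a DLL loaded at process startup, before other threads exist), the race cannot occur. A DLL injected into an already running process gets no such guarantee.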
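The suspend-everything option can be sketched roughly as follows. This is an illustration under several assumptions, not a hardened implementation: it uses the documented Win32 `SuspendThread` in place of `NtSuspendThread`, stores handles in a caller-supplied fixed array so it never allocates while other threads are frozen, and does not handle threads created during the enumeration. The names `ip_in_patch`, `suspend_other_threads` and `resume_threads` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* True if ip lies inside [target, target + len): a thread stopped there
   would resume in the middle of the rewritten bytes. */
static int ip_in_patch(uintptr_t ip, uintptr_t target, size_t len)
{
    return ip >= target && ip - target < len;
}

#if defined(_WIN32) && (defined(_M_X64) || defined(__x86_64__))
#include <windows.h>
#include <tlhelp32.h>

/* Suspend every other thread in this process and store the handles in
   out[] (at most max). Returns the number suspended. *unsafe is set to 1
   if a thread is parked inside the patch range or did not fit in out[];
   the caller should then resume_threads() and retry instead of patching. */
static int suspend_other_threads(HANDLE *out, int max,
                                 uintptr_t target, size_t len, int *unsafe)
{
    DWORD pid = GetCurrentProcessId(), self = GetCurrentThreadId();
    THREADENTRY32 te;
    int n = 0;
    HANDLE snap = CreateToolhelp32Snapshot(TH32CS_SNAPTHREAD, 0);

    *unsafe = 0;
    if (snap == INVALID_HANDLE_VALUE) {
        *unsafe = 1;
        return 0;
    }
    te.dwSize = sizeof(te);
    if (Thread32First(snap, &te)) {
        do {
            HANDLE h;
            CONTEXT ctx;

            /* The snapshot lists threads of every process; keep only ours. */
            if (te.th32OwnerProcessID != pid || te.th32ThreadID == self)
                continue;
            if (n == max) {             /* no room left to track this one */
                *unsafe = 1;
                continue;
            }
            h = OpenThread(THREAD_SUSPEND_RESUME | THREAD_GET_CONTEXT,
                           FALSE, te.th32ThreadID);
            if (h == NULL)
                continue;               /* thread may have exited already */
            if (SuspendThread(h) == (DWORD)-1) {
                CloseHandle(h);
                continue;
            }
            ctx.ContextFlags = CONTEXT_CONTROL;
            if (GetThreadContext(h, &ctx) &&
                ip_in_patch((uintptr_t)ctx.Rip, target, len))
                *unsafe = 1;
            out[n++] = h;
        } while (Thread32Next(snap, &te));
    }
    CloseHandle(snap);
    return n;
}

/* Resume and release the threads recorded by suspend_other_threads(). */
static void resume_threads(HANDLE *threads, int n)
{
    for (int i = 0; i < n; i++) {
        ResumeThread(threads[i]);
        CloseHandle(threads[i]);
    }
}
#endif
```

The intended order is: suspend, check `*unsafe`, patch only if it is zero, then resume in every case. Doing this from DllMain is risky because the loader lock is held there, so a worker thread or an explicit init export is the safer place to call it.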
The Detour
#include <winternl.h>  // NTSTATUS, POBJECT_ATTRIBUTES, PIO_STATUS_BLOCK
static TRAMPOLINE* g_tramp;  // set to install_hook()'s return value
typedef NTSTATUS (NTAPI *NtCreateFile_t)(
    PHANDLE, ACCESS_MASK, POBJECT_ATTRIBUTES,
    PIO_STATUS_BLOCK, PLARGE_INTEGER, ULONG,
    ULONG, ULONG, ULONG, PVOID, ULONG
);
NTSTATUS NTAPI NtCreateFile_hook(
    PHANDLE FileHandle,
    ACCESS_MASK DesiredAccess,
    POBJECT_ATTRIBUTES ObjectAttributes,
    PIO_STATUS_BLOCK IoStatusBlock,
    PLARGE_INTEGER AllocationSize,
    ULONG FileAttributes,
    ULONG ShareAccess,
    ULONG CreateDisposition,
    ULONG CreateOptions,
    PVOID EaBuffer,
    ULONG EaLength)
{
    // Logging must not open files itself, or it re-enters this hook.
    if (ObjectAttributes && ObjectAttributes->ObjectName)
        log_unicode_string(ObjectAttributes->ObjectName);
    // Call original via trampoline (saved bytes + jump-back)
    return ((NtCreateFile_t)g_tramp->saved)(
        FileHandle, DesiredAccess, ObjectAttributes,
        IoStatusBlock, AllocationSize, FileAttributes,
        ShareAccess, CreateDisposition, CreateOptions,
        EaBuffer, EaLength
    );
}
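Before wiring the detour in, it is worth confirming that the bytes at NtCreateFile really are the expected stub, since another tool may already have hooked it or a future build may change the layout. The following is a minimal sketch of that check; `read_syscall_stub` and `find_ntcreatefile` are hypothetical helpers, and the check covers only the first two instructions (mov r10, rcx and mov eax, imm32).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 and store the syscall number if fn starts with
   mov r10, rcx (4C 8B D1) followed by mov eax, imm32 (B8 xx xx xx xx).
   Return 0 for any other prologue. */
static int read_syscall_stub(const unsigned char *fn, unsigned *number)
{
    static const unsigned char prologue[4] = { 0x4C, 0x8B, 0xD1, 0xB8 };

    assert(fn != NULL && number != NULL);
    if (memcmp(fn, prologue, sizeof(prologue)) != 0)
        return 0;
    /* imm32 is stored little-endian right after the B8 opcode. */
    *number = (unsigned)fn[4] | (unsigned)fn[5] << 8 |
              (unsigned)fn[6] << 16 | (unsigned)fn[7] << 24;
    return 1;
}

#ifdef _WIN32
#include <windows.h>

/* Locate ntdll!NtCreateFile; return NULL if the export is missing or its
   prologue is not the stub this hook was written for. */
static void *find_ntcreatefile(void)
{
    unsigned number;
    HMODULE ntdll = GetModuleHandleW(L"ntdll.dll");
    void *fn = ntdll ? (void *)GetProcAddress(ntdll, "NtCreateFile") : NULL;

    if (fn == NULL || !read_syscall_stub((const unsigned char *)fn, &number))
        return NULL;
    return fn;
}
#endif
```

With a non-NULL result in `target`, installation is then `g_tramp = install_hook(target, (PVOID)NtCreateFile_hook);`. Note that the detour dereferences g_tramp, which is only assigned after the patch is already live, so another thread calling NtCreateFile in that window would crash. This is one more reason to patch while other threads are suspended or not yet created.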
Caveats
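Control Flow Guard (CFG) does not block this approach: the patch rewrites the function's mapped bytes directly, and the jump it plants is not a compiler-instrumented indirect call, so the CFG bitmap is never consulted for it. CFG only validates the targets of instrumented indirect calls and jumps.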
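If you'd rather intercept at a higher level (e.g., CreateFileW in KernelBase.dll),
the same technique applies and you avoid syscall stub churn between OS builds. The prologue there is ordinary compiled code, so the relocated bytes must again end on an instruction boundary, and any RIP-relative instruction among them needs fixing up.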