Diagnosing and Resolving High CPU Usage Caused by Windows Kernel-Mode Driver Memory Leaks

Introduction

High CPU usage on Windows systems can stem from kernel-mode driver issues, including memory leaks that degrade system performance. This post walks through a scenario in which a malicious or defective driver allocates memory without bound, exhausting the kernel pool and causing resource contention and high CPU consumption. The solution involves diagnosing the root cause with kernel-level tools and fixing the leak in the driver itself.

Symptoms

  • Consistently high CPU usage (80%+) in the System or kernel-related processes
  • Recurring Blue Screen of Death (BSOD) with errors like “BAD_POOL_CALLER” or “PAGE_FAULT_IN_NONPAGED_AREA”
  • Memory usage increasing over time despite no user applications being active
  • System instability, crashes, or responsiveness issues

Root Cause

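Memory leaks in kernel-mode drivers occur when allocated memory is not released after use. Kernel pool allocations are not tied to the lifetime of a process, so leaked blocks stay allocated until the system restarts. The non-paged pool, which must remain resident in physical memory and backs low-level operations, gradually fills up. As the pool nears exhaustion, allocation requests begin to fail and the kernel spends more time managing memory, which shows up as high CPU usage in the System process. The BSODs typically follow when a driver dereferences memory it failed to allocate or otherwise accesses an invalid address.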
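
To make the mechanism concrete, here is a minimal user-mode sketch in C (not kernel code) that models per-tag pool accounting. The names PoolTagStats, model_alloc, model_free and run_model are hypothetical and exist only for this illustration; the point is that allocations without matching frees leave the outstanding byte count growing with every iteration.

```c
#include <stddef.h>

/* Hypothetical per-tag counters, loosely modelled on what poolmon displays. */
typedef struct {
    unsigned long allocs;      /* number of allocations recorded */
    unsigned long frees;       /* number of frees recorded */
    size_t outstanding_bytes;  /* bytes allocated and not yet freed */
} PoolTagStats;

/* Record an allocation of `size` bytes against the tag. */
void model_alloc(PoolTagStats *stats, size_t size)
{
    stats->allocs++;
    stats->outstanding_bytes += size;
}

/* Record a free of `size` bytes against the tag. */
void model_free(PoolTagStats *stats, size_t size)
{
    stats->frees++;
    stats->outstanding_bytes -= size;
}

/* Perform `iterations` 4 KiB allocations; free each one only if requested. */
PoolTagStats run_model(unsigned iterations, int free_each_buffer)
{
    PoolTagStats stats = {0, 0, 0};
    for (unsigned i = 0; i < iterations; i++) {
        model_alloc(&stats, 0x1000);
        if (free_each_buffer)
            model_free(&stats, 0x1000);
    }
    return stats;
}
```

Calling run_model(1000, 0) leaves 1000 allocations of 4 KiB outstanding, while run_model(1000, 1) ends with nothing outstanding and equal alloc and free counts. That difference is what the diagnosis tools below are used to spot.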

Diagnosis Tools

  • Process Explorer: Inspect the System process and its threads to see which driver modules are consuming CPU
  • Windows Debugger (WinDbg): Analyze memory dumps for stack traces and pool allocation patterns
  • Performance Monitor (PerfMon): Track kernel memory pool usage metrics (e.g., Pool Nonpaged Bytes)
  • Poolmon: Monitor kernel pool allocations and detect drivers with abnormal memory usage
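
Watching Poolmon by eye amounts to comparing successive snapshots and looking for the tag whose outstanding bytes keep growing. The following user-mode C sketch shows that comparison. The TagSample structure and the function name are hypothetical, and the sketch assumes both snapshots list the same tags in the same order.

```c
#include <stddef.h>

/* Hypothetical snapshot entry: a pool tag and its outstanding bytes. */
typedef struct {
    char tag[5];   /* four tag characters plus terminator */
    size_t bytes;  /* bytes currently allocated under this tag */
} TagSample;

/*
 * Return the index of the tag that grew the most between two snapshots,
 * or -1 if no tag grew. Both arrays must hold `count` entries with the
 * same tags in the same order.
 */
int find_fastest_growing_tag(const TagSample *before,
                             const TagSample *after,
                             size_t count)
{
    int best_index = -1;
    size_t best_growth = 0;

    for (size_t i = 0; i < count; i++) {
        if (after[i].bytes > before[i].bytes) {
            size_t growth = after[i].bytes - before[i].bytes;
            if (growth > best_growth) {
                best_growth = growth;
                best_index = (int)i;
            }
        }
    }
    return best_index;
}
```

In practice a leak suspect is a tag that wins this comparison across many consecutive snapshots, not just one, since legitimate workloads also cause short-lived growth.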

Example Code

Consider the following flawed driver code that allocates memory without releasing it:

#include <ntddk.h>

NTSTATUS DriverEntry(PDRIVER_OBJECT DriverObject, PUNICODE_STRING RegistryPath)
{
    PVOID AllocatedBuffer;

    // Allocates 4 KB per iteration; the loop ends only when the pool is exhausted
    while (TRUE) {
        AllocatedBuffer = ExAllocatePoolWithTag(NonPagedPool, 0x1000, 'Leak');
        if (!AllocatedBuffer)
            return STATUS_INSUFFICIENT_RESOURCES;
        // No ExFreePool call to release memory
    }

    return STATUS_SUCCESS;
}
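
One detail of the tag used above matters when searching for it in pool tools: a pool tag is a 32-bit value, and tools display its bytes in memory order. On little-endian Windows, a tag written as 'Leak' in source therefore shows up as kaeL, which is why driver code usually writes tags reversed. The user-mode C sketch below illustrates the byte order. tag_to_display is a hypothetical helper, and 0x4C65616B is the value that MSVC and GCC assign to the multi-character literal 'Leak'.

```c
#include <stdint.h>

/*
 * Copy the four bytes of a pool tag into `out`, lowest-order byte first,
 * which is the order in which they sit in little-endian memory.
 */
void tag_to_display(uint32_t tag, char out[5])
{
    for (int i = 0; i < 4; i++)
        out[i] = (char)((tag >> (8 * i)) & 0xFF);
    out[4] = '\0';
}
```

Passing 0x4C65616B (the literal 'Leak') yields the string kaeL, while passing 0x6B61654C (the literal 'kaeL') yields Leak.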

Step-by-Step Solution

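  1. Identify the Culprit Driver: Drivers do not appear as separate processes, so in Process Explorer open the Properties of the System process and switch to the Threads tab. Kernel threads are listed with a start address of the form driver.sys+offset; sort by CPU to see which driver module is consuming the most time, and note its name and path.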
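  2. Analyze Memory Dumps: Capture a kernel or complete memory dump, for example the MEMORY.DMP file written after a BSOD when kernel dumps are enabled under Startup and Recovery in the advanced system settings. Load the dump in WinDbg and run !analyze -v to see the faulting module, then !poolused 2 to list pool tags sorted by non-paged pool usage.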
  3. Monitor Kernel Pools: Open PerfMon and add the Memory counters “Pool Nonpaged Bytes” and “Pool Paged Bytes.” A leak shows up as steady growth that does not fall back when the workload stops; correlate that growth with the CPU usage.
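  4. Detect Leaked Allocations: Run poolmon from an elevated command prompt and sort by bytes (poolmon /b, or press B while it is running) to find the tag whose allocations keep outpacing its frees. To watch a single tag, use the /i filter with the tag as it is displayed. A tag written as 'Leak' in source is stored in reverse byte order and appears as kaeL, so the command is poolmon /ikaeL.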
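  5. Reproduce and Confirm: Run a stress test that exercises the suspect driver and check whether the outstanding bytes for its tag keep growing in poolmon and PerfMon. Enabling pool tracking in Driver Verifier for that driver will also flag allocations that are still outstanding when it unloads.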
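  6. Fix the Driver Code: Pair every allocation with a matching ExFreePoolWithTag call once the buffer is no longer needed, on error paths as well as the success path, and remove the unbounded loop so that DriverEntry returns. Example fix:

    #include <ntddk.h>

    NTSTATUS DriverEntry(PDRIVER_OBJECT DriverObject, PUNICODE_STRING RegistryPath)
    {
        PVOID AllocatedBuffer;

        UNREFERENCED_PARAMETER(DriverObject);
        UNREFERENCED_PARAMETER(RegistryPath);

        AllocatedBuffer = ExAllocatePoolWithTag(NonPagedPool, 0x1000, 'Leak');
        if (!AllocatedBuffer)
            return STATUS_INSUFFICIENT_RESOURCES;

        // Perform operations on AllocatedBuffer here

        // Release the buffer with the same tag it was allocated with
        ExFreePoolWithTag(AllocatedBuffer, 'Leak');

        return STATUS_SUCCESS;
    }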

  7. Re-deploy and Validate: Replace the driver with the fixed version. Monitor the system using PerfMon and poolmon to ensure memory usage stabilizes and CPU load decreases.
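
Whether memory usage has stabilized can be judged from a series of Pool Nonpaged Bytes samples, for example values exported from PerfMon at a fixed interval. The sketch below is one way to do that in user-mode C, using a least-squares slope over the samples. The function name, the sampling scheme and any threshold you compare the slope against are assumptions made for illustration, not part of any Windows API.

```c
#include <stddef.h>

/*
 * Least-squares slope of the samples against their index, that is, the
 * average growth in bytes per sampling interval. Returns 0 when fewer
 * than two samples are supplied.
 */
double pool_growth_per_sample(const double *samples, size_t n)
{
    if (n < 2)
        return 0.0;

    double mean_x = (double)(n - 1) / 2.0;
    double mean_y = 0.0;
    for (size_t i = 0; i < n; i++)
        mean_y += samples[i];
    mean_y /= (double)n;

    double numerator = 0.0;
    double denominator = 0.0;
    for (size_t i = 0; i < n; i++) {
        double dx = (double)i - mean_x;
        numerator += dx * (samples[i] - mean_y);
        denominator += dx * dx;
    }
    return numerator / denominator;
}
```

A slope that stays close to zero over a long run suggests the fix holds, while a persistently positive slope suggests allocations are still leaking somewhere.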