Sunday, 1 March 2026

The mechanics of Memory Analysis using Windows Performance Analyzer and Recorder (WPA/WPR)

When investigating the behaviour of a Windows system, I often use a combination of crash/memory dumps and Event Tracing for Windows (ETW): dumps to view the state and ETW to view the temporal development of a system. I often use WPR (and other trace controllers), WPA, cdb/WinDbg and homebrew tools to view and manipulate the captured data. One area that I had hitherto never explored was most of the “Memory” analysis graphs in WPA – especially the “Resident Set” and “Reference Set” graphs. 

My intention is to describe the mechanics of using ETW events to generate the “Resident Set” and “Reference Set” graphs rather than how to interpret the resulting graphs from a memory performance perspective.

Resident Set

The Resident Set graph, in particular, displays “state” information which I would normally have obtained in a list-like form from a dump file; it was the data manipulation abilities of WPA (grouping, sorting, collapsing, etc.) that made it interesting to me, but I wanted to know what data was used so that I could validate the interpretation of the raw data and compare the results with dump file analysis results (e.g. kernel debugger commands such as “!memusage” and “!vm”).

One way of creating an ETW trace for Resident Set analysis is to use the WPR profile “ResidentSet”; this profile uses the System Keywords: CpuConfig, DiskIO, HardFaults, Loader, Memory, MemoryInfo, ProcessThread, Session, VirtualAllocation, VAMap (keyword “Memory” also sets keyword “Filename” implicitly).

One further event provider is used to provide memory usage information (Win32HeapRanges) alongside a few other providers that are enabled to enhance the presentation of stack information.

By enabling only selected keywords or by filtering the ETW trace to remove information, one can investigate how the ETW trace data is used to create the Resident Set graph.

“Memory” is the only essential keyword; even if the other keywords are not used, a Resident Set graph will still be available in WPA, albeit with fewer “details”. When this keyword is enabled, a number of event types are recorded, the most important of which is named “Memory: PageInMemory” (in the Trace Statistics view of a trace). One event of this type is logged for every physical page (4 kilobytes) of memory in the system, so 16 gigabytes of physical memory results in over 4 million events. These events are emitted in a single rapid burst, so many ETW buffers are needed to avoid losing events.
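The event count quoted above follows directly from the page size. A minimal sketch of the arithmetic, assuming 4 KiB pages throughout (the function name is just for illustration):

```python
PAGE_SIZE = 4 * 1024  # 4 KiB per physical page, as stated in the text


def page_in_memory_event_count(physical_bytes: int, page_size: int = PAGE_SIZE) -> int:
    """One "Memory: PageInMemory" event is logged per physical page."""
    return physical_bytes // page_size


events = page_in_memory_event_count(16 * 1024**3)  # 16 GiB of physical memory
print(events)  # 4194304
```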

There is, by default, no MOF definition for this type in the WMI/WBEM database/repository, but the event data is essentially a MMPFN_IDENTITY structure, a definition of which can be found in the Windows Research Kernel (WRK); the WRK definition is old, but only a few minor tweaks (new bit field meanings and enumeration values) appear to have been made in the intervening years. The sequence of these events is essentially a parallel to the MMPFN array, which is the basis for the debugger “!memusage” analysis of memory usage.

The MMPFN_IDENTITY structure includes fields that identify a “list” (3 bits that encode: zero, free, standby, modified, modified-no-write, bad, active, transition) and a “use” (4 bits); these “use” values map to ResidentSetPageCategory values used by WPA as shown in the following table:

ResidentSetPageCategory          MMPFN_IDENTITY derived information

AddressingWindowExtensionsPage   MMPFNUSE_AWEPAGE
DriverLockedSystemPage           MMPFNUSE_DRIVERLOCKPAGE
Image                            MMPFNUSE_FILE + MMPFN_IDENTITY.u2.e1.Image
KernelStack                      MMPFNUSE_KERNELSTACK
LargePage                        MMPFNUSE_LARGEPAGE
MapFile                          MMPFNUSE_FILE + not MMPFN_IDENTITY.u2.e1.Image
MetaFile                         MMPFNUSE_METAFILE
NonPagedPool                     MMPFNUSE_NONPAGEDPOOL
PagedPool                        MMPFNUSE_PAGEDPOOL
PageTable                        MMPFNUSE_PAGETABLE
PageFileMappedSection            MMPFNUSE_PAGEFILEMAPPED
SessionPrivate                   MMPFNUSE_SESSIONPRIVATE
StraggleIOPage                   MMPFNLIST_TRANSITION
SystemPage                       MMPFNUSE_SYSTEMPTE
VirtualAlloc_PreTrace            MMPFNUSE_PROCESSPRIVATE
WsMetaData                       MMPFNUSE_WSMETADATA
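The mapping in the table can be expressed as a small lookup. This is only a sketch of the classification as I understand it, not WPA's actual code: the "use" and "list" values are passed as symbolic names because the numeric encodings are not given here, the name MMPFNLIST_ACTIVE in the usage lines is an assumed name, and giving the transition list precedence over the "use" value is my assumption.

```python
from typing import Optional

# "use" name -> ResidentSetPageCategory, taken from the table above.
# MMPFNUSE_FILE is handled separately because it depends on the Image bit.
USE_TO_CATEGORY = {
    "MMPFNUSE_AWEPAGE": "AddressingWindowExtensionsPage",
    "MMPFNUSE_DRIVERLOCKPAGE": "DriverLockedSystemPage",
    "MMPFNUSE_KERNELSTACK": "KernelStack",
    "MMPFNUSE_LARGEPAGE": "LargePage",
    "MMPFNUSE_METAFILE": "MetaFile",
    "MMPFNUSE_NONPAGEDPOOL": "NonPagedPool",
    "MMPFNUSE_PAGEDPOOL": "PagedPool",
    "MMPFNUSE_PAGETABLE": "PageTable",
    "MMPFNUSE_PAGEFILEMAPPED": "PageFileMappedSection",
    "MMPFNUSE_SESSIONPRIVATE": "SessionPrivate",
    "MMPFNUSE_SYSTEMPTE": "SystemPage",
    "MMPFNUSE_PROCESSPRIVATE": "VirtualAlloc_PreTrace",
    "MMPFNUSE_WSMETADATA": "WsMetaData",
}


def classify(use: str, list_name: str, is_image: bool = False) -> Optional[str]:
    """Return the base ResidentSetPageCategory for one page, or None if unknown."""
    # Assumption: a page on the transition list is reported as StraggleIOPage
    # regardless of its "use" value.
    if list_name == "MMPFNLIST_TRANSITION":
        return "StraggleIOPage"
    if use == "MMPFNUSE_FILE":
        return "Image" if is_image else "MapFile"
    return USE_TO_CATEGORY.get(use)


print(classify("MMPFNUSE_FILE", "MMPFNLIST_ACTIVE", is_image=True))  # Image
print(classify("MMPFNUSE_SYSTEMPTE", "MMPFNLIST_ACTIVE"))            # SystemPage
```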

 

The following ResidentSetPageCategory values reliably (in principle, perhaps not in implementation) combine MMPFN_IDENTITY event information with additional event types as follows:

ResidentSetPageCategory            MMPFN_IDENTITY plus derived information           WPR Keyword

CopyOnWriteImage                   MMPFNUSE_PROCESSPRIVATE + MapFile Rundown         VAMap
CopyOnWriteMapFile                 MMPFNUSE_PROCESSPRIVATE + MapFile Rundown         VAMap
CopyOnWritePageFileMappedSection   MMPFNUSE_PROCESSPRIVATE + MapFile Rundown         VAMap
VirtualAlloc                       MMPFNUSE_PROCESSPRIVATE + VirtualAlloc Rundown    VirtualAllocation
Win32Heap                          MMPFNUSE_PROCESSPRIVATE + HeapRange Rundown       Win32HeapRanges
SessionCopyOnWriteImage            MMPFNUSE_SESSIONPRIVATE + MapFile Rundown         VAMap
Driver                             MMPFNUSE_SYSTEMPTE + Image Rundown                Loader

 

The following ResidentSetPageCategory values unreliably/incorrectly combine MMPFN_IDENTITY event information with additional event types as follows:

ResidentSetPageCategory   MMPFN_IDENTITY plus derived information     WPR Keyword

UserStack                 MMPFNUSE_PROCESSPRIVATE + Thread Rundown    ProcessThread
DriverFile                MMPFNUSE_FILE + FileName Rundown            Filename
Prefetcher                MMPFNUSE_FILE + FileName Rundown            Filename
RegistryFile              MMPFNUSE_METAFILE + FileName Rundown        Filename

 

The remaining ResidentSetPageCategory values are either not used or just not present in my system:

ResidentSetPageCategory   MMPFN_IDENTITY plus derived information

SystemCache
HyperSpace

 

The “current” implementation of the ResidentSetPageCategory classification for VirtualAlloc and Win32Heap just uses “actual” events (rather than the “Rundown” events, which are also present) and so dramatically underestimates their values (attributing the pages to VirtualAlloc_PreTrace instead).

The UserStack classification just uses the StackBase and StackLimit values from Thread Rundown/Create events, which just gives an initial view of the stack virtual address range (the initial stack commit size) and does not take account of the stack reserve size. At the time of the WPR trace, the stack may have grown and this could be determined by combining information from the thread events and the VirtualAlloc Rundown events, but such calculations are not currently performed.
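One way the combination described above might look is sketched below. It assumes that a VirtualAlloc Rundown entry can be reduced to a (base address, size) pair and that the stack's allocation is the range containing the thread's StackLimit; both are assumptions on my part, and the function name is hypothetical.

```python
from typing import Iterable, Tuple


def stack_range(stack_base: int, stack_limit: int,
                alloc_ranges: Iterable[Tuple[int, int]]) -> Tuple[int, int]:
    """Return (lowest, highest) addresses of a thread's stack region.

    stack_base is the highest stack address and stack_limit the lowest
    address known from the thread event. alloc_ranges holds (base, size)
    pairs taken from VirtualAlloc Rundown events (assumed shape).
    """
    for base, size in alloc_ranges:
        # The allocation that contains StackLimit is taken to be the stack's
        # allocation, so the stack can extend down to its base.
        if base <= stack_limit < base + size:
            return base, stack_base
    # No matching rundown entry: fall back to the thread event values alone.
    return stack_limit, stack_base


ranges = [(0x10000, 0x100000), (0x200000, 0x1000)]
low, high = stack_range(stack_base=0x110000, stack_limit=0x10C000, alloc_ranges=ranges)
print(hex(low), hex(high))  # 0x10000 0x110000
```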

The DriverFile and Prefetcher classification just matches filename extensions against “.sys” and “.pf” – perhaps a reasonable heuristic but normally just an uninteresting subdivision of pages on the standby list.

The current RegistryFile classification is just pure nonsense. Firstly, it uses name matching on the filenames of “MetaFile” pages; in this context, “meta files” are file system metafiles such as $Mft, $LogFile, index files (directories), etc. (and, therefore, not data files such as registry hives). Secondly, it just looks for a few familiar hive names such as “SYSTEM”, “SECURITY” and “DEFAULT”. One could try to rescue this classification by using the “RegistryHive” keyword to obtain a rundown of hive filenames and matching those names against filenames of “MapFile” pages, but one would still have to allow for differences in the filenames (e.g. \Device\HarddiskVolume3\Windows\system32\config\SOFTWARE vs. \SystemRoot\System32\Config\SOFTWARE).
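The filename differences mentioned above could be smoothed over with a normalization step before comparing. The following is a rough sketch only; it assumes that \SystemRoot refers to \Windows on the volume in question and that the device prefix always has the form \Device\HarddiskVolumeN, neither of which is guaranteed.

```python
import re

# Matches a leading "\Device\HarddiskVolumeN" prefix (assumed form).
_DEVICE_PREFIX = re.compile(r"^\\device\\harddiskvolume\d+", re.IGNORECASE)
_SYSTEM_ROOT = "\\systemroot"


def normalize(path: str, system_root: str = r"\Windows") -> str:
    """Reduce a kernel-style path to a lower-case, volume-relative form."""
    p = _DEVICE_PREFIX.sub("", path)
    if p.lower().startswith(_SYSTEM_ROOT):
        # Assumption: \SystemRoot is the Windows directory on this volume.
        p = system_root + p[len(_SYSTEM_ROOT):]
    return p.lower()


a = normalize(r"\Device\HarddiskVolume3\Windows\system32\config\SOFTWARE")
b = normalize(r"\SystemRoot\System32\Config\SOFTWARE")
print(a == b)  # True
```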

Page Frame Number (PFN) Pages

The ability to group pages based on list, use, process, file, page priority and pool tag means that there are few large counts of pages that cannot be broken down into smaller counts. One such large, opaque, block is the ResidentSetPageCategory SystemPage. There is however an event in a ResidentSet trace that can divide this block: “Memory: KeMemUsage”; this event contains the virtual address of the PFN database and its page count. The PFN database typically forms a large portion of the SystemPage category, so separating it from that category could be helpful. The “Memory: KeMemUsage” event actually contains a “UsageType” field, but currently only one usage type is defined:

ntoskrnl!_PERFINFO_KERNELMEMORY_USAGE_TYPE
    PerfInfoMemUsagePfnMetadata = 0n0
    PerfInfoMemUsageMax = 0n1
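Separating the PFN database pages from the rest of SystemPage amounts to a range check. A minimal sketch, assuming that the SystemPage entries carry a virtual address and that pages are 4 KiB (the function names are hypothetical):

```python
from typing import Iterable, List, Tuple

PAGE_SIZE = 4096  # assumed 4 KiB pages


def in_pfn_database(virtual_address: int, pfn_db_base: int,
                    pfn_db_page_count: int, page_size: int = PAGE_SIZE) -> bool:
    """True if the address lies in the range reported by "Memory: KeMemUsage"."""
    end = pfn_db_base + pfn_db_page_count * page_size
    return pfn_db_base <= virtual_address < end


def split_system_pages(page_vas: Iterable[int], pfn_db_base: int,
                       pfn_db_page_count: int) -> Tuple[List[int], List[int]]:
    """Split SystemPage virtual addresses into (PFN database, everything else)."""
    pfn_pages, other_pages = [], []
    for va in page_vas:
        if in_pfn_database(va, pfn_db_base, pfn_db_page_count):
            pfn_pages.append(va)
        else:
            other_pages.append(va)
    return pfn_pages, other_pages


pfn, other = split_system_pages([0x10000000, 0x10001000, 0x10002000],
                                pfn_db_base=0x10000000, pfn_db_page_count=2)
print(len(pfn), len(other))  # 2 1
```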

Bad Pages

Depending on the duration of the ResidentSet trace, there may be one or more “Memory: MemInfo” events in the trace. These events are the raw data for the WPA “Memory Utilization” graph; they contain interesting summary information (including standby list repurposed counts) and other counts, including the number of “bad” pages. The bad page count is useful because when examining the “Memory: PageInMemory” events, pages with a “list” value of “bad” are likely to be found. The “list” value is represented in 3 bits and all 8 possible values have established meanings but it is possible to overload the meaning of the bad list value. When “!memusage” examines the PFN database, it is able to use “magic” values in other fields of the MMPFN structure to overload the meaning of the “bad” list, but this distinction cannot be deduced from the contents of the MMPFN_IDENTITY structure. “!memusage” describes these pages as “SLIST/Temp”.

Combined Pages

I was not aware of “Combined Pages” until I started looking at this topic; they are described in “Windows Internals, Seventh Edition, Part 1” Chapter 5, Section “Memory combining”. Combined pages can be identified from information in the MMPFN_IDENTITY and WPA does this; when exploring the ResidentSetPageCategory PageFileMappedSection pages (PFMappedSection) one will probably find “CombinedPage” pages. The ResidentSet trace will include a “Memory: MMStat” event that includes statistics about page combining activity.

Non-Tradeable Pages

To be done.

Summary of ResidentSet Tracing and Analysis

The ability to share and store the raw data, to combine information from several kernel data structures without groveling through undocumented data, and to organize the data with versatile user interface features makes this a useful feature.

Reference Set

The WPR keywords used in a ReferenceSet trace do not differ greatly from those used in a ResidentSet trace; one keyword is omitted (DiskIO) and three additional keywords are added: FootPrint, MemInfoWS and ReferenceSet.

As far as I can tell, the FootPrint keyword just ensures that Memory, Pool and Session rundowns are included in the trace, but other keywords in the ReferenceSet trace trigger these rundowns too. MemInfoWS causes a “Memory: MemInfoExWS” event to be logged twice per second, containing summary information (total counts) for shared pages in each working set; these events do not seem to be used in the “Reference Set” graph.

As would be expected, the ReferenceSet keyword is essential for a reference set analysis. Its effect seems to be, in quick succession, to log a “start” mark, empty all working sets and log a sequence of “Memory: InMemoryActive” events; it also logs a “stop” mark when the trace is stopped (but before the rundowns begin). “Memory: InMemoryActive” events contain the same information as “Memory: PageInMemory” events (a MMPFN_IDENTITY structure) but are only logged for “active” list PFN database entries.

In addition to their rundown behaviour, the keywords Memory, VirtualAllocation and VAMap generate events whenever relevant operations occur (e.g. adding pages to working sets, mapping/unmapping files or pagefile, allocating/freeing virtual memory or pool); these events occur in a ResidentSet trace too, but they can be ignored for Resident Set analysis/graphing.

Column Names

More than 60 column names are available in the Reference Set View Editor. The names often give an indication of how a view “works” (e.g. which event types are used and how information from events is combined to present the view) and not being able to guess what a name means is an indication that one might not have fully understood the purpose of a view.

The first change to the view that I wanted to make was to switch from megabytes to pages as the measure of size; most of the potential column names for page count include the text “w/o Offer” and I had no idea what this might imply. Subsequently, I guessed that this text is related to Video memory offer and reclaim (there is also a column name of “VidMm”, which adds weight to the guess). There is an event provider that could possibly generate events relevant to this activity (Microsoft-Windows-DxgKrnl) but this provider is not included in the “Reference Set” recording profile and the topic is too far from my interests to pursue in more detail.

Some of the column names, such as “COFF Group” (known to me in the context of object/executable file formats) seem irrelevant with respect to event tracing; as would be expected, no values appeared under this (and similar) column names when they were added to a view.

Reaccess

The column name “Reaccess”, combined with the potential values in the “Access Reason” (ReferenceSetReferenceReason) and “Release Reason” (ReferenceSetReleaseReason) columns, helped me to infer how the Reference Set view possibly works.

ReferenceSetReferenceReason   Related Event

PrivatePageAccess             Memory: PageAccess
SharablePageAccess            Memory: PageAccessEx
PageRangeAccess               Memory: PageRangeAccess
ActiveRundown                 Memory: InMemoryActive
PoolAllocate                  Pool: Allocate
PoolFree                      Pool: Free
PageCombine                   Memory: PageCombine
Reclaim                       Microsoft-Windows-DxgKrnl

 

ReferenceSetReleaseReason     Related Event

PageRelease                   Memory: PageRelease
PageRangeRelease              Memory: PageRangeRelease
VirtualAddressRangeEnd        Memory: VirtualFree (Flags includes MEM_RELEASE)
PageFileMappedSectionDelete   Section: Delete
PageCombine                   Memory: PageCombine
VirtualAddressRangeDecommit   Memory: VirtualFree (Flags includes MEM_DECOMMIT)
PoolFree                      Pool: Free
PoolAllocate                  Pool: Allocate
ProcessEnd                    Process: Delete
ThreadEnd                     Thread: Delete
RemovedFromWorkingSet         Memory: RemoveFromWS
Offer                         Microsoft-Windows-DxgKrnl

 

“Reaccess” seems to mean that a page has been “accessed” (added to a working set) twice without an intervening “release” (removal/eviction from a working set).
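This interpretation of “Reaccess” can be written down as a few lines of state tracking. The sketch below is my guess at the logic, not WPA's implementation; the event stream is reduced to hypothetical ("access" or "release", key) tuples, where the key identifies a page within a working set.

```python
from typing import Hashable, Iterable, Tuple


def count_reaccesses(events: Iterable[Tuple[str, Hashable]]) -> int:
    """Count accesses to a page that is already tracked as being in a working set."""
    in_working_set = set()
    reaccesses = 0
    for kind, key in events:
        if kind == "access":
            if key in in_working_set:
                # Second access without an intervening release.
                reaccesses += 1
            else:
                in_working_set.add(key)
        elif kind == "release":
            in_working_set.discard(key)
    return reaccesses


page = (4, 0x7FF612340000)  # (process id, virtual address), hypothetical key
trace = [("access", page), ("access", page), ("release", page), ("access", page)]
print(count_reaccesses(trace))  # 1
```

If a release path is missing from the event stream, later accesses to the same page are counted as reaccesses, which would be consistent with the large counts reported when release reasons are incomplete.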

The ReferenceSetReferenceReason values cover all of the paths causing a page to be added to a working set plus other values: Reclaim (video memory) and Pool Allocate/Free.

Pool Allocate/Free is orthogonal to working set growth/reduction; my guess is that both “Allocate” and “Free” are included as “access” reasons and both as “release” reasons to “balance the books”. The default keywords used in a Reference Set trace do not include the “Pool” keyword; the “Pool” events in the trace are just the “large”/“big” allocations (size plus overhead greater than or equal to one page) and frees, which are included via the “Memory” keyword.

The ReferenceSetReleaseReason values do not cover all of the paths causing a page to be removed from a working set (if one ignores RemovedFromWorkingSet, which is not enabled in the default capture) but do contain “other” values: Offer (video memory), Pool Allocate/Free (again) and ThreadEnd (presumably included to track user stack releases, but VirtualFree does this better). Missing from the list is the unmapping of files.

If one adds RemovedFromWorkingSet events to a trace (via the “WorkingSet” keyword), it is possible to completely eliminate “reaccess” occurrences; if one does not collect RemovedFromWorkingSet events but does use unmap file as a release reason, it is possible to reduce the number of “reaccess” occurrences to a handful (tens of events). Even on short traces, the existing WPA algorithm reports hundreds of thousands of such reaccess occurrences.

Enabling the “WorkingSet” keyword does mean that two additional high volume bursts of events are added to a Reference Set trace: another rundown of the active pages in the PFN database and all of the working set evictions that occur when the working sets are emptied at the start of the trace.

Verifying the accuracy of the working set tracking

A Reference Set trace contains a rundown of the active pages at the start of tracking (after working sets have been emptied) and a full rundown of the PFN database when tracking stops (when the trace is stopped). It is possible to compare the result of “initial state plus tracked page insertions/removals” against “final state” and the inconsistencies should be small; because it takes time to record the initial and final states, some insertions and removals that occur during the rundowns will not be attributed appropriately.
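The comparison can be sketched as a replay of the tracked events over the initial state followed by a set difference against the final state. The event shape ("insert" or "remove", key) is hypothetical, and the final state is assumed to have been reduced to the active pages of the final rundown.

```python
from typing import Hashable, Iterable, Set, Tuple


def replay(initial: Iterable[Hashable],
           events: Iterable[Tuple[str, Hashable]]) -> Set[Hashable]:
    """Apply tracked insertions/removals to the initial active-page rundown."""
    state = set(initial)
    for kind, key in events:
        if kind == "insert":
            state.add(key)
        elif kind == "remove":
            state.discard(key)
    return state


def inconsistencies(initial, events, final):
    """Return (tracked but absent from final, present in final but untracked)."""
    tracked = replay(initial, events)
    final_set = set(final)
    return tracked - final_set, final_set - tracked


stale, untracked = inconsistencies({1, 2}, [("insert", 3), ("remove", 1)], {2, 3, 4})
print(stale, untracked)  # set() {4}
```

Both result sets should be small if the tracking is working; entries that do appear are candidates for insertions or removals that happened while a rundown was in progress.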

The elimination of “reaccess” occurrences is also a strong sign that the working set tracking is functioning correctly.

Special handling for some events

Some “Memory: PageAccessEx” events indicate the conversion of a shared page to a process private page (e.g. when a page in a copy-on-write region is written to), and these events need to be detected if one is to accurately track working set state.

Some other “Memory: PageAccessEx” events indicate a ProcessId of 0; the stack at the time of such events typically looks like this:

MiLogMapFileEvent
 MiMapViewOfImageSection
 MiMapViewOfSection
 MiMapViewOfSectionExCommon
 MmMapViewOfSectionEx
 MiMapProcessExecutable
 MmInitializeProcessAddressSpace
 PspAllocateProcess
 NtCreateUserProcess

The page accesses are being recorded during the creation of the virtual address space of a new process; the ThreadId is that of the thread that called CreateProcess and can be used to “collect” the pages that are accessed. When the “Process: Create” event occurs, the attribution of the collected pages can be transferred to the new process.
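The deferred attribution described above can be sketched as two dictionaries: one keyed by the creating thread while the process does not yet exist, and one keyed by process. The class and method names are hypothetical, and it is an assumption that the “Process: Create” event can be associated with the ThreadId of the creating thread.

```python
from collections import defaultdict


class PendingAttribution:
    """Hold pages accessed with ProcessId 0 until the new process is known."""

    def __init__(self):
        self.pending = defaultdict(list)  # creator thread id -> pages
        self.owned = defaultdict(list)    # process id -> pages

    def on_page_access(self, process_id: int, thread_id: int, page: int) -> None:
        if process_id == 0:
            # The address space is still being built; collect by creating thread.
            self.pending[thread_id].append(page)
        else:
            self.owned[process_id].append(page)

    def on_process_create(self, new_process_id: int, creator_thread_id: int) -> None:
        # Transfer everything collected for the creating thread to the new process.
        pages = self.pending.pop(creator_thread_id, [])
        self.owned[new_process_id].extend(pages)


tracker = PendingAttribution()
tracker.on_page_access(process_id=0, thread_id=1234, page=0x1000)
tracker.on_page_access(process_id=0, thread_id=1234, page=0x2000)
tracker.on_process_create(new_process_id=5678, creator_thread_id=1234)
print(tracker.owned[5678])  # [4096, 8192]
```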

The “Memory: PageAccessEx” events include a bit that indicates whether the page is/was in a user working set or a system working set; this user/system distinction must be observed when attributing a page.

Summary of ReferenceSet Tracing and Analysis

There seem to be serious flaws in how WPA attributes and tracks page ownership but the total impression given by the view is probably accurate enough for most practical purposes.