Monday, 22 December 2025

Heating curve (Heizkurve / Heizkennlinie) formula

I recently searched the web for a combination of the terms “Heizkurve” (heating curve) and “Formel” (formula) and was disappointed by the results: there were plenty of hits, but none matched my expectations. The search was an iterative process in which I refined the search terms (for example, by adding the synonym “Heizkennlinie”). The best results pointed to a formula like the following:

$$T_{V}=T+S\cdot k\cdot\left(T-T_{A}\right)^{m}$$

Some of the search results used curve-fitting methods (e.g. least squares) to calculate values for k and m from data taken from printed heating-curve diagrams. The purpose of “m” was adequately described (it approximates the non-linearity of the relationship between the temperature difference between a heat emitter (underfloor heating, radiator) and its surroundings and the amount of energy that the emitter delivers to those surroundings), but the origin of its value (for example 0.83) remained unclear. “k” was an even bigger mystery: its value (for example 1.83) was needed merely to fit the proposed formula to the observed diagrams.

For most practical purposes the diagrams are entirely sufficient; the interest in a formula is mainly intellectual.

I knew that documentation (manuals) existed for my heating controller (Hoval TopTronic, EbV THETA) and that the controller offers different access levels for information and settings (user, expert, manufacturer (OEM, Original Equipment Manufacturer)). By reading these sources carefully and applying Occam's razor, I examined the possible effects of various settings.

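Heating curve (HEIZKURVE) [USER] → S
Room day temperature (RAUM TAG) [USER] → TS
Limit temperature for summer shut-off (Grenztemperatur für die Sommerabschaltung) [SYSTEM 04]
Climate zone (Klimazone) [SYSTEM 09] → TZ
Building type (Gebäudeart) [SYSTEM 10] → averaging of TA
Heating system exponent (Heizsystem (Exponent)) [MISCHER 02] → m
Room factor (Raumfaktor) [MISCHER 04] → F
Heating curve adaptation (Adaption der Heizkurve) [MISCHER 05]
Maximum temperature limit (Maximaltemperaturbegrenzung) [MISCHER 13]
Temperature boost (Temperaturüberhöhung) [MISCHER 14] → TÜ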

This led to the following formula:

$$T_{V}=T+S\cdot\left(T-T_{A}\right)^{\frac{1}{m}}\cdot\left(T-T_{Z}\right)^{\frac{m-1}{m}}+T_{U}$$

initially with T = TS (this is refined later), and where TV = flow temperature (Vorlauftemperatur), TS = room setpoint temperature, TA = outside temperature, TZ = climate zone, TÜ = temperature boost (Temperaturüberhöhung, written T_U in the formula), S = slope (Steilheit), m = heating system (exponent), F = room factor.
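The formula can be evaluated directly. The following is a minimal sketch, not the controller's actual implementation: the function name, the argument names and the clamp for outside temperatures at or above T are assumptions made for illustration.

```python
def flow_temperature(t, s, t_a, t_z, m, t_u=0.0):
    """Flow temperature T_V according to the formula above.

    t   : room setpoint T (initially T_S), in °C
    s   : slope S of the heating curve
    t_a : outside temperature T_A, in °C
    t_z : climate zone temperature T_Z, in °C (must be below t)
    m   : heating system exponent
    t_u : temperature boost T_U, in K
    """
    if t_a >= t:
        # Assumption: no heat demand when it is at least as warm outside
        # as the setpoint (the power term is zero or undefined there).
        return t + t_u
    return t + s * (t - t_a) ** (1 / m) * (t - t_z) ** ((m - 1) / m) + t_u


# Hypothetical example values: setpoint 21 °C, slope 1.0, exponent 1.3,
# climate zone -10.1 °C, outside temperature 0 °C.
print(round(flow_temperature(21.0, 1.0, 0.0, -10.1, 1.3), 1))
```

A useful sanity check is the design point T_A = T_Z: there the two power terms multiply to T − T_Z, so the formula reduces to T + S·(T − T_Z) + T_U.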

The first setting to consider is “Heizsystem (Exponent)” (heating system exponent, parameter 02 in the MISCHER (mixer) settings at expert level). In some systems this merely serves as a label for the type of heating system (e.g. 1 → underfloor heating, 2 → radiator heating, etc.); in others it is a number between 1.0 and 10.0 (with recommended values such as 1.1 for underfloor heating and 1.3 for radiator heating). I believe that the reciprocal of this value is the “exponent” in the heating curve formula (the reciprocal of 0.83 is 1.2). This value is often called the “m value”, so I use the letter “m” in the formula.

One way of rescaling the temperature difference is to take the ratio of the largest temperature difference to that same difference raised to the exponent 1/m. The maximum temperature is TS, the room setpoint temperature; the minimum temperature is probably the value of the Klimazone (climate zone) parameter (parameter 09 in the SYSTEM settings at expert level).

Depending on the heating controller, the climate zone parameter is either an index into a list of numbered climate zones or a temperature between -20 °C and 0 °C that represents the lowest outside temperature to be expected (websites such as Klimakarte | BWP | waermepumpe.de provide such estimates; for Grenzach-Wyhlen the value is -10.1 °C). I use TZ to represent this value. The following term describes the rescaling needed to restore the original expectation (at TA = TZ, its product with the 1/m power term reduces to T − TZ):

$$\left(T-T_{Z}\right)^{\frac{m-1}{m}}$$

There are also expert settings that allow the actual room temperature to be included in the calculation of the flow temperature: Raumaufschaltung (room influence) and Raumfaktor (room factor). Several factors can make using this feature inadvisable (for example, if the room sensor is not positioned suitably for measuring the temperature in the building). If it is enabled, however, it can be integrated into the heating curve formula as a correction to the room setpoint temperature. I use TI for the actual room temperature and F for the room factor as a multiplier (i.e. Raumfaktor / 100, since it is specified as a percentage). T in the formula can now be interpreted as follows:

$$T=T_{S}+\left(T_{S}-T_{I}\right)\cdot F$$
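As a small illustration of this correction (a sketch only; the function and parameter names are hypothetical, and the room factor is passed as the percentage shown by the controller):

```python
def effective_setpoint(t_s, t_i, room_factor_percent):
    """Corrected setpoint T = T_S + (T_S - T_I) * F, with F = Raumfaktor / 100."""
    f = room_factor_percent / 100.0
    return t_s + (t_s - t_i) * f


# A room that is 1 K too cold with a room factor of 100 % raises T by 1 K.
print(effective_setpoint(21.0, 20.0, 100))
```

When the room is warmer than the setpoint, the same expression lowers T, which in turn lowers the calculated flow temperature.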

Using the Desmos web application, the parameters can be adjusted to display the heating curve for the current conditions: https://www.desmos.com/calculator/szgyrui9ng. In contrast to the usual presentation of heating curves (temperatures decreasing to the right), this graph uses the usual mathematical convention (values decreasing to the left).


Limits apply to the calculated flow temperature. At the upper end, various parameters limit the boiler and flow temperatures, for example the maximum temperature limit (Maximaltemperaturbegrenzung, parameter 13 in the MISCHER settings at expert level). At the lower end there is the “Grenztemperatur für die Sommerabschaltung” (limit temperature for summer shut-off, parameter 04 in the SYSTEM settings at expert level). This parameter is combined with the building type (Gebäudeart, parameter 10 in the SYSTEM settings at expert level), which determines the period over which the outside temperature is averaged (e.g. 8 hours for medium-weight construction).

In addition to the adjustable limits on the flow temperature, there are practical considerations that limit its value. My heat generator has a maximum output of 12.0 kW and a maximum gas consumption of 1.3 m³/h (1 m³ of gas corresponds to about 10.6 kWh); these values correspond to a heat generator efficiency of roughly 90 %. The sum of the flow rates through my balancing valves is about 12 litres per minute (about 700 litres, 700 kg or 0.7 m³ of water per hour); the value of 0.7 m³ is also compatible with the control characteristic of the pump (proportional pressure, stage II, 13 W). Applying:

$$Q=m\cdot c\cdot\Delta T$$

where (Q is the heat output and m is here the mass flow rate of the water, not the exponent of the heating curve):

$$Q\approx 12000\text{ W}$$
$$c\approx 4184\text{ J}\cdot\text{kg}^{-1}\cdot\text{K}^{-1}\text{ (water at room temperature)}$$
$$m\approx 700\text{ kg}/3600\text{ s}$$

gives a temperature spread (∆T) of:

$$\Delta T\approx 15\text{ °C}$$

As soon as the difference between the flow and return temperatures reaches 15 °C (in my case), the heat generator is running at full output and cannot raise the flow temperature any further (assuming that conditions such as the inside and outside temperatures stay the same). Raising the room setpoint temperature of the heating curve would be pointless, because the heat generator is already working at its maximum.
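The arithmetic behind these figures can be reproduced with a few lines. This is only a sketch using the values quoted above; the function names are hypothetical.

```python
SPECIFIC_HEAT_WATER = 4184.0  # J/(kg*K), water at room temperature


def temperature_spread(power_w, mass_flow_kg_per_s, c=SPECIFIC_HEAT_WATER):
    """Solve Q = m * c * dT for dT (Q as power in W, m as mass flow in kg/s)."""
    return power_w / (mass_flow_kg_per_s * c)


def efficiency(output_kw, gas_m3_per_h, kwh_per_m3):
    """Ratio of heat output to the energy content of the gas burned per hour."""
    return output_kw / (gas_m3_per_h * kwh_per_m3)


spread = temperature_spread(12000.0, 700.0 / 3600.0)
eta = efficiency(12.0, 1.3, 10.6)
print(f"spread = {spread:.2f} K, efficiency = {eta:.1%}")
```

The result is a spread slightly below 15 K and an efficiency slightly below 90 %, in line with the rounded values in the text.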

There is an additional setting, “Adaption der Heizkurve” (heating curve adaptation, parameter 05 in the MISCHER settings at expert level), which overrides many of the settings mentioned above in order to determine “good” values. This setting is apparently meant to stay enabled for a few days, after which the values it has determined are transferred manually to the other settings before the adaptation process is disabled.

Returning to the room sensor: by default (factory setting) it is merely a measured value that can be displayed. As already mentioned, its value can be fed into the calculation of the flow temperature. There are two further uses: as a room thermostat (parameter 09 in the MISCHER settings at expert level), switching the heating off when the room setpoint temperature is reached, and as a substitute for the outside temperature measurement (in PID (proportional-integral-derivative) control mode).

A look at the settings of the heating controller revealed that I had been labouring under further misunderstandings.

I had assumed that, outside the switching times, the RAUM NACHT (room night) temperature (setback temperature) determines the room setpoint and that the flow temperature is controlled accordingly. However, there is a setting “Art des reduzierten Betriebs” (type of reduced operation, parameter 01 in the MISCHER settings at expert level) that can take the values ECO (shut-off mode with frost protection) or ABS (setback mode), with ECO being the default. ECO means that there is no heating unless the actual room temperature falls below the room frost protection limit (parameter 08 in the MISCHER settings at expert level), which has a default value of 10 °C and is set to 15 °C in my heating controller.

I had further assumed that the system starts heating at the switch-on time. However, there is a setting “Einschaltoptimierung” (switch-on optimisation, parameter 06 in the MISCHER settings at expert level) that starts heating before the switch-on time and thus changes the interpretation from “start of heating” to “start of occupancy” (i.e. the time at which the desired room temperature has been reached). The default setting is 1 hour, but the actual lead time is calculated as follows:

$$\left(\frac{T-T_{A}}{T-T_{Z}}\right)\cdot t$$

Here T is the room setpoint temperature including the room influence factor, TA is the outside temperature, TZ is the climate zone temperature and t is the configured switch-on optimisation. For me this typically means that the heating starts 30 to 40 minutes earlier than expected.
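A worked example of this lead-time calculation follows. It is a sketch under assumptions: the function name is hypothetical, the outside temperature of 3 °C is an example value, and returning zero when it is warmer outside than the setpoint is my own simplification. Whether the controller caps the result at t for outside temperatures below TZ is not known to me, so the sketch does not cap it.

```python
def preheat_minutes(t, t_a, t_z, configured_minutes=60.0):
    """Lead time before the switch-on time: ((T - T_A) / (T - T_Z)) * t."""
    if t_a >= t:
        # Assumption: no early start is needed without heat demand.
        return 0.0
    return (t - t_a) / (t - t_z) * configured_minutes


# Setpoint 21 °C, outside 3 °C, climate zone -10.1 °C, default of 60 minutes.
print(round(preheat_minutes(21.0, 3.0, -10.1)))
```

With these example values the lead time is about 35 minutes, which matches the 30 to 40 minutes observed.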

The “FUEHLER-ABGL” (Fühler-Abgleich, sensor calibration) area also contains various settings for correcting temperature readings. Most of these settings are at manufacturer (OEM) level, which means that the values should only be inspected, not changed. Only the correction of the room sensor temperature is an expert-level setting.

I had also overlooked a feature of the heating controller that makes it possible to display additional information: holding down a button while a value is displayed shows an alternative but related value, typically the setpoint instead of the actual value that is normally shown.

Two different documents describe the feature as follows:

· All displayed temperature values are the current values. Holding down the rotary push-button displays the corresponding setpoint.

· If the rotary knob is pressed while a system temperature is displayed, the associated setpoint appears in the display to the left of the current (actual) value. Exception: the averaged outside temperature.

Wednesday, 17 July 2024

Windows high-resolution time stamps (QueryPerformanceCounter)

The Microsoft article “Acquiring high-resolution time stamps” provides a good deal of useful information about high-resolution timing under Windows, and Simon Anciaux has provided the source code of a routine that matches the Windows 11 QueryPerformanceCounter routine (see “QueryPerformanceFrequency returning 10mhz bug”).

Nonetheless, there is more that can be said and that might provide additional reassurance that one’s understanding of high-resolution timing is accurate.

There is a Windows kernel variable named HalpRegisteredTimers that points to a linked list of registered timers; the routine that registers the timers is passed a pointer to a TIMER_INITIALIZATION_BLOCK (whose format is defined in file nthalext.h). On my PC, 8 timers appear in the list and one can infer some interesting values from the information in the calls to register timers.

The KnownType values are members of the enum KNOWN_TIMER_TYPE (also defined in nthalext.h). In case the names are not immediately recognizable, here are some brief descriptions:

TimerProcessor: the processor Time-Stamp Counter (TSC).

TimerART: the processor Always Running Timer, perhaps most commonly encountered in conjunction with “Intel Processor Trace”.

TimerApic: Advanced Programmable Interrupt Controller (APIC).

TimerAcpi: Advanced Configuration and Power Interface (ACPI) power management timer.

TimerCmosRtc: the Real Time Clock (RTC).

TimerHpet: High Precision Event Timer (HPET)

| KnownType | Identifier | CounterBitWidth | CounterFrequency | Capabilities |
| --- | --- | --- | --- | --- |
| TimerProcessor | 0 | 64 | 1992000000 (24000000 × 166 / 2); measured: 1991998774, 1991998859 | TIMER_PER_PROCESSOR, TIMER_COUNTER_READABLE, TIMER_COUNTER_WRITABLE, TIMER_SUPPORTS_ADVANCED_QUERY |
| TimerART | 0 | 64 | 24000000; measured: 24000038, 24000039 | TIMER_PER_PROCESSOR, TIMER_COUNTER_READABLE, TIMER_ALWAYS_ON, TIMER_AUXILIARY |
| TimerApic | 0 | 32 | 187500 (24000000 / 128); measured: 187500, 187497 | TIMER_PER_PROCESSOR, TIMER_COUNTER_READABLE, TIMER_ONE_SHOT_CAPABLE, TIMER_PERIODIC_CAPABLE, TIMER_GENERATES_INTERNAL_INTERRUPTS |
| TimerAcpi | 0 | 24 | 3579545 | TIMER_COUNTER_READABLE |
| TimerCmosRtc | 0 | 64 | 2048 | TIMER_PSEUDO_PERIODIC_CAPABLE, TIMER_ONE_SHOT_CAPABLE, TIMER_SINGLE_INIT |
| TimerHpet | 0 | 32 | 24000000 | TIMER_COUNTER_READABLE |
| TimerHpet | 1 | 31 | 24000000 | TIMER_PSEUDO_PERIODIC_CAPABLE, TIMER_ONE_SHOT_CAPABLE, TIMER_PERIODIC_CAPABLE, TIMER_GENERATES_8259_INTERRUPTS |
| TimerHpet | 2 | 31 | 24000000 | TIMER_PSEUDO_PERIODIC_CAPABLE, TIMER_ONE_SHOT_CAPABLE, TIMER_GENERATES_8259_INTERRUPTS |
 

When registering the timers on my PC, the CounterFrequency of the TimerProcessor, TimerART and TimerApic timers is specified as zero; actual values are determined by comparing these timers to another timer whose nominal frequency is considered to be reliable. Windows has a preference list of which timer should be used as a reference and the first choice is TimerAcpi; on my PC, this timer is available and is used. The timers are compared over a period of one eighth of a second (125 milliseconds).

Section 19.7.3 (“Determining the Processor Base Frequency”) of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3, contains a table of “Nominal Core Crystal Clock Frequency” values; for my PC, the value is 24 MHz. The CPUID instruction with EAX set to 15H (Time Stamp Counter and Nominal Core Crystal Clock Information Leaf) returns 2 as the denominator and 166 as the numerator of the TSC/“core crystal clock” ratio; the nominal frequency of the core crystal clock in Hz is not enumerated on my CPU.

These values allow a nominal counter frequency for TimerProcessor to be calculated (1992000000 Hz). The other values in the table above (1991998774 and 1991998859) are the results of two runs of the measuring process against the ACPI PM timer.

TimerART runs at the “Nominal Core Crystal Clock Frequency” (24000000 Hz); again, the other values in the table above (24000038 and 24000039) are the results of two runs of the measuring process against the ACPI PM timer. Although there is no instruction to read the ART (Windows determines its value by evaluating (__rdtscp() - __rdmsr(IA32_TSC_ADJUST)) * CPUID.15H:EAX[31:0] / CPUID.15H:EBX[31:0]), it is measured separately.
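The ART expression quoted above can be sketched with plain integer arithmetic. The function name is hypothetical and the 64-bit wrap of the subtraction is my assumption about how the unsigned arithmetic behaves; Python integers are unbounded, so no 128-bit intermediate needs special handling here.

```python
MASK64 = (1 << 64) - 1


def art_from_tsc(tsc, tsc_adjust, cpuid15_eax, cpuid15_ebx):
    """ART = (TSC - IA32_TSC_ADJUST) * CPUID.15H:EAX / CPUID.15H:EBX.

    EAX is the denominator and EBX the numerator of the
    TSC/"core crystal clock" ratio (2 and 166 on the PC described here).
    """
    adjusted = (tsc - tsc_adjust) & MASK64  # unsigned 64-bit subtraction
    return adjusted * cpuid15_eax // cpuid15_ebx


# One second of nominal TSC ticks corresponds to one second of ART ticks.
print(art_from_tsc(1_992_000_000, 0, 2, 166))
```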

I have assumed that the APIC timer is also clocked at a nominal 24 MHz and uses a divider of 128, giving a nominal counter frequency of 187500 Hz; the other values (187500 and 187497) are the measurement results.
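To put the nominal and measured frequencies side by side, the deviations can be expressed in parts per million. This is a small sketch using only the figures from the table; the helper names are hypothetical.

```python
CRYSTAL_HZ = 24_000_000  # nominal core crystal clock frequency


def nominal_tsc_hz(crystal_hz, numerator, denominator):
    """Nominal TSC frequency from the CPUID.15H ratio (EBX / EAX)."""
    return crystal_hz * numerator // denominator


def deviation_ppm(measured_hz, nominal_hz):
    """Deviation of a measured frequency from its nominal value, in ppm."""
    return (measured_hz - nominal_hz) / nominal_hz * 1e6


tsc_nominal = nominal_tsc_hz(CRYSTAL_HZ, 166, 2)
apic_nominal = CRYSTAL_HZ // 128

for name, measured, nominal in [
    ("TimerProcessor", 1_991_998_774, tsc_nominal),
    ("TimerART", 24_000_038, CRYSTAL_HZ),
    ("TimerApic", 187_497, apic_nominal),
]:
    print(f"{name}: {deviation_ppm(measured, nominal):+.2f} ppm")
```

The TSC and ART measurements stay within a couple of ppm of their nominal values; the APIC figure deviates more, which is consistent with its much lower frequency being measured over the same 125 ms window.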

In the simplest case, the value returned by QueryPerformanceCounter is the result of executing the following calculations:

_umul128(__rdtscp(out _), HypervisorSharedUserData.Factor, out ulong qpc);

qpc += HypervisorSharedUserData.Bias;

qpc += SharedUserData.QpcBias;

qpc >>= SharedUserData.Qpc.QpcShift;

SharedUserData is intended as a reference to the KUSER_SHARED_DATA structure; HypervisorSharedUserData is intended as the reference returned by the call NtQuerySystemInformation(SystemHypervisorSharedPageInformation, …); HypervisorSharedUserData.Bias is zero and HypervisorSharedUserData.Factor is the integral result of evaluating:

CpuHz is the measured TimerProcessor frequency and the evaluation is performed as a _udiv128 style calculation.
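The sequence of operations listed above can be modelled as follows. This is a sketch of the arithmetic only: the function names are hypothetical, the scale factor is passed in as a parameter rather than derived, and the factor used in the example (2^63, which simply halves the TSC value) is chosen purely for illustration and is not the value Windows uses.

```python
MASK64 = (1 << 64) - 1


def mul_high64(a, b):
    """High 64 bits of the 128-bit product of two unsigned 64-bit values."""
    return ((a & MASK64) * (b & MASK64)) >> 64


def query_performance_counter(tsc, factor, hv_bias, qpc_bias, qpc_shift):
    """Scale the TSC, add both biases (wrapping at 64 bits), then shift."""
    qpc = mul_high64(tsc, factor)
    qpc = (qpc + hv_bias) & MASK64
    qpc = (qpc + qpc_bias) & MASK64
    return qpc >> qpc_shift


# Illustrative factor of 2**63: the scaled value is tsc / 2.
print(query_performance_counter(1000, 1 << 63, 0, 0, 0))
```

Treating the factor as a 64-bit fixed-point fraction is what makes the multiply-high step equivalent to a division by a non-integral ratio.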

Auxiliary Counter routines

Windows has a group of functions with names including the string “AuxiliaryCounter”, such as QueryAuxiliaryCounterFrequency and ConvertPerformanceCounterToAuxiliaryCounter. These functions are related to the timer with the capability TIMER_AUXILIARY, if such a timer is present; on my PC, this is TimerART. The documentation for the group of functions is not very informative; I could only find one use of the functions on my PC: in file IntcAudioBus.sys (FileDescription: “Intel® Smart Sound Technology (Intel® SST) Bus”).

TSC synchronization

The Microsoft-Windows-HAL Event Tracing for Windows provider records the synchronization of the TSC values between the processors. Here is a condensed summary from a synchronization run:

The processor cycle counter on processor 1 has been probed by processor 0. A counter delta of -199 was detected. The approximate communication delay between these processors was detected to be 508.

[…]

The processor cycle counter on processor 0 was synchronized against processor 4 using an adjustment of 94 cycles on attempt 0. This resulted in a delta of -13 cycles.

The processor cycle counter on processor 1 was synchronized against processor 0 using an adjustment of 342 cycles on attempt 0. This resulted in a delta of 68 cycles.

[…]

The processor cycle counter on processor 1 has been probed by processor 0. A counter delta of 10 was detected. The approximate communication delay between these processors was detected to be 500.

[…]

The processor's cycle counters have been successfully synchronized from processor 0 within acceptable operating thresholds. The maximum positive delta detected was 10 and the maximum negative delta was -11. Synchronization executed for 7773 microseconds.

If the processors' cycle counters can be synchronized to within “acceptable operating thresholds” then it should be impossible for a thread, rescheduled on a different processor, to detect a backwards step in the TSC values.

Tuesday, 30 January 2024

Analyzing Windows heap usage with and without ETW

 

It has been a long time since I last wanted to discover if/where a program was “leaking” heap allocations. Most programs that I developed myself just performed some task and exited; heap allocations (from all sources, including Microsoft and other third party DLLs) probably rarely exceeded a few megabytes. I coded mostly with C# (garbage collected); most heap allocations directly under my control arose from native interop and I adopted an approach of releasing memory when it was “easy” and did not obscure the main intent of the code – otherwise I “intentionally” allowed the memory to leak.

I mention the above because I am a heavy user of Event Tracing for Windows (ETW) but I had hitherto no experience of using ETW (or, indeed, any other tool) to investigate heap usage. It was only when I tried to help with a problem/question in a technical forum that I had a need to understand heap usage. The question was whether the Windows Filtering Platform API FwpmNetEventEnum unavoidably leaks heap allocations.

The first approach that came to mind was to use the User-Mode Dump Heap (UMDH) utility from the Debugging Tools for Windows kit. However, the “current” version did not seem to work. Searching the web for explanations uncovered the following quotes from other users who had encountered the problem:

According to a Microsoft employee, this is a known problem. I quote: "Yeah. It's not working and I don't know when/if it will ever be."

I also quote an email I got from a Microsoft Support guy: "Anyway, I have confirmation it is broken. The dev team owning the exe knows about it and when they can get to fixing it they will."

Fortunately, older versions of UMDH still work and it quickly became apparent that FwpmNetEventEnum does leak heap allocations. Most Fwpm* routines use RPC to the Base Filtering Engine (BFE) service to perform their function. Those Fwpm* APIs that return complex data structures mostly use an [allocate(all_nodes)] attribute in the MIDL ACF (Application Configuration File) so that the data can be freed with a single call to midl_user_free; however, that attribute was not applied to the RPC routine at the core of FwpmNetEventEnum. A subsequent call to FwpmFreeMemory just frees the top-level allocation and not the additional embedded allocations.

The absence of the [allocate(all_nodes)] attribute could be confirmed with tools that dump embedded RPC data structures; one example of a heap allocation back-trace that demonstrated that complex data structures were being allocated node-by-node was:

ntdll!RtlpAllocateHeapInternal+0x80B4E

fwpuclnt!MIDL_user_allocate+0x19

RPCRT4!NdrSafeAllocate+0x47

RPCRT4!Ndr64ComplexStructUnmarshall+0x72D

RPCRT4!Ndr64EmbeddedPointerUnmarshall+0x366

RPCRT4!Ndr64UnionUnmarshall+0x2D9

RPCRT4!Ndr64ComplexStructUnmarshall+0x5F4

RPCRT4!Ndr64pPointerLayoutUnmarshallCallback+0x234

RPCRT4!Ndr64ConformantArrayUnmarshall+0x21C

RPCRT4!Ndr64TopLevelPointerUnmarshall+0x40F

RPCRT4!Ndr64TopLevelPointerUnmarshall+0x59D

RPCRT4!Ndr64pClientUnMarshal+0x2A1

RPCRT4!NdrpClientCall3+0x40C

RPCRT4!NdrClientCall3+0xEB

fwpuclnt!FwpmNetEventEnum5+0x70


Heap Snapshots

I then turned my thoughts to understanding what type of bug could have been introduced into UMDH. There are several methods of obtaining the information needed to dump heap snapshot information (including heap allocation back-traces) about a process; the routines RtlQueryProcessDebugInformation and RtlQueryHeapInformation can both independently obtain the necessary information. UMDH seems to have taken a different approach and used the routine ReadProcessMemory and a knowledge of NTDLL internal data structures to gather the information.

The failing version of UMDH seems to have started using RtlQueryHeapInformation (with an HEAP_INFORMATION_CLASS value of HeapExtendedInformation (2)) to obtain information about heap allocations, but this information does not include any data that can be used to associate the allocation with a back-trace. There is, however, a HEAP_INFORMATION_CLASS value (5, let’s name it HeapStackTraceInformation) that returns information well suited for use by UMDH (i.e. includes information about allocated heap blocks and back-traces for the allocations).

The back-traces returned by RtlQueryHeapInformation for HeapStackTraceInformation come from a different source compared to the back-traces created and stored when the Global Flag FLG_USER_STACK_TRACE_DB is set. The back-traces used by RtlQueryHeapInformation are enabled and disabled by RtlSetHeapInformation (also with a HEAP_INFORMATION_CLASS value of 5) or by creating a value named “FrontEndHeapDebugOptions” under the Image File Execution Options (IFEO) key for an image; this value can be set by the Windows Performance Recorder (WPR) command “wpr -snapshotconfig heap -name […]” (“wpr -snapshotconfig heap -pid […]” effectively calls RtlSetHeapInformation).

When comparing the two versions of the back-trace information for a given allocation, they mostly just differ in the first frame:

HeapStackTraceInformation:

ntdll!RtlpAllocateHeapInternal+0x80b49:

e8528d0500      call    ntdll!RtlpHpStackTraceAddStack

 

FLG_USER_STACK_TRACE_DB:

ntdll!RtlpAllocateHeapInternal+0x809dd:

e8ac1cffff      call    ntdll!RtlpCallInterceptRoutine

The back-traces can also differ in the depth of the back-trace captured and stored (HeapStackTraceInformation can save more frames).

“wpr -singlesnapshot heap […]” uses EnableTraceEx2 to send an EVENT_CONTROL_CODE_CAPTURE_STATE to the Microsoft-Windows-Heap-Snapshot provider, using the EnableFilterDesc field of the EnableParameters parameter to select the “pids”. This causes RtlQueryHeapInformation with HeapStackTraceInformation to be executed in the target processes with the output being broken into chunks and logged into the trace session. Windows Performance Analyzer (WPA) can reassemble, analyze and display this data in a “Heap Snapshot” graph.

Heap Events

WPR provides another heap related command: “wpr -heaptracingconfig […]”. This command creates/sets another value under IFEO – namely TracingFlags. These flags enable aspects of the User Mode Global Logger (UMGL), including events generated by the WMI HeapTraceProvider; this provider generates events for individual heap events (HeapRangeCreate, HeapRangeReserve, HeapRangeRelease, HeapRangeDestroy, HeapCreate, HeapAllocation, HeapReallocation, HeapDestroy, HeapFree and more) and StackWalk back-traces can be configured for selected event types. WPA knows how to analyze and display these events too (in various graphs in the Memory category).

The instrumentation for these events is obviously embedded in many NTDLL heap routines; for the HeapAllocation event, the instrumentation is embedded close to the heap stack tracing calls:

ntdll!RtlpAllocateHeapInternal+0x80aec:

e817a30500      call    ntdll!RtlpLogHeapAllocateEvent

If a process was started without heap tracing enabled via IFEO, heap tracing can still be enabled by directly setting the heap tracing bit in the _PEB.TracingFlags field (perhaps via a debugger); there does not seem to be any API that performs this function.

Monday, 26 June 2023

Exploring Identity Privacy, TEAP and TTLS in conjunction with Microsoft Network Policy Server

 

Windows 10/11 clients include at least three EAP (Extensible Authentication Protocol) methods that can both be used in conjunction with 802.1X or VPN protocols and also support identity privacy/protection: PEAP (Protected EAP), TEAP (Tunnel EAP) and EAP-TTLS (EAP Tunneled Transport Layer Security).

On the server side, NPS (Network Policy Server) only supports PEAP natively but additional EAP methods can be plugged in (either as legacy methods or EAPHost based methods).

Enabling identity privacy on the client side is straightforward: just enable the feature and optionally specify a name to be used as the outer (visible) identity. How to “enable” identity privacy on the server side is less obvious; I have used quotation marks around “enable” because identity privacy is not explicitly enabled – one just has to ensure that the NPS configuration is compatible with identity privacy.

NPS has two types of policies: Connection Request Policies (CRP) and Network Policies (previously known as Remote Access Policies (RAP)). The policy type names don’t clearly demarcate their intended purposes. The Microsoft documentation says:

Connection request policies are sets of conditions and settings that allow network administrators to designate which Remote Authentication Dial-In User Service (RADIUS) servers perform the authentication and authorization of connection requests that the server running Network Policy Server (NPS) receives from RADIUS clients. Connection request policies can be configured to designate which RADIUS servers are used for RADIUS accounting.

Network policies are sets of conditions, constraints, and settings that allow you to designate who is authorized to connect to the network and the circumstances under which they can or cannot connect.

Configuring authentication (allowed authentication mechanisms, etc.) seems like something that should be part of the network policy and this indeed is where it is typically defined. However, when searching Microsoft documentation for help on configuring identity privacy, only this short “tip” is readily findable:

Tip

The NPS policy for 802.1X Wireless must be created by using NPS Connection Request Policy. If the NPS policy is created in by using NPS Network Policy, then identity privacy will not work.

The implications of this simple statement are also not immediately obvious. Awareness of the availability of a setting named “Override network policy authentication settings” suggests a possibility:

[…] if the option to Override network policy authentication settings is enabled on the Settings tab in a connection request policy, then authentication is performed in connection request policy. Otherwise, authentication is performed in network policy. Authentication can be configured in both types of policies.

RADIUS requests to NPS are processed by a “pipeline” of stages, defined at HKLM\SYSTEM\CurrentControlSet\Services\RemoteAccess\Policy\Pipeline; the stages (in order of evaluation) are:

 

| Stage | Providers | Reasons | Replays | Requests | Responses |
| --- | --- | --- | --- | --- | --- |
| 1 IAS.ProxyPolicyEnforcer | | | | 0 1 2 | 0 1 2 3 4 |
| 2 IAS.Realm | 1 | | | 0 1 | 0 |
| 3 IAS.Realm | 0 2 | | | 0 1 | 0 |
| 4 IAS.NTSamNames | 1 | | | 0 | 0 |
| 5 IAS.CRPBasedEAP | 1 | | | 0 2 | 0 |
| 6 IAS.Realm | 1 | | 0 | 0 | 0 |
| 7 IAS.NTSamNames | 1 | | 0 | 0 | 0 |
| 8 IAS.MachineNameMapper | 1 | | 0 | 0 | 0 |
| 9 IAS.BaseCampHost | | | 0 | | |
| 10 IAS.RadiusProxy | 2 | | 0 | | 0 |
| 11 IAS.ExternalAuthNames | 2 | | 0 | | 0 |
| 12 IAS.NTSamAuthentication | 1 | | 0 | 0 | 0 1 2 |
| 13 IAS.UserAccountValidation | 1 3 | 33 | 0 | 0 | 0 1 |
| 14 IAS.MachineAccountValidation | 1 | | 0 | 0 | 0 1 |
| 15 IAS.EAPIdentity | 1 | | 0 | 0 | 0 1 |
| 17 IAS.PolicyEnforcer | 1 3 | 33 | 0 | 0 | 0 1 |
| 18 IAS.NTSamPerUser | 1 3 | 33 | 0 | 0 | 0 1 |
| 19 IAS.URHandler | 1 3 | 33 | 0 | 0 | 0 1 |
| 20 IAS.RAPBasedEAP | 1 | | 0 | 0 2 | 0 |
| 21 IAS.PostEapRestrictions | 0 1 3 | | 0 | 0 | 0 1 |
| 23 IAS.ChangePassword | 1 | | 0 | 0 | 1 |
| 24 IAS.AuthorizationHost | | | 0 | | |
| 25 IAS.EAPTerminator | 0 1 | | 0 | 0 2 | 1 2 3 5 |
| 26 IAS.DatabaseAccounting | | | | | |
| 27 IAS.Accounting | | | | | |
| 28 IAS.MSChapErrorReporter | 0 1 3 | | 0 | 0 | 2 |

Providers: 0 → None, 1 → Windows, 2 → RADIUS Proxy, 3 → External Authentication

Reasons: 33 → PASSWORD_MUST_CHANGE

Replays: 0 → FALSE

Requests: 0 → ACCESS_REQUEST, 1 → ACCOUNTING, 2 → CHALLENGE_RESPONSE

Responses: 0 → INVALID, 1 → ACCESS_ACCEPT, 2 → ACCESS_REJECT, 3 → ACCESS_CHALLENGE, 4 → ACCOUNTING, 5 → DISCARD_PACKET

Each stage gets a chance to handle the request, if the current request state (provider, reason, replays, requests, responses) allows.

If the “outer” identity does not exist, stage 13 (IAS.UserAccountValidation) reports reason NO_SUCH_USER.

If the “outer” identity exists but is disabled, stage 13 (IAS.UserAccountValidation) reports reason ACCOUNT_EXPIRED.

If the “outer” identity is usable but differs from the “inner” identity, stage 20 (IAS.RAPBasedEAP) reports problem ERROR_PEAP_IDENTITY_MISMATCH.

If the authentication settings are set on the Connection Request Policy then stage 5 (IAS.CRPBasedEAP) gets a chance to handle the request and influence its handling by later pipeline stages.

The NPS log file entries for a PEAP-MSCHAPv2 authenticated VPN session with identity privacy contain the following usernames:

User-Name = "anonymous"
User-Name = "anonymous"
User-Name = "anonymous"
User-Name = "anonymous"
User-Name = "anonymous"
User-Name = "anonymous"
User-Name = "anonymous"
User-Name = "GARY"
User-Name = "GARY"
User-Name = "GARY"
User-Name = "GARY"
User-Name = "anonymous"
User-Name = "anonymous"

There are 24 entries in total (the “challenge response” entries don’t have a username value). The initial EAP and EAP TLS establishment entries contain the “outer” identity, the inner MSCHAPv2 exchanges contain the “inner” identity and the accounting start/stop entries use the “outer” identity.

Implementing “toy” server-side (authenticator) TEAP and EAP-TTLS support

Each client-side (supplicant) EAP method includes a “properties” value in the registry; below is a summary of five of the methods installed by default in Windows 10/11.

| Property | Value | TEAP | PEAP | EAP-TTLS | EAP-TLS | MSCHAPv2 |
|---|---|---|---|---|---|---|
| PropCipherSuiteNegotiation | 0x00000001 | X | X | X | X | |
| PropMutualAuth | 0x00000002 | X | X | X | X | X |
| PropIntegrity | 0x00000004 | X | X | X | X | X |
| PropReplayProtection | 0x00000008 | X | X | X | X | X |
| PropConfidentiality | 0x00000010 | X | X | | | |
| PropKeyDerivation | 0x00000020 | X | X | X | X | X |
| PropKeyStrength64 | 0x00000040 | | | | | X |
| PropKeyStrength128 | 0x00000080 | X | X | X | X | |
| PropKeyStrength256 | 0x00000100 | | | | | |
| PropKeyStrength512 | 0x00000200 | | | | | |
| PropKeyStrength1024 | 0x00000400 | | | | | |
| PropDictionaryAttackResistance | 0x00000800 | X | X | X | X | |
| PropFastReconnect | 0x00001000 | X | X | X | X | |
| PropCryptoBinding | 0x00002000 | X | X | | | |
| PropSessionIndependence | 0x00004000 | X | X | X | X | X |
| PropFragmentation | 0x00008000 | X | X | X | X | |
| PropChannelBinding | 0x00010000 | | | | | |
| PropNap | 0x00020000 | | X | | | |
| PropStandalone | 0x00040000 | X | X | X | | X |
| PropMppeEncryption | 0x00080000 | X | X | X | X | X |
| PropTunnelMethod | 0x00100000 | X | X | X | | |
| PropSupportsConfig | 0x00200000 | X | X | X | X | X |
| PropCertifiedMethod | 0x00400000 | X | | | | |
| PropHiddenMethod | 0x00800000 | | | | | |
| PropMachineAuth | 0x01000000 | X | X | X | X | X |
| PropUserAuth | 0x02000000 | X | X | X | X | X |
| PropIdentityPrivacy | 0x04000000 | X | X | X | | |
| PropMethodChaining | 0x08000000 | X | | | | |
| PropSharedStateEquivalence | 0x10000000 | X | X | X | X | |
| (unnamed) | 0x20000000 | X | | | | |
| PropReserved | 0x80000000 | | | | | |
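Since the “properties” value is a bit mask, a registry value can be decoded back into the names from the table above. The following is a minimal sketch: the dictionary is copied from the table, while the function name `decode_properties` and the sample value are hypothetical and not taken from a real registry entry.

```python
from typing import List

# Flag values and names as listed in the property table.
EAP_PROPS = {
    0x00000001: "PropCipherSuiteNegotiation",
    0x00000002: "PropMutualAuth",
    0x00000004: "PropIntegrity",
    0x00000008: "PropReplayProtection",
    0x00000010: "PropConfidentiality",
    0x00000020: "PropKeyDerivation",
    0x00000040: "PropKeyStrength64",
    0x00000080: "PropKeyStrength128",
    0x00000100: "PropKeyStrength256",
    0x00000200: "PropKeyStrength512",
    0x00000400: "PropKeyStrength1024",
    0x00000800: "PropDictionaryAttackResistance",
    0x00001000: "PropFastReconnect",
    0x00002000: "PropCryptoBinding",
    0x00004000: "PropSessionIndependence",
    0x00008000: "PropFragmentation",
    0x00010000: "PropChannelBinding",
    0x00020000: "PropNap",
    0x00040000: "PropStandalone",
    0x00080000: "PropMppeEncryption",
    0x00100000: "PropTunnelMethod",
    0x00200000: "PropSupportsConfig",
    0x00400000: "PropCertifiedMethod",
    0x00800000: "PropHiddenMethod",
    0x01000000: "PropMachineAuth",
    0x02000000: "PropUserAuth",
    0x04000000: "PropIdentityPrivacy",
    0x08000000: "PropMethodChaining",
    0x10000000: "PropSharedStateEquivalence",
    0x80000000: "PropReserved",
}


def decode_properties(value: int) -> List[str]:
    """List the names of all bits set in a 32-bit properties value."""
    names = []
    for bit in range(32):
        mask = 1 << bit
        if value & mask:
            # Bits without a name in the table are reported by their mask.
            names.append(EAP_PROPS.get(mask, f"unknown(0x{mask:08X})"))
    return names


# Hypothetical sample value, chosen only to exercise the decoder.
print(decode_properties(0x00000042))  # ['PropMutualAuth', 'PropKeyStrength64']
```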

Based on experience analyzing PEAP behaviour, I thought that it would be feasible to implement server-side (authenticator) support for EAP-TTLS and TEAP. I also expected a good deal of commonality between the implementations, so I created a generic (in the C# sense) tunnel EAP class to handle the common tasks (establishing the TLS tunnel using Schannel, EAP identity, EAP negotiation, EAP fragmentation). This class is specialized by types that handle the method-specific tasks (TLV or AVP encapsulation, master session key generation, etc.).
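As an illustration of one of the common tasks, here is a sketch of EAP-TLS style fragmentation, which the tunnel methods inherit. It is written in Python for brevity and is not the C# implementation described above. The L and M flag bits follow the EAP-TLS packet format; the function name, the `version` parameter in the low-order bits and the omission of the outer EAP header (code, identifier, length, type) are assumptions made for this sketch.

```python
import struct
from typing import List

FLAG_LENGTH_INCLUDED = 0x80  # L bit: a 4-octet total TLS message length follows
FLAG_MORE_FRAGMENTS = 0x40   # M bit: further fragments follow


def fragment_tls_data(tls_data: bytes, max_fragment: int, version: int = 0) -> List[bytes]:
    """Split TLS data into payloads, each starting with the flags octet."""
    if max_fragment <= 0:
        raise ValueError("max_fragment must be positive")
    chunks = [tls_data[i:i + max_fragment]
              for i in range(0, len(tls_data), max_fragment)] or [b""]
    last = len(chunks) - 1
    payloads = []
    for index, chunk in enumerate(chunks):
        flags = version & 0x07  # assumption: method version in the low-order bits
        prefix = b""
        if index == 0 and last > 0:
            # The first fragment of a fragmented message announces the total length.
            flags |= FLAG_LENGTH_INCLUDED
            prefix = struct.pack("!I", len(tls_data))
        if index < last:
            flags |= FLAG_MORE_FRAGMENTS
        payloads.append(bytes([flags]) + prefix + chunk)
    return payloads


fragments = fragment_tls_data(b"A" * 2500, 1000)
print([hex(f[0]) for f in fragments])  # ['0xc0', '0x40', '0x0']
```

In a real exchange each fragment except the last must be acknowledged by the peer before the next one is sent; that handshake is left out here.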

There are two development frameworks for EAP methods: legacy EAP and EAPHost. Because the implementation would be tunneling EAP-MSCHAPv2 (which is a legacy EAP implementation) and because PEAP is also a legacy EAP implementation, I chose to use the legacy EAP framework too.

Implementing EAP-TTLS without support for identity privacy was straightforward. EAP-TTLSv0 does not support cryptobinding, so all that is required is packing and unpacking EAP messages in TTLS AVP (attribute-value pair) format and calculating/obtaining the master session key (available via SetContextAttributes/QueryContextAttributes with SecPkgContext_EapPrfInfo/SecPkgContext_EapKeyBlock and “EAP-TTLSv0 Keying Material”).
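The AVP packing and unpacking can be sketched as follows. The layout (4-octet code, 1-octet flags, 3-octet length, optional 4-octet vendor ID, data padded to a 4-octet boundary) is my reading of the Diameter-style AVP format that EAP-TTLS uses, and 79 is the RADIUS EAP-Message attribute number. The function names are hypothetical, and whether the M bit should be set for a given AVP is left to the caller.

```python
import struct
from typing import List, Tuple

AVP_CODE_EAP_MESSAGE = 79  # RADIUS EAP-Message attribute number
AVP_FLAG_VENDOR = 0x80     # V bit: a 4-octet vendor ID follows the length
AVP_FLAG_MANDATORY = 0x40  # M bit


def pack_avp(code: int, data: bytes, mandatory: bool = True) -> bytes:
    """Encode one AVP without a vendor ID, padded to a multiple of 4 octets."""
    flags = AVP_FLAG_MANDATORY if mandatory else 0
    length = 8 + len(data)  # header plus data; padding is not counted
    header = struct.pack("!IB", code, flags) + length.to_bytes(3, "big")
    return header + data + b"\x00" * (-length % 4)


def unpack_avps(buffer: bytes) -> List[Tuple[int, bytes]]:
    """Decode a sequence of AVPs into (code, data) pairs."""
    avps = []
    offset = 0
    while offset < len(buffer):
        code, flags = struct.unpack_from("!IB", buffer, offset)
        length = int.from_bytes(buffer[offset + 5:offset + 8], "big")
        header_len = 12 if flags & AVP_FLAG_VENDOR else 8
        if length < header_len or offset + length > len(buffer):
            raise ValueError("malformed AVP")
        avps.append((code, buffer[offset + header_len:offset + length]))
        offset += length + (-length % 4)  # skip the padding
    return avps


# EAP-Response/Identity (code 2, identifier 1, length 6, type 1) with data "A".
eap_message = bytes([2, 1, 0, 6, 1, 0x41])
avp = pack_avp(AVP_CODE_EAP_MESSAGE, eap_message)
print(avp.hex())
print(unpack_avps(avp) == [(AVP_CODE_EAP_MESSAGE, eap_message)])  # True
```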

Adding identity privacy support involves the use of undocumented (or poorly documented) features of the Windows EAP frameworks. In particular, it relies on the “action” EAPACTION_IndicateIdentity (documentation: “Reserved for system use”) or its EAPHost equivalent EAP_METHOD_AUTHENTICATOR_RESPONSE_AUTHENTICATE (documentation: “The authenticator method has started authentication of the supplicant” – at best unhelpful and possibly totally incorrect). It seems that, although frameworks for new EAP methods are supported, it was not expected that new “tunnel” EAP methods would be needed.

The EAPACTION_IndicateIdentity action allows NPS to be informed of the inner identity which can then be passed on to NPS policy stages such as IAS.UserAccountValidation that might otherwise fail the authentication on the basis of the outer identity.

TEAP

The TEAP RFC is twice the length of the EAP-TTLS RFC (100 pages vs. 50 pages) and there is also quite a lengthy “Errata” report for the TEAP RFC. Before reading the errata, I puzzled over whether an Intermediate-Result TLV was necessary if there was only one inner method: the RFC is ambiguous (with a tendency to “not necessary”), the errata say that an Intermediate-Result TLV is mandatory, and the Windows TEAP client expects such a TLV. I also made a mistake with the “MSK Compound MAC” in the crypto-binding TLV (not truncating the value when, for example, SHA-384 is used instead of SHA-1).
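Both points can be sketched briefly. The TLV layout (mandatory bit, 14-bit type, 16-bit length), the Intermediate-Result type 10 and the status value 1 for success are my reading of the TEAP RFC, as is the fixed 20-octet size of the compound MAC field (the SHA-1 digest length). The function names are hypothetical, and the derivation of the compound MAC key and of the buffer that is MACed is not shown.

```python
import hmac
import struct

TLV_MANDATORY = 0x8000        # M bit in the 16-bit type field
TLV_INTERMEDIATE_RESULT = 10
STATUS_SUCCESS = 1
COMPOUND_MAC_LENGTH = 20      # assumption: fixed field size, equal to a SHA-1 digest


def pack_tlv(tlv_type: int, value: bytes, mandatory: bool = True) -> bytes:
    """Encode a TEAP TLV: flags and type (2 octets), length (2 octets), value."""
    type_field = (TLV_MANDATORY if mandatory else 0) | (tlv_type & 0x3FFF)
    return struct.pack("!HH", type_field, len(value)) + value


def intermediate_result_success() -> bytes:
    """Intermediate-Result TLV carrying a 2-octet success status."""
    return pack_tlv(TLV_INTERMEDIATE_RESULT, struct.pack("!H", STATUS_SUCCESS))


def compound_mac(cmk: bytes, buffer: bytes, hash_name: str = "sha384") -> bytes:
    """HMAC over the buffer, truncated to the size of the compound MAC field."""
    full = hmac.new(cmk, buffer, hash_name).digest()
    # SHA-384 yields 48 octets; only the first 20 fit the field.
    return full[:COMPOUND_MAC_LENGTH]


print(intermediate_result_success().hex())  # 800a00020001
print(len(compound_mac(b"k" * 20, b"crypto-binding buffer")))  # 20
```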

Fortunately the ETW tracing for TEAP (provider Microsoft.Windows.Eap.Teap) is very helpful. This highlighted extract shows some of the useful information in the trace data:


The keying information from the TLS tunnel that is needed for the TEAP cryptographic calculations can be obtained via SetContextAttributes/QueryContextAttributes with SecPkgContext_KeyingMaterialInfo/SecPkgContext_KeyingMaterial.