In Detail: 2019

Tuesday, 10 December 2019

Implementing an IKEv2 VPN client under Windows 10

I have previously written about the difficulties of implementing a Cisco AnyConnect VPN client using just the Extensible Authentication Protocol (EAP) framework and interfaces in Windows 10: most of what needs to be done to enable the establishment of VPN connections to an AnyConnect server can be implemented as an EAP authentication mechanism, but some requirements cannot be fulfilled within this framework. These requirements are: use of IKEv2 Vendor ID payloads, control of the IDi Identification payload and access to IKEv2 messages and derived keys for the AnyConnect authentication computations.

The first two messages in the establishment of an IKEv2 security association are sent in plain text, so one can see their content with a simple network sniffer, but subsequent messages are encrypted and if the “protocol” is not fully documented (e.g. the AnyConnect EAP mechanism) then one needs to somehow see these encrypted messages in plain text. There are several ways to do this, but if one has access to an original client and server then one way is particularly useful – develop server and client replacements in “steps”. By “steps” I mean implement the known steps of the protocol and then use the original client/server in conjunction with the new server/client to capture the next unknown packet in plain text.

For example, at the start of the process one does not know what the client will send in its second message (its first encrypted message), so one can use the original client to connect to the new server (that, at first, only knows how to handle the first, unencrypted, message exchange). After seeing what the original client sends, one can use the new client to send an equivalent message to an original server and see what it returns in plain text. With the exception of the detailed cryptographic computations used in the authentication process, one can derive an understanding of the full protocol exchanges using this technique.

At the end of the process, one has a full keying module for the protocol but, having served its purpose as a way of understanding the protocol, it is of little practical use without two other components: an implementation of the Encapsulating Security Payload (ESP) protocol and a mechanism for creating the operating system “objects” (such as network interfaces) that can be assigned IP addresses, appear in routing tables, etc.

It is perhaps not widely known, but the Windows Filtering Platform (WFP) Base Filtering Engine (BFE) contains all of the functionality necessary for the first task (using ESP to transport data) and is documented and fully supported. Two documentation entries in Using Windows Filtering Platform are particularly useful: Using Tunnel Mode and Manual SA Keying.

At this stage, it suffices to say that the first task can be implemented using documented APIs and the C# programming language (with quite a large investment of time in creating C# declarations of the WFP data structures). But before expanding on this topic, it would be reassuring to know that the second task (creating a “network interface”) can also be achieved. There appears to be no documented API for this, but the Microsoft IKEv2 implementation (part of IKEEXT) uses the DLL “vpnikeapi”. The DLL exports of vpnikeapi look very promising:

CancelProcessEapAuthPacket

CloseTunnel

CreateTunnel

FreeConfigurationPayloadBuffer

FreeEapAuthAttributes

FreeEapAuthPacket

FreeIDPayloadBuffer

FreeTrafficSelectors

GetConfigurationPayloadRequest

GetIDPayload

GetOptionalIDrPayload

GetServerEapAuthRequestPacket

GetTrafficSelectorsRequest

NewRasIncomingCall

ProcessAdditionalAddressNotification

ProcessConfigurationPayloadReply

ProcessConfigurationPayloadRequest

ProcessEapAuthPacket

ProcessTrafficSelectorsReply

ProcessTrafficSelectorsRequest

QueryEapAuthAttributes

RemoveTrafficSelectors

TunnelAuthDone

UpdateTunnel

This looks just like the set of routines that would be needed to integrate IKEv2 into an operating system as both a client and a server. The IKEv2 payloads Configuration, Identification, Traffic Selector and EAP are just those that have relevance to the operating system, “ProcessAdditionalAddressNotification” sounds as though it supports MOBIKE and there are routines to create and close a “tunnel”.

Unfortunately, there is one preparatory step that must be performed before using these routines and the implications of that step make the vpnikeapi DLL unsuitable for my purposes. Why that is the case will become apparent later.

Flow of control and data in the Windows IKEv2 VPN implementation

The RasDial routine is the entry point for the flow of control for the establishment of a VPN connection and %APPDATA%\Microsoft\Network\Connections\Pbk\rasphone.pbk is the source of the configuration data for the connection. RasDial communicates with the “RasMan” (Remote Access Connection Manager) service via RPC and most of the operating system specific “action” takes place in that service; RasMan communicates with the IKEEXT (IKE and AuthIP IPsec Keying Modules) service for IKE/IKEv2/AuthIP protocol specific functionality.

When the IKEEXT service starts up, it uses the undocumented WFP routine IPsecKeyModuleAdd0 to register itself as the keying module provider for IKE, IKEv2 and AuthIP; the registration includes call-back routines to acquire and expire Security Associations. When RasMan needs the services of a registered keying module, it uses the undocumented routine IPsecSaInitiateAsync0 to initiate the behaviour (by invoking one of the registered call-back routines) and the keying module uses the routine IPsecKeyModuleUpdateAcquire0 to feed back progress. IPsecKeyModuleDelete0 rounds out the small set of keying module routines.

Some information is passed to the keying module in the parameters of the call-back routine, but this information is largely references to information stored dynamically in WFP in the form of “provider contexts”. These “provider contexts” are created from information specified in the mainModePolicy, tunnelPolicy and keyModKey arguments to the FwpmIPsecTunnelAdd2 routine:

DWORD FwpmIPsecTunnelAdd2(
    HANDLE engineHandle,
    UINT32 flags,
    const FWPM_PROVIDER_CONTEXT2 *mainModePolicy,
    const FWPM_PROVIDER_CONTEXT2 *tunnelPolicy,
    UINT32 numFilterConditions,
    const FWPM_FILTER_CONDITION0 *filterConditions,
    const GUID *keyModKey,
    PSECURITY_DESCRIPTOR sd
);

The routines exported by vpnikeapi (but implemented by vpnike.dll in the RasMan service) are only available for use when a VPN establishment has been initiated by RasDial or similar. rasphone.pbk does not directly define the keying module GUID to be used; instead, it indicates via VpnStrategy which type of VPN is preferred, and RasMan chooses the appropriate keying module GUID. So while it would be easy to register a parallel/alternative IKEv2 keying module, it would be difficult to route any work to it. To make the situation even clearer, vpnike contains an explicit test that its client is IKEEXT (via the IKEEXT service SID), so even if one did register a new keying module and managed to persuade RasMan to use it and initialize the data structures needed for vpnikeapi use, the new keying module would fail the security tests performed by the vpnike DLL.

Short aside: VPN Connection IPsec Configuration

By default, the Windows IKEv2 VPN client makes 18 Main Mode Security Association (SA) Proposals and 6 Quick Mode SA proposals. This default list of proposals can be strengthened and shortened by registry entries (such as NegotiateDH2048_AES256) or set (and limited) to one proposal by use of the Set-VpnConnectionIPsecConfiguration PowerShell cmdlet.

The flexibility of SA Proposals and Transforms in IKEv2 means that often two proposals (one proposal with GCM/CCM mode cipher algorithms and one with CBC mode cipher algorithms) are adequate. However, the WFP data structures used to communicate this information between RasMan and IKEEXT were probably designed before IKEv2 was standardized and do not support its flexibility; they are better suited to the IKE proposal model.

The format of proposals stored in rasphone.pbk exhibits the same problem (one transform of each type per proposal). They are stored as a sequence of serialized ROUTER_CUSTOM_IKEv2_POLICY_0 structures with the name “CustomIPSecPolicies”; “NumCustomPolicy” records how many proposals there are. Set-VpnConnectionIPsecConfiguration always creates a single custom IPsec policy, but manual editing of rasphone.pbk can be used to add more.
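As a rough illustration of the storage format described above, the following Python sketch reads “NumCustomPolicy” and “CustomIPSecPolicies” from a phonebook entry and splits the data into individual policies. Several things here are assumptions rather than facts taken from this article: that the value is stored as a hex string, that each ROUTER_CUSTOM_IKEv2_POLICY_0 is six little-endian DWORDs in the order listed in POLICY_FIELDS, and the sample entry itself (which is invented and carries arbitrary placeholder values).

```python
import configparser
import struct

# Hypothetical phonebook entry for illustration only; real entries contain many
# more keys, and the hex encoding of CustomIPSecPolicies is an assumption.
SAMPLE_PBK = """\
[Example IKEv2]
NumCustomPolicy=1
CustomIPSecPolicies=020000000600000000000000060000000400000004000000
"""

# Assumed field order of one serialized policy structure.
POLICY_FIELDS = (
    "IntegrityMethod", "EncryptionMethod", "CipherTransformConstant",
    "AuthTransformConstant", "PfsGroup", "DhGroup",
)
POLICY_STRUCT = struct.Struct("<6I")  # six little-endian DWORDs, 24 bytes


def read_custom_policies(pbk_text, entry):
    """Return the custom IPsec policies of one phonebook entry as dicts."""
    # strict=False tolerates the repeated keys found in real phonebook files.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.optionxform = str  # keep key names case-sensitive
    parser.read_string(pbk_text)
    section = parser[entry]
    count = section.getint("NumCustomPolicy", fallback=0)
    blob = bytes.fromhex(section.get("CustomIPSecPolicies", fallback=""))
    if len(blob) != count * POLICY_STRUCT.size:
        raise ValueError("policy data length does not match NumCustomPolicy")
    return [
        dict(zip(POLICY_FIELDS,
                 POLICY_STRUCT.unpack_from(blob, i * POLICY_STRUCT.size)))
        for i in range(count)
    ]
```

Calling read_custom_policies(SAMPLE_PBK, "Example IKEv2") yields a list with one dictionary per proposal, which makes the “one transform of each type per proposal” limitation easy to see.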

How to create a Network Interface?

A picture in the RAS Architecture Overview shows the search area for a solution to the problem of creating a network interface to “front up” (present to the operating system) the ESP secured (VPN) connection that I hoped to establish.

The Windows 10 IKEv2 client is sometimes known as the “Agile VPN” client and there is an agilevpn.sys WAN Miniport Driver which will probably be needed by any solution built from existing components. The diagram shows two potential user-mode interfaces that might provide the required functionality: RAS and TAPI (the Telephony API). However, I could not find any set of documented RAS or TAPI routines that could make any effective contribution to solving the problem; the routines that RasMan uses are not exported from any DLL (they are internal to vpnike.dll, rasmans.dll, etc.).

The next step was to assume that the interface between the user-mode abstractions and the kernel-mode functionality was implemented with Device I/O Controls (IOCTLs). Tracing control and data flows across this boundary during VPN connection establishment suggested that IOCTLs to four different device drivers would be necessary:

\Device\NDProxy

\Device\NdisWan

\Device\AgileVPN

\Device\WANARP

This turned out to be almost correct, but one thing further was needed to complete the task: a call to the undocumented routine “NsiSetAllParameters”.

The security descriptors on the NDProxy and AgileVPN devices permit only access by SYSTEM; NdisWan grants access to SYSTEM and NETWORK SERVICE and WANARP grants some access to “Authenticated Users”.

NDProxy

As suggested by the architecture overview diagram, and in practice, NDProxy is the first point of contact.

After opening the device, one can issue a “connect” IOCTL, which returns the number of devices (numbered from zero to N - 1) which can be used via NDProxy; this is typically 5: the WAN Miniports SSTP, IKEv2, L2TP, PPTP and PPPoE. One can then iterate over these miniports, looking for one with an NDIS_WAN_MEDIUM_SUBTYPE of NdisWanMediumAgileVPN (i.e. IKEv2).

Having identified and opened the miniport, the next step is to make a “call” on it via a “query info” IOCTL with OID_TAPI_MAKE_CALL. The type of information needed to make the call includes the IP address of the VPN server, the network interface index via which the VPN server is reachable (the GetBestInterfaceEx API routine for the VPN server address returns the needed value) and the “tunnel ID”.

The “tunnel ID” is a 64 bit number that can be freely chosen (it should uniquely identify the tunnel, but normally there will be only one tunnel). It is used to link the network interface to the BFE SA Contexts (using the IPSEC_VIRTUAL_IF_TUNNEL_INFO0 structure).

In general, TAPI call establishment can take some time and RasMan uses asynchronous I/O to perform the call. Synchronous I/O works too and is OK for test purposes. Even with synchronous I/O, the call is not necessarily established when the I/O completes. There is an event reporting mechanism (line events) which reports changes in call state, but polling the call state (waiting for LINECALLSTATE_CONNECTED) works too.
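The polling alternative can be sketched in a few lines of Python. This is only an outline of the idea, not the code used here: get_call_state is a hypothetical callable standing in for whatever IOCTL wrapper returns the current call state, and the time-out and interval values are arbitrary. The two LINECALLSTATE_* values are the ones defined in tapi.h.

```python
import time

LINECALLSTATE_CONNECTED = 0x00000100     # value from tapi.h
LINECALLSTATE_DISCONNECTED = 0x00004000  # value from tapi.h


def wait_for_connected(get_call_state, timeout=10.0, interval=0.1):
    """Poll get_call_state() until the call reports LINECALLSTATE_CONNECTED.

    get_call_state is a hypothetical callable; in a real client it would
    issue the IOCTL that queries the call state via NDProxy.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = get_call_state()
        if state == LINECALLSTATE_CONNECTED:
            return
        if state == LINECALLSTATE_DISCONNECTED:
            raise ConnectionError("call was disconnected during set-up")
        if time.monotonic() >= deadline:
            raise TimeoutError("call did not reach LINECALLSTATE_CONNECTED")
        time.sleep(interval)
```

Treating a disconnected state as an immediate failure (rather than waiting for the time-out) is a design choice of this sketch.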

Once the call has been established, an IOCTL with OID_TAPI_GET_ID can be used to retrieve the (connection) ID which is needed for a later step. This completes the initial interaction with NDProxy; nothing more needs to be done until the VPN connection is terminated, when one should drop (OID_TAPI_DROP) and close (OID_TAPI_CLOSE_CALL) the call and close the device (OID_TAPI_CLOSE).

AgileVPN

The call initiated via NDProxy triggers call set-up in AgileVPN, so AgileVPN is then ready for additional IKEv2 specific set-up – the first of which is informing AgileVPN of the IKEv2 Traffic Selectors.

The second, and final, IOCTL directly to AgileVPN is the command to create the VPN tunnel.

WANARP

A new network interface is still not visible at this point. One first has to issue an IOCTL to WANARP to obtain a network interface LUID for an interface of type IF_TYPE_PPP.

When the VPN connection is terminated and the network interface taken down, the LUID allocation should be freed (via another IOCTL).

NdisWan

Before performing the “activate route” IOCTL on NdisWan, it is necessary to choose a GUID for the network interface and associate this with a network compartment – this can be done via the undocumented NsiSetAllParameters routine.

The first IOCTL to NdisWan is used to map the connection ID obtained from NDProxy to a “bundle handle”. Once one has the bundle handle, one can activate the network interface and route to the network interface. In addition to the bundle handle, the IOCTL that activates the route requires the tunnel ID, the LUID obtained from WANARP, a name for the network interface, the GUID for the network interface and the IP address assigned by the VPN server to this client.

When the VPN connection is terminated and the network interface taken down, the route should be deactivated (via another IOCTL).
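The order of the set-up steps across the four drivers, and the matching clean-up, can be summarized with a small Python sketch. Nothing here talks to a driver: each step merely records a label, and the labels are my own shorthand for the IOCTLs described above. Undoing the steps in strict reverse order is an assumption of the sketch.

```python
from contextlib import ExitStack


def bring_up_interface(log):
    """Record the set-up steps in `log`; return a stack that undoes them.

    Every step only appends a label to `log`. Real code would issue the
    corresponding IOCTL (or NsiSetAllParameters call) instead.
    """
    with ExitStack() as stack:
        def step(setup, teardown=None):
            log.append(setup)
            if teardown is not None:
                # Callbacks run in reverse order of registration.
                stack.callback(log.append, teardown)

        step("NDProxy: open device and select the AgileVPN miniport",
             "NDProxy: OID_TAPI_CLOSE")
        step("NDProxy: OID_TAPI_MAKE_CALL", "NDProxy: OID_TAPI_CLOSE_CALL")
        step("NDProxy: wait for LINECALLSTATE_CONNECTED",
             "NDProxy: OID_TAPI_DROP")
        step("NDProxy: OID_TAPI_GET_ID")
        step("AgileVPN: set traffic selectors")
        step("AgileVPN: create tunnel")
        step("WANARP: allocate interface LUID", "WANARP: free interface LUID")
        step("NSI: NsiSetAllParameters (interface GUID and compartment)")
        step("NdisWan: map connection ID to bundle handle")
        step("NdisWan: activate route", "NdisWan: deactivate route")
        # Hand the registered clean-up over to the caller; if any step had
        # raised, the steps completed so far would have been undone here.
        return stack.pop_all()
```

A caller keeps the returned stack for the lifetime of the VPN connection and calls its close() method to take the interface down.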

Finishing touches

Despite providing the assigned IP address to NdisWan, the network interface comes up with a link-local IP address (169.254.X.X). The correct address can be assigned via a call to the IP Helper API routine CreateUnicastIpAddressEntry.

The network interface comes up with a few routes preassigned (the VPN server, broadcast and multicast addresses), but it is useful to manually add either a route to the network reachable via the VPN or a default route via the VPN. This can be done with the IP Helper API routine CreateIpForwardEntry2.

Windows Filtering Platform / Base Filtering Engine

The FwpmIPsecTunnelAdd routine is just a convenience – its functionality can be mimicked by several calls to other WFP routines. For the purposes of this application, most of its parameters can be null/zero; only the engineHandle, flags and tunnelPolicy need specific values: a valid engineHandle, a flags value of FWPM_TUNNEL_FLAG_ENABLE_VIRTUAL_IF_TUNNELING and a dummy IKEv2 Quick Mode Tunnel provider context. The provider context must include (at least) one IPsec proposal – although this is not used (it need not correspond to any actually offered or negotiated proposal).

Only a few other calls to WFP routines are needed. Calls to IPsecSaContextCreate1 and IPsecSaContextGetSpi1 create a Security Association (SA) context and retrieve the SPI (Security Parameter Index). Once keying material has been negotiated/derived, calls to IPsecSaContextAddInbound1 and IPsecSaContextAddOutbound1 make this information available for use by ESP (Encapsulating Security Payload).

Implementation statistics

The entire implementation of a “working” VPN is less than 7000 lines of C#, of which around 1700 lines are just definitions of WFP structures and over 300 lines are just definitions of Diffie-Hellman primes and curves. The IKEv2 implementation is very “bare bones”, with plenty of missing functionality (such as rekeying).

Monday, 28 October 2019

Maximum number of VPN Incoming Connections under Windows 10

As a client rather than server version of Windows, there are software licensing limits on the use of Windows 10 as a VPN “server”. This article however focuses solely on the technical limitations on the number of concurrent “Incoming Connections”.

The plural form of the name “Incoming Connections” hints that more than one (hopefully concurrent) connection is possible and the initial configuration allows a pool of IP addresses to be reserved for assignment to VPN clients. If a pool is configured (typically more useful than the DHCP or “client chooses” options) then the size of the pool will be a limiting factor on the maximum number of concurrent connections.

The VPN server needs an IP address from the pool, as does each client, so the minimum pool size for one connection is two IP addresses. If there are not enough IP addresses available in the pool for a new client, then an error such as ERROR_NO_IP_ADDRESSES or ERROR_IPSEC_IKE_INNER_IP_ASSIGNMENT_FAILURE (depending on the VPN protocol being used) will be returned to the client. In the case of an IKEv2 VPN, a Notify payload carrying INTERNAL_ADDRESS_FAILURE informs the client about the nature of the problem.

Another factor that affects the maximum number of connections is the maximum number of WAN Miniport ports. The command “netsh ras show wanports” shows the maximum number of ports for each type of WAN Miniport (e.g. SSTP, IKEv2, L2TP, PPTP) and, by default, the value is 2. If one could arrange that there were two clients of each type, this would allow up to eight concurrent connections.

Section 2.2.3.3.1 of [MS-RRASM] (Routing and Remote Access Server Management Protocol) describes the registry storage of the WAN Miniport configuration and two values are of particular interest: MaxWanEndpoints and WanEndpoints. The values shown by “netsh ras show wanports” actually correspond to WanEndpoints (“the number of endpoints or ports that the device type is configured with”); MaxWanEndpoints (“the maximum number of endpoints or ports that the device type can support”) has the value 3.

A command like “netsh ras set wanports device=“WAN Miniport (IKEv2)” maxports=3” can be used to increase the maximum number of connections of a particular type to three (a machine restart is needed before the new value takes effect). If a new client would cause the maximum number of ports to be exceeded then the server stops processing the request (perhaps waiting for a port to become free) and the client's request quickly times out (after about 5 seconds) with an error code of ERROR_IPSEC_IKE_TIMED_OUT.

Point-to-Point Protocol (PPP)

Three of the four VPN tunnel types supported by Windows carry PPP; the three types that use PPP are SSTP, L2TP and PPTP. The PPP implementation in Windows 10 limits the total number of concurrent PPP sessions to one. If a new client would cause the number of concurrent PPP sessions to exceed one then the server terminates the connection with error code ERROR_USER_LIMIT.

Some PPP clients tear down (terminate in an orderly fashion) the VPN connection when disconnecting whereas others just abruptly stop communicating with the VPN server. In the latter case, a “zombie” PPP session remains on the server until a time-out causes it to be cleaned up. During this period (which typically lasts for a few minutes), no new PPP connections can be established.

IKEv2

IKEv2 does not use PPP and the number of concurrent IKEv2 connections is limited by the WAN Miniport and IP address pool factors. However, an out-of-the-box version of Windows 10 does not accept any IKEv2 connections since there are no allowable authentication mechanisms. The default value of ServerFlags ([MS-RRASM], section 2.2.3.4.6) disables both EAP and certificate authentication. Both can be enabled, but Windows 10 is missing the “EAP Host Authenticator” component, which is needed for EAP authentication.

Summary

For most users, the maximum number of VPN Incoming Connections is one. With appropriate configuration, a maximum number of four concurrent VPN Incoming Connections can be obtained (one SSTP/L2TP/PPTP connection and three IKEv2 connections with certificate authentication).
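The way the three limiting factors combine can be written down as a small Python function. It is only a restatement of the arithmetic in this article; the function and parameter names are my own.

```python
PPP_TYPES = ("SSTP", "L2TP", "PPTP")
PPP_SESSION_LIMIT = 1  # Windows 10 allows one concurrent PPP session in total


def max_incoming_connections(pool_size, wan_ports, ikev2_usable):
    """Upper bound on the number of concurrent Incoming Connections.

    pool_size    -- number of addresses in the configured pool
                    (the server itself takes one of them)
    wan_ports    -- mapping of tunnel type to its WanEndpoints value
    ikev2_usable -- True once an IKEv2 authentication method is enabled
    """
    ppp = min(PPP_SESSION_LIMIT, sum(wan_ports.get(t, 0) for t in PPP_TYPES))
    ikev2 = wan_ports.get("IKEv2", 0) if ikev2_usable else 0
    by_address = max(pool_size - 1, 0)
    return min(ppp + ikev2, by_address)


defaults = {"SSTP": 2, "IKEv2": 2, "L2TP": 2, "PPTP": 2}
tuned = dict(defaults, IKEv2=3)

out_of_the_box = max_incoming_connections(10, defaults, ikev2_usable=False)
best_case = max_incoming_connections(10, tuned, ikev2_usable=True)
```

With a pool of ten addresses, out_of_the_box evaluates to 1 and best_case to 4, matching the figures above; shrinking the pool to two addresses brings the result back down to one regardless of the other settings.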

If these restrictions are too severe for the intended usage, one can install third-party VPN server software under Windows 10 or add a low-cost device (such as a Raspberry Pi) to the network configured to act as a VPN server.

Establishing a VPN connection from iOS to Windows 10

After experimenting with the built-in VPN functionality on a “hand-me-down” iPad running the latest/last iOS version for the hardware (12.4.2), there were enough things that were not immediately obvious to warrant a short note.

The iPad offers 3 “types” of VPN:

L2TP
IPSec
IKEv2

The first difficulty is the nomenclature of the types. “L2TP” (Layer 2 Tunneling Protocol, RFC 2661) can be used standalone, but is normally used in conjunction with IPsec and referred to as “L2TP/IPsec”. Indeed, the first option is actually L2TP/IPsec (with pre-shared key phase 1 authentication) and not plain L2TP. That naming brevity alone would not be so confusing if it were not for the fact that another option is “IPsec”.

The “IPsec” configuration shows a Cisco banner and allows a Cisco IPsec with XAUTH VPN (XAUTHInitPreShared or XAUTHInitRSA, https://tools.ietf.org/html/draft-beaulieu-ike-xauth-02) to be configured. Windows 10 does not natively support Cisco XAUTH VPNs.

Using L2TP over IPsec

The iPad L2TP/IPsec client sends 14 phase 1 transform proposals, at least one of which is acceptable to Windows 10. The authentication method is always “pre-shared key”, the encryption algorithms are AES-CBC-256, AES-CBC-128 and 3DES-CBC, the hash algorithms are SHA2-512, SHA2-256, SHA and MD5 and the group descriptions are 2048-bit MODP group, 1536-bit MODP group and alternate 1024-bit MODP group.

By default, Windows 10 chooses 3DES-CBC, SHA, and alternate 1024-bit MODP group. Windows 10 can be configured such that it chooses a more secure transform (via RasMan service parameters such as NegotiateDH2048 and NegotiateDH2048_AES256).

The iPad L2TP/IPsec client sends 6 phase 2 transform proposals, at least one of which is acceptable to Windows 10. The encryption algorithms are AES-CBC-256, AES-CBC-128 and 3DES-CBC, the HMAC algorithms are HMAC-SHA1-96 and HMAC-MD5-96.

By default, Windows 10 chooses AES-CBC-256 and HMAC-SHA1-96.

Shared Secret

As mentioned in an earlier article on the Windows 10 and macOS VPN combination, there is no straightforward user interface in Windows 10 that enables an L2TP/IPsec pre-shared key to be set. There is a “trick” that works or a short program can be written that calls the RRAS (Routing and Remote Access Service) API routine MprAdminInterfaceSetCredentialsEx to set the value.

Using IKEv2

When first viewing the iPad IKEv2 configuration dialog, the (required) “Remote ID” and (optional) “Local ID” are the first two items whose meaning and implication are not totally clear.

The “Remote ID” setting is used in two ways: it is used to verify that the identity asserted by the VPN server matches expectations and it is sent to the server (as an IDr Identification payload) to help the server select an identity (in case it has more than one, using redirection). A weakness of the iOS implementation is that it always seems to send the “Remote ID” with a type of “ID_FQDN” and represented as an ASCII string, although other values might be more appropriate (e.g. a type of ID_DER_ASN1_DN and a DER (Distinguished Encoding Rules) encoding of the value).

The “Local ID”, if set, is sent as the IDi Identification payload, with the same encoding problems as the “Remote ID”. If “Local ID” is not set, the local IP address is sent with the correct binary encoding and a type of ID_IPV4_ADDR or ID_IPV6_ADDR, as appropriate.
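To make the difference between the encodings concrete, here is a short Python sketch that builds the body of an Identification payload as laid out in RFC 7296, section 3.5: one octet of ID Type, three reserved octets and then the identification data. The generic payload header is left out, and the helper names are my own.

```python
import ipaddress
import struct

# ID Type values from RFC 7296, section 3.5
ID_IPV4_ADDR = 1
ID_FQDN = 2
ID_IPV6_ADDR = 5
ID_DER_ASN1_DN = 9


def id_payload_body(id_type, data):
    """ID Type (1 octet), RESERVED (3 zero octets), identification data."""
    return struct.pack("!B3x", id_type) + data


def fqdn_id(name):
    """What iOS appears to send for a configured Remote ID or Local ID."""
    return id_payload_body(ID_FQDN, name.encode("ascii"))


def address_id(address):
    """Binary address encoding, as sent when no Local ID is configured."""
    ip = ipaddress.ip_address(address)
    id_type = ID_IPV4_ADDR if ip.version == 4 else ID_IPV6_ADDR
    return id_payload_body(id_type, ip.packed)
```

A distinguished name would instead use ID_DER_ASN1_DN with the DER encoding of the name as the data, which is not something that can be expressed by typing a string into an ID_FQDN field.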

No Vendor ID payloads are sent and the following Notify payloads are sent:

First message → NAT_DETECTION_SOURCE_IP, NAT_DETECTION_DESTINATION_IP, IKEV2_FRAGMENTATION_SUPPORTED and REDIRECT_SUPPORTED

Second message → INITIAL_CONTACT, MOBIKE_SUPPORTED, ESP_TFC_PADDING_NOT_SUPPORTED, NON_FIRST_FRAGMENTS_ALSO

Authentication

Similar to macOS, iOS offers 3 options: None (IKEv2 non-EAP authentication), Username (EAP-MSCHAPv2) and Certificate (EAP-TLS).

The EAP mechanisms require a replacement for missing functionality on Windows 10 (as discussed in the macOS VPN blog entry) but otherwise work – with one quirk. If one chooses Username authentication and leaves the Password field blank in the configuration (implying “Ask Every Time”) then the VPN connection attempt always fails; a dialog enabling entry of a password does pop up, but the entered password does not seem to be used (the connection just times out, without ever sending the MSCHAPv2 response and the iPad console log just reports “Failed to process IKE Auth (EAP) packet (connect)”).

The non-EAP IKEv2 authentication methods include RSA Digital Signature, Shared Key Message Integrity Code and ECDSA with SHA-256 on the P-256 curve (plus the SHA-384/P-384 and SHA-512/P-521 equivalents). Via the configuration user interface, one can choose between certificate and pre-shared key methods. However, if the certificate method is chosen, the iOS client always indicates that the RSA Digital Signature method has been chosen, regardless of the type of certificate. The iOS Security Guide explicitly states: “IKEv2/IPSec with authentication by shared secret, RSA Certificates, ECDSA Certificates, EAP-MSCHAPv2, or EAP-TLS”.

For unknown reasons, iOS does not use the type of the configured certificate to set the authentication method, but instead uses a configuration setting that is not visible in the user interface (and which defaults to RSA). If one is prepared to create and install a Device Management Profile (a “.mobileconfig” file), one can set CertificateType to one of RSA, ECDSA256, ECDSA384, ECDSA521, or Ed25519 (https://developer.apple.com/documentation/devicemanagement/vpn/ikev2).

Security Association Proposals

By default, the iPad IKEv2 client sends 5 phase 1 transform proposals. The encryption algorithms include AES-CBC-256, AES-CBC-128 and 3DES-CBC, the hash algorithms include SHA2-256 and SHA and the Diffie-Hellman groups are 2048-bit MODP group, 256-bit random ECP group, 1536-bit MODP group and 1024-bit MODP group.

The iPad IKEv2 client sends 5 phase 2 transform proposals. The encryption algorithms include AES-CBC-256, AES-CBC-128 and 3DES-CBC, the hash algorithms include SHA2-256 and SHA.

It is possible to specify transforms via a Device Management Profile. The documentation claims support for the following:

Encryption algorithms: DES, 3DES, AES-128, AES-256, AES-128-GCM, AES-256-GCM, ChaCha20Poly1305

Integrity algorithms: SHA1-96, SHA1-160, SHA2-256, SHA2-384, SHA2-512

Diffie-Hellman groups: 1, 2, 5, 14, 15, 16, 17, 18, 19, 20, 21, 31

In common with the built-in Windows IKEv2 client, if one specifies a transform to be used (rather than just using the defaults), then only that one transform is sent (so one needs to be sure that the VPN server accepts it, since there are no alternative proposals).
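As a sketch of what such a profile fragment might look like, the following Python uses plistlib to emit the VPN payload dictionary with an explicit CertificateType and a single transform for each of the IKE and Child SAs. The key names are my reading of the Apple documentation referenced above and should be checked against it; the server name is a placeholder, the chosen algorithms are just an example, and a complete “.mobileconfig” needs further keys (payload identifiers, UUIDs, the certificate payload, etc.) that are omitted here.

```python
import plistlib


def ikev2_vpn_payload(server, remote_id, certificate_type="RSA"):
    """Build an (incomplete) IKEv2 VPN payload dictionary for a profile.

    Key names are assumptions based on Apple's device management
    documentation; verify them before use.
    """
    sa_parameters = {
        "EncryptionAlgorithm": "AES-256",
        "IntegrityAlgorithm": "SHA2-256",
        "DiffieHellmanGroup": 14,
    }
    return {
        "PayloadType": "com.apple.vpn.managed",
        "VPNType": "IKEv2",
        "IKEv2": {
            "RemoteAddress": server,
            "RemoteIdentifier": remote_id,
            "AuthenticationMethod": "Certificate",
            "CertificateType": certificate_type,
            "IKESecurityAssociationParameters": dict(sa_parameters),
            "ChildSecurityAssociationParameters": dict(sa_parameters),
        },
    }


# Serialize the fragment as an XML property list (bytes).
fragment = plistlib.dumps(
    ikev2_vpn_payload("vpn.example.com", "vpn.example.com", "ECDSA256")
)
```

Because only one transform of each type is given, the server must accept exactly AES-256, SHA2-256 and group 14 for the connection to succeed.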

Friday, 24 May 2019

Diagnosing VPN problems with Windows 10 VPN client

One approach to resolving a problem with a VPN set-up is to describe the problem as best as one can in a technical forum and hope that someone recognizes the situation and can make useful suggestions as to how to resolve the problem. Needless to say, this approach requires some luck and probably only works for simple problems.

Another approach is to “trace” the behaviour of the systems and either analyse the collected data or submit it for analysis by someone else. The network traffic and some of the events generated by Event Tracing for Windows providers are useful sources to trace.

Tracing network traffic

A straightforward network trace, performed on the client PC, is a good first step. This could be performed with well-known tools (typically requiring installation) like Wireshark, Microsoft's Message Analyzer and Network Monitor or with built-in commands like “netsh trace” or the NetEventPacketCapture suite of PowerShell cmdlets (the well-known tools can also interpret and display captured network trace data).

One common class of problems is data sent by the VPN client that is never received by the VPN server. Ideally, one would perform a network trace on both the client and the server. It might be the case that fragments of the data reach the receiver but, perhaps because of a lossy network, a complete message is never received, or it may be the case that the data is blocked/lost en route to the receiver. With just a network trace from one party, it is often possible to infer whether data is being blocked or lost but it is obviously better to compare the concurrent traces made by the two parties. If the problem can be easily reproduced, a pragmatic approach is to trace initially from the VPN client only and to resort to a concurrent client and server trace only if the client-only trace is inconclusive.

Tracing VPN protocol interactions

Most VPN protocols quickly switch from exchanging plaintext messages during initial connection set-up to exchanging encrypted messages. For the two IKE (Internet Key Exchange) based VPN protocols (L2TP/IPsec and IKEv2), tracing the Microsoft-Windows-RRAS, Microsoft-Windows-WFP and “IKEEXT Trace Provider” Event Tracing for Windows providers is most helpful.

In the format of a file suitable for use with the logman.exe “-pf” option, the following providers are a good starting point for the “Automatic” VPN type (where various VPN protocols are tried until one succeeds or they all fail):

"IKEEXT Trace Provider" 0xFFFFFFFF 255

Microsoft-Windows-RRAS

Microsoft-Windows-Ras-NdisWanPacketCapture

Microsoft-Windows-RasSstp

Microsoft-Windows-TCPIP

Microsoft-Windows-WFP

Microsoft-Windows-WebIO

All of these providers and a network trace as well can be started with a single command:

netsh trace start provider=Microsoft-Windows-RRAS provider=Microsoft-Windows-TCPIP provider=Microsoft-Windows-WFP provider=Microsoft-Windows-Ras-NdisWanPacketCapture provider=Microsoft-Windows-RasSstp provider=Microsoft-Windows-WebIO provider={106B464D-8043-46B1-8CB8-E92A0CD7A560} keywords=0xFFFFFFFFFFFFFFFF level=255 Ethernet.Type=(IPv4,IPv6,0) Wifi.Type=Data capture=yes report=disabled correlation=disabled overwrite=yes tracefile=vpn-prob.etl
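Since one often wants to leave out a provider (the high-volume Microsoft-Windows-TCPIP or the security sensitive Microsoft-Windows-WebIO, for example), it can be handy to assemble the command from a list. The Python sketch below only builds the argument list; the function name is my own and the options are simply those of the command above.

```python
PROVIDERS = [
    "Microsoft-Windows-RRAS",
    "Microsoft-Windows-TCPIP",
    "Microsoft-Windows-WFP",
    "Microsoft-Windows-Ras-NdisWanPacketCapture",
    "Microsoft-Windows-RasSstp",
    "Microsoft-Windows-WebIO",
]
# GUID used for the "IKEEXT Trace Provider" in the command above
IKEEXT_GUID = "{106B464D-8043-46B1-8CB8-E92A0CD7A560}"


def netsh_trace_start(providers, tracefile, with_ikeext=True):
    """Return the argument list for a 'netsh trace start' invocation."""
    args = ["netsh", "trace", "start"]
    args += [f"provider={name}" for name in providers]
    if with_ikeext:
        # keywords/level apply to the provider that precedes them
        args += [f"provider={IKEEXT_GUID}",
                 "keywords=0xFFFFFFFFFFFFFFFF", "level=255"]
    args += [
        "Ethernet.Type=(IPv4,IPv6,0)", "Wifi.Type=Data",
        "capture=yes", "report=disabled", "correlation=disabled",
        "overwrite=yes", f"tracefile={tracefile}",
    ]
    return args


# Example: everything except the high-volume TCPIP provider.
quiet = netsh_trace_start(
    [p for p in PROVIDERS if p != "Microsoft-Windows-TCPIP"], "vpn-prob.etl"
)
```

The resulting list can be passed to subprocess.run from an elevated prompt on the client PC; “netsh trace stop” ends the trace and finalizes the file.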

Trace Provider Types

There are several types of trace providers, including:

· MOF (Managed Object Format)

· WPP (Windows software trace Pre-Processor)

· Manifest

· TraceLogging

For the MOF, Manifest and TraceLogging types there is normally enough information in the trace output coupled with information included with the Windows installation to present all of the captured data in a human readable format.

For the WPP type, a “.pdb” (Program DataBase, debug symbols) file containing “TMF” or “PUBLIC_TMF” annotations is needed to interpret the trace data. “TMF” annotations are normally only available in private symbol files (i.e. symbols files only available to the developer (e.g. Microsoft)).

IKEEXT Trace Provider

The IKEEXT trace provider is a WPP provider type and difficult to interpret without the private symbols for ikeext.dll. Nonetheless, it is the most useful provider when one needs to understand IKE behaviour (for the L2TP/IPsec and IKEv2 VPN types).

For VPN connections that use a pre-shared key for authentication, the key will probably be present (in plaintext) in the trace data.

Microsoft-Windows-WFP

This is a manifest based provider for the Windows Filtering Platform. Most (if not all) of the useful information recorded by this provider is also present in an IKEEXT trace, but this trace data is more easily readable.

Microsoft-Windows-RRAS

This manifest based provider delivers useful information for all of the VPN types, as one would expect from the Routing and Remote Access Server provider.

Microsoft-Windows-Ras-NdisWanPacketCapture

This manifest based provider shows the packets that flow through the VPN in their unencrypted form. It is well suited to examining the initial data exchanges through the VPN (including DHCP) and subsequently recording which traffic uses the VPN (useful if there are any routing problems).

Microsoft-Windows-TCPIP

This manifest based provider generates a lot of low-level TCP/UDP/IP information, including routing table changes and routing decisions. Because it is a high-volume source of events, it can make the trace file large enough to hinder sharing or even analysing the trace; it should only be enabled if there are good grounds to believe that it will be useful when analysing the problem.

Microsoft-Pef-WFP-MessageProvider and Microsoft-Windows-NDIS-PacketCapture

These manifest based providers record the raw network packets on the local network. They can be useful for the very first (unencrypted) message exchanges during the establishment of a VPN connection and diagnosing IP fragmentation/reassembly issues but since these providers cannot be simply started and stopped (Microsoft-Windows-NDIS-PacketCapture, for example, requires rebinding of network stacks), one should have good grounds for using them (similar to the Microsoft-Windows-TCPIP provider).

Microsoft-Windows-RasSstp

This manifest based provider is only useful for an SSTP VPN connection and probably logs at most one event per connection attempt – a description of an error that occurred.

Microsoft-Windows-WebIO

This manifest based provider is only useful for an SSTP VPN connection and is probably the most security sensitive provider: it traces WinHttp (API) activity from all processes and records the HTTP headers (including authentication headers) – if passwords can be extracted from the headers (e.g. in the case of Basic authentication) then the passwords are compromised.

Other Providers

There are many other event providers that can be useful under particular circumstances (for example, those related to RAS, VPN, EAP, NPS, IAS, MPR, BFE, Firewall and Security). Typically, one would only search for and select such providers once one has gathered evidence that they might be useful.

General security considerations

Event tracing of VPN connections will probably reveal most of the VPN configuration data, including IP addresses and VPN user name – but not VPN passwords (as far as I am aware, unless PAP (Password Authentication Protocol) is used) or certificate private keys. As already mentioned, pre-shared keys could be compromised.

Many manifest providers that potentially generate events containing Personally Identifiable Information (PII) flag such events with a keyword like PII_PRESENT (often having a value of 0x20000000000). Examining the manifest (metadata) of a provider will show the keywords that it supports.

There is no practical method to remove security sensitive data from a trace file whilst preserving the integrity of the trace data.

Other event tracing considerations

If the rate of event generation is too high, events can be lost (i.e. not captured); there are tracing parameters that can be adjusted to increase the level of buffering (and therefore reduce the level of event loss). One should also try to keep the trace period short – reproducing the problem as quickly as possible.

The files created by event tracing can be sparsely populated (to assist performant logging) and can often be substantially compressed (perhaps to less than a tenth of the original size) – which can be useful if trace files are shared.

An understanding of VPN protocols is very useful when examining a trace file, and a very high degree of protocol and Windows experience is needed to interpret the output of VPN-related WPP trace providers.

There are many event trace providers that can provide insights into specific aspects of VPN behaviour. It is not uncommon to repeatedly trace a VPN problem, modifying the selection of trace providers (and perhaps also tracing from both client and server) in response to the analysis of earlier traces.

Update 2023

Since this blog entry was first published, a number of things have changed.

Microsoft Message Analyzer has been retired and its associated packet capture driver (Microsoft-Pef-WFP-MessageProvider) no longer passes driver code signing checks.

I became aware of PktMon. PktMon can (and, by default, does) trace both the VPN protocol packets and the encapsulated data carried by the VPN protocol (the same data as captured by Microsoft-Windows-Ras-NdisWanPacketCapture). The PktMon event provider Microsoft-Windows-PktMon requires the pktmon.sys driver to be started and configured via IOCTLs, so Microsoft-Windows-Ras-NdisWanPacketCapture is still useful when a “pure” ETW trace controller (such as WPR or logman) is being used to control tracing.

A minor bug was introduced into IKEEXT.dll, which I reported thus:

An exception can occur in IkeGetUdpEncapIfDoingNatt during L2TP or IKEv2 VPN connection establishment. A variable that is not initialized via all code paths is used in the WPP logging code at the end of the routine.

An effective workaround is to limit the trace level to level 4 (no particularly useful information is lost); I now use "IKEEXT Trace Provider" with keywords 0x10 and level 4.

A new TraceLogging provider named Microsoft.Windows.Networking.Ikeext was introduced; this provider largely mirrors the events of "IKEEXT Trace Provider" but in an easier to decode format. The GUID for this provider is consistent with the “ETW name-hashing algorithm”; for tools such as WPR, which understand and support the convention, this provider can be specified by preceding its name with a star/asterisk: *Microsoft.Windows.Networking.Ikeext
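For tools that do not support the star/asterisk convention, the GUID can be derived from the provider name. The following Python sketch illustrates the name-hashing convention as I understand it from .NET's EventSource implementation: a SHA-1 hash over a fixed namespace followed by the upper-cased name in big-endian UTF-16, with the result marked as a version 5 GUID. The function name etw_provider_guid is my own, and the namespace bytes are taken from that implementation rather than from anything stated above, so treat the result as something to verify against a tool that does support the convention.

```python
import hashlib
import uuid

# Namespace bytes of the ETW name-hashing convention (assumption: the same
# value that .NET's EventSource uses when generating a GUID from a name).
ETW_NAMESPACE = bytes.fromhex("482C2DB2C39047C887F81A15BFC130FB")


def etw_provider_guid(name: str) -> uuid.UUID:
    """Derive a provider GUID from a provider name (hypothetical helper)."""
    # Hash the namespace followed by the upper-cased name as big-endian UTF-16.
    # str.upper() matches an invariant upper-casing for ASCII provider names.
    data = ETW_NAMESPACE + name.upper().encode("utf-16-be")
    digest = bytearray(hashlib.sha1(data).digest()[:16])
    # Mark the value as a version 5 (name-based, SHA-1) GUID.
    digest[7] = (digest[7] & 0x0F) | 0x50
    # Interpret the 16 bytes using the little-endian layout of a Windows GUID.
    return uuid.UUID(bytes_le=bytes(digest))


if __name__ == "__main__":
    print(etw_provider_guid("Microsoft.Windows.Networking.Ikeext"))
```

Because the name is upper-cased before hashing, the derived GUID does not depend on the case used when typing the provider name.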

The IKEEXT trace providers "IKEEXT Trace Provider" and Microsoft.Windows.Networking.Ikeext can log plaintext versions of IKEv1 and IKEv2 packets ("IKEEXT Trace Provider" logs the full packet and Microsoft.Windows.Networking.Ikeext logs the first 511 bytes). The REG_DWORD registry value “TestFlags” under key “HKLM\SYSTEM\CurrentControlSet\Services\IKEEXT\Parameters” enables this logging when set to the value 0x20. The key is monitored by IKEEXT for changes, so there is no need to restart IKEEXT after modifying the value.

The packets are logged as hex strings; when decoded to binary and prepended with a dummy IP/UDP header, they can be viewed in tools such as Wireshark and Microsoft Message Analyzer (if you still have a working copy). Received IKE packets can be usefully displayed “as is” whereas sent IKE packets benefit from a few small “tweaks” before display (setting the Length field in the IKE header, reversing the bytes of the Message ID in the IKE header, setting the Payload length of the IKEv2 Encrypted payload to cover just the length of the initialisation vector, clearing the encryption flag in an IKEv1 header).
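The decoding and header step can be sketched in Python as follows. This is a minimal illustration, not the tooling that I actually use: the function names, the dummy addresses (from the 192.0.2.0/24 documentation range) and the use of UDP port 500 are assumptions, and tweak_sent applies only the first two of the tweaks listed above (Message ID and Length). The output is a classic pcap file with a raw IP link type, which Wireshark can open.

```python
import struct


def ipv4_checksum(header: bytes) -> int:
    """One's-complement checksum over the 16-bit words of an IPv4 header."""
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def tweak_sent(ike: bytes) -> bytes:
    """Apply the Message ID and Length tweaks to a logged *sent* IKE message."""
    msg = bytearray(ike)
    msg[20:24] = msg[20:24][::-1]             # reverse the Message ID bytes
    msg[24:28] = struct.pack("!I", len(msg))  # set the IKE header Length field
    return bytes(msg)


def wrap_ike(ike: bytes, src="192.0.2.1", dst="192.0.2.2", port=500) -> bytes:
    """Prepend dummy IPv4 and UDP headers to a raw IKE message."""
    # UDP header: source port, destination port, length, checksum (0 = none).
    udp = struct.pack("!HHHH", port, port, 8 + len(ike), 0)
    addrs = bytes(int(p) for p in src.split(".")) + bytes(int(p) for p in dst.split("."))
    # IPv4 header: version/IHL, TOS, total length, ID, flags/fragment, TTL,
    # protocol 17 (UDP), checksum placeholder, then the two addresses.
    ip = struct.pack("!BBHHHBBH", 0x45, 0, 20 + len(udp) + len(ike), 0, 0, 64, 17, 0) + addrs
    ip = ip[:10] + struct.pack("!H", ipv4_checksum(ip)) + ip[12:]
    return ip + udp + ike


def write_pcap(path: str, packets) -> None:
    """Write the packets to a classic pcap file with link type LINKTYPE_RAW."""
    with open(path, "wb") as f:
        # Global header: magic, version 2.4, time zone, sigfigs, snaplen, link type 101.
        f.write(struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 101))
        for n, pkt in enumerate(packets):
            # Record header: dummy timestamp (packet index), captured and original length.
            f.write(struct.pack("<IIII", n, 0, len(pkt), len(pkt)))
            f.write(pkt)
```

A logged hex string would be converted with bytes.fromhex(...), passed through tweak_sent if it is a sent packet, wrapped with wrap_ike and then written with write_pcap.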

Sunday, 12 May 2019

Setting a pre-shared key for an L2TP over IPsec Incoming Connection

Windows 10 (i.e. a non-server version of Windows) can act as a VPN server. However, not all of the configuration options available to a Routing and Remote Access Server are available via a built-in user interface – in particular the option to set a pre-shared key for incoming L2TP over IPsec connections.

The underlying functionality and API to set and use a pre-shared key is present and is simple to use, although the API documentation is rather brief. The main routine is MprAdminInterfaceSetCredentialsEx and this need only be preceded by a call to MprAdminServerConnect and (strictly speaking) eventually be followed by a call to MprAdminServerDisconnect.

The documentation on docs.microsoft.com for MprAdminInterfaceSetCredentialsEx only mentions a pre-shared key in the description of its third parameter (“dwLevel”): “A value of 1 indicates the information is a pre-shared key for the interface”. The Microsoft Open Specification document “[MS-RRASM]: Routing and Remote Access Server (RRAS) Management Protocol” contains additional (and useful) information. For example, in section “3.1.4.41 RRouterInterfaceSetCredentialsEx”, it mentions that: “If dwLevel is 0x0000002 and hInterface is NULL, the preshared key is used for L2TP”.

The documentation of the companion routine MprAdminInterfaceGetCredentialsEx mentions that: “A value of 1 indicates the information is a pre-shared key for the interface, which is in an encrypted format”. Empirically, the routine always seems to return the string “****************” (16 asterisks) – which corresponds with how a pre-shared key is displayed in the Routing and Remote Access Properties page Security tab.

The L2TP pre-shared key set by MprAdminInterfaceSetCredentialsEx is persisted in the registry as an LSA secret at HKLM\Security\Policy\Secrets\L$_RasServerCredentials#0.

A C# code extract demonstrating the process is shown below.

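// Connect to the RRAS service on the local machine (null server name).
if (MprAdminServerConnect(null, out IntPtr mpr) == 0)
{
    string key = args[0];
    // The key is passed as ANSI bytes; Size equals key.Length only for single-byte characters.
    MPR_CREDENTIALSEX_1 creds = new MPR_CREDENTIALSEX_1 { Size = key.Length, CredentialsInfo = Marshal.StringToHGlobalAnsi(key) };
    // dwLevel 2 with a NULL interface handle: the pre-shared key is used for L2TP.
    var rc = MprAdminInterfaceSetCredentialsEx(mpr, IntPtr.Zero, 2, creds);
    if (rc != 0) Console.Error.WriteLine($"MprAdminInterfaceSetCredentialsEx failed: {rc}");
    // Release the unmanaged copy of the key.
    Marshal.FreeHGlobal(creds.CredentialsInfo);
    MprAdminServerDisconnect(mpr);
}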

In a “live” system, the L2TP pre-shared key is an element in a providerContext in the Windows Filtering Platform (WFP) Base Filtering Engine (BFE) configuration. More precisely, it is a presharedKeyAuthentication sub-element of the FWPM_IPSEC_IKE_MM_CONTEXT providerContext named "L2TP Main Mode Policy". The complete BFE configuration can be viewed with the command “netsh wfp show state”.

Setting a pre-shared key via the WFP API

It is possible to set a pre-shared key on a “temporary” basis (until the next reboot or the end of the WFP session) using the WFP API.

A recipe for doing this is to search for the FWPM_IPSEC_IKE_MM_CONTEXT providerContext named "L2TP Main Mode Policy" and then add a slightly modified copy, changing the providerContextKey to a new (unique) value, optionally changing the displayData to something appropriate and changing/extending the ikeMmPolicy to contain a presharedKeyAuthentication method (with the desired pre-shared key value). A new FWPM_LAYER_IKEEXT_V4 filter must also be added to utilise the new providerContext. A C# code extract demonstrating the process is shown below.

FWPM_SESSION0 session = new FWPM_SESSION0();
session.DisplayData.Name = "Gary";
// Dynamic session: objects added during the session are removed automatically when it ends.
session.Flags = FWPM_SESSION_FLAG_DYNAMIC;
if (FwpmEngineOpen0(null, RPC_C_AUTHN_DEFAULT, null, session, out IntPtr engine) == 0)
{
    var x = new FWPM_PROVIDER_CONTEXT_ENUM_TEMPLATE0 { ProviderContextType = FWPM_PROVIDER_CONTEXT_TYPE.FWPM_IPSEC_IKE_MM_CONTEXT };
    if (FwpmProviderContextCreateEnumHandle0(engine, x, out IntPtr h) == 0)
    {
        if (FwpmProviderContextEnum2(engine, h, 2, out IntPtr entries, out uint m) == 0)
        {
            for (uint i = 0; i < m; i++)
            {
                FWPM_PROVIDER_CONTEXT2 ctx = Marshal.PtrToStructure<FWPM_PROVIDER_CONTEXT2>(Marshal.ReadIntPtr(entries, (int)i * IntPtr.Size));
                Console.WriteLine(ctx.DisplayData.Name);
                if (ctx.DisplayData.Name == "L2TP Main Mode Policy")
                {
                    // Copy the existing authentication methods and append a pre-shared key method.
                    uint n = ctx.Policy.IkeMmPolicy->NumAuthenticationMethods;
                    IKEEXT_AUTHENTICATION_METHOD2* methods = stackalloc IKEEXT_AUTHENTICATION_METHOD2[(int)n + 1];
                    for (uint j = 0; j < n; j++) methods[j] = ctx.Policy.IkeMmPolicy->AuthenticationMethods[j];
                    string preshared = args[0];
                    methods[n].AuthenticationMethodType = IKEEXT_AUTHENTICATION_METHOD_TYPE.IKEEXT_PRESHARED_KEY;
                    methods[n].PresharedKeyAuthentication.PresharedKey.Size = (uint)preshared.Length;
                    methods[n].PresharedKeyAuthentication.PresharedKey.Data = (byte*)Marshal.StringToHGlobalAnsi(preshared);
                    // Give the copy a new key and name so that it can be added alongside the original.
                    ctx.ProviderContextKey = Guid.NewGuid();
                    ctx.DisplayData.Name = "Gary Context";
                    ctx.Policy.IkeMmPolicy->NumAuthenticationMethods = n + 1;
                    ctx.Policy.IkeMmPolicy->AuthenticationMethods = methods;
                    if (FwpmProviderContextAdd2(engine, ctx, null, out ulong _) == 0)
                    {
                        ulong weight = 1;
                        FWPM_FILTER0 filter = new FWPM_FILTER0();
                        filter.DisplayData.Name = "Gary Filter";
                        filter.Flags = FWPM_FILTER_FLAG_HAS_PROVIDER_CONTEXT;
                        filter.LayerKey = FWPM_LAYER_IKEEXT_V4;
                        filter.SubLayerKey = FWPM_SUBLAYER_UNIVERSAL;
                        filter.Weight.Type = FWP_DATA_TYPE.FWP_UINT64;
                        filter.Weight.uint64 = &weight;
                        filter.Action.Type = FWP_ACTION_TYPE.FWP_ACTION_PERMIT;
                        filter.U.ProviderContextKey = ctx.ProviderContextKey;
                        if (FwpmFilterAdd0(engine, filter, null, out ulong _) == 0)
                        {
                            // Keep the dynamic session (and so the new context and filter) alive until a key is pressed.
                            Console.Write("Waiting..."); Console.ReadKey();
                        }
                    }
                }
            }
            FwpmFreeMemory0(&entries);
        }
        FwpmProviderContextDestroyEnumHandle0(engine, h);
    }
    FwpmEngineClose0(engine);
}

The only constraint on the filter subLayerKey and weight is that the filter should have a higher effective weight than the existing FWPM_LAYER_IKEEXT_V4 filter.

Simple trick to set a pre-shared key

It is straightforward to set a pre-shared key with MprAdminInterfaceSetCredentialsEx, but it does require a custom program. There is, however, a trick using built-in tools that seems to work: define a “dummy” IPsec tunnel with “Preshared key” authentication; any tool can be used to do this (e.g. the “Windows Defender Firewall with Advanced Security” control panel applet, the “netsh advfirewall consec add rule” command or the “New-NetIPsecRule” PowerShell cmdlet).

Here are examples of creating suitable “dummy” IPsec tunnels (the endpoint value can be any address that does not apply to any potential traffic) with a shared secret of “XXX”:

netsh advfirewall consec add rule name="Dummy" endpoint1=192.168.3.1/32 endpoint2=any mode=tunnel action=requireinrequireout auth1=computerpsk auth1psk="XXX"

New-NetIPsecRule -DisplayName "Dummy" -Phase1AuthSet (New-NetIPsecPhase1AuthSet -DisplayName "Dummy" -Proposal (New-NetIPsecAuthProposal -Machine -PreSharedKey "XXX")).InstanceID -InboundSecurity Require -OutboundSecurity Require -KeyModule IKEv1 -Mode Tunnel -LocalAddress 192.168.3.1/32 -RemoteAddress Any

The pre-shared key created by this method is persisted in the registry under HKLM\SYSTEM\CurrentControlSet\Services\SharedAccess\Parameters\FirewallPolicy\Phase1AuthenticationSets.

Similar to the L2TP pre-shared key, in a “live” system the pre-shared key created by this method is a presharedKeyAuthentication sub-element of an FWPM_IPSEC_IKE_MM_CONTEXT providerContext named after the Phase1AuthenticationSet ("Dummy", in the examples above). The filter that references this providerContext has a subLayerKey of FWPM_SUBLAYER_IPSEC_TUNNEL which gives the filter a higher effective weight than the L2TP filter.

On an “out-of-the-box” system, the ikeProposals in the L2TP providerContext and the new providerContext are usually identical; if testing (or a review of the “netsh wfp show state” output) indicates discrepancies then the commands to create the “Dummy” tunnel can be extended to compensate for the differences.