I have previously written about the difficulties of implementing a Cisco AnyConnect VPN client using just the Extensible Authentication Protocol (EAP) framework and interfaces in Windows 10: most of what needs to be done to enable the establishment of VPN connections to an AnyConnect server can be implemented as an EAP authentication mechanism, but some requirements cannot be fulfilled within this framework. These requirements are: use of IKEv2 Vendor ID payloads, control of the IDi Identification payload and access to IKEv2 messages and derived keys for the AnyConnect authentication computations.
The first two messages in the establishment of an IKEv2 security association are sent in plain text, so their content can be seen with a simple network sniffer. Subsequent messages are encrypted, so if the “protocol” is not fully documented (e.g. the AnyConnect EAP mechanism) then one needs some way to see these messages in plain text. There are several ways to do this, but if one has access to an original client and server then one way is particularly useful: develop server and client replacements in “steps”. By “steps” I mean implementing the known steps of the protocol and then using the original client/server in conjunction with the new server/client to capture the next unknown packet in plain text.
For example, at the start of the process one does not know what the client will send in its second message (its first encrypted message), so one can use the original client to connect to the new server (that, at first, only knows how to handle the first, unencrypted, message exchange). After seeing what the original client sends, one can use the new client to send an equivalent message to an original server and see what it returns in plain text. With the exception of the detailed cryptographic computations used in the authentication process, one can derive an understanding of the full protocol exchanges using this technique.
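To make the split between the plain-text and encrypted messages concrete, here is a minimal Python sketch (not taken from my implementation, which is in C#) that parses the fixed 28-byte IKE header defined in RFC 7296 and reports whether the body of a captured message is encrypted. The names parse_ike_header and IkeHeader are purely illustrative, and fragmented messages are not handled.

```python
import struct
from dataclasses import dataclass

# Fixed IKE header (RFC 7296, section 3.1): two 8-byte SPIs, next payload,
# version, exchange type, flags, message ID and total length (28 bytes).
IKE_HEADER = struct.Struct("!8s8sBBBBII")
NON_ESP_MARKER = b"\x00\x00\x00\x00"  # prefixes IKE messages on UDP port 4500
PAYLOAD_SK = 46  # "Encrypted and Authenticated" payload type
EXCHANGE_TYPES = {
    34: "IKE_SA_INIT",
    35: "IKE_AUTH",
    36: "CREATE_CHILD_SA",
    37: "INFORMATIONAL",
}


@dataclass
class IkeHeader:
    initiator_spi: bytes
    responder_spi: bytes
    next_payload: int
    version: int
    exchange_type: int
    flags: int
    message_id: int
    length: int

    @property
    def exchange_name(self) -> str:
        return EXCHANGE_TYPES.get(self.exchange_type, f"unknown({self.exchange_type})")

    @property
    def is_encrypted(self) -> bool:
        # An encrypted message carries a single SK payload after the header.
        return self.next_payload == PAYLOAD_SK


def parse_ike_header(datagram: bytes, port_4500: bool = False) -> IkeHeader:
    """Parse the IKE header from the payload of a captured UDP datagram."""
    if port_4500:
        if not datagram.startswith(NON_ESP_MARKER):
            raise ValueError("not an IKE message (no non-ESP marker)")
        datagram = datagram[len(NON_ESP_MARKER):]
    if len(datagram) < IKE_HEADER.size:
        raise ValueError("datagram too short for an IKE header")
    return IkeHeader(*IKE_HEADER.unpack_from(datagram))


if __name__ == "__main__":
    # A synthetic IKE_SA_INIT request header (next payload 33 = SA).
    sample = IKE_HEADER.pack(b"\x01" * 8, b"\x00" * 8, 33, 0x20, 34, 0x08, 0, 28)
    header = parse_ike_header(sample)
    print(header.exchange_name, "encrypted" if header.is_encrypted else "plain text")
```

A sniffer-side filter built on this would show the IKE_SA_INIT exchange in full and flag everything from IKE_AUTH onwards as needing the “steps” technique.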
At the end of the process, one has a full keying module for the protocol but, having served its purpose as a way of understanding the protocol, it is of little practical use without two other components: an implementation of the Encapsulating Security Payload (ESP) protocol and a mechanism for creating the operating system “objects” (such as network interfaces) that can be assigned IP addresses, appear in routing tables, etc.
It is perhaps not widely known, but the Windows Filtering Platform (WFP) Base Filtering Engine (BFE) contains all of the functionality necessary for the first task (using ESP to transport data) and is documented and fully supported. Two documentation entries in “Using Windows Filtering Platform” are particularly useful: “Using Tunnel Mode” and “Manual SA Keying”.
At this stage, it suffices to say that the first task can be implemented using documented APIs and the C# programming language (with quite a large investment of time in creating C# declarations of the WFP data structures). But before expanding on this topic, it would be reassuring to know that the second task (creating a “network interface”) can also be achieved. There appears to be no documented API for this, but the Microsoft IKEv2 implementation (part of IKEEXT) uses the DLL “vpnikeapi”. The DLL exports of vpnikeapi look very promising:
This looks just like the set of routines that would be needed to integrate IKEv2 into an operating system as both a client and a server. The IKEv2 payloads Configuration, Identification, Traffic Selector and EAP are just those that have relevance to the operating system, “ProcessAdditionalAddressNotification” sounds as though it supports MOBIKE and there are routines to create and close a “tunnel”.
Unfortunately, there is one preparatory step that must be performed before using these routines and the implications of that step make the vpnikeapi DLL unsuitable for my purposes. Why that is the case will become apparent later.
Flow of control and data in the Windows IKEv2 VPN implementation
The RasDial routine is the entry point for the flow of control for the establishment of a VPN connection and %APPDATA%\Microsoft\Network\Connections\Pbk\rasphone.pbk is the source of the configuration data for the connection. RasDial communicates with the “RasMan” (Remote Access Connection Manager) service via RPC and most of the operating system specific “action” takes place in that service; RasMan communicates with the IKEEXT (IKE and AuthIP IPsec Keying Modules) service for IKE/IKEv2/AuthIP protocol specific functionality.
When the IKEEXT service starts up, it uses the undocumented WFP routine IPsecKeyModuleAdd0 to register itself as the keying module provider for IKE, IKEv2 and AuthIP; the registration includes call-back routines to acquire and expire Security Associations. When RasMan needs the services of a registered keying module, it uses the undocumented routine IPsecSaInitiateAsync0 to initiate the behaviour (by invoking one of the registered call-back routines) and the keying module uses the routine IPsecKeyModuleUpdateAcquire0 to feed back progress. IPsecKeyModuleDelete0 rounds out the small set of keying module routines.
Some information is passed to the keying module in the parameters of the call-back routine, but this information is largely references to information stored dynamically in WFP in the form of “provider contexts”. These “provider contexts” are created from information specified in the mainModeTunnelPolicy, tunnelPolicy and keyModKey arguments to the FwpmIPsecTunnelAdd2 routine:
The routines exported by vpnikeapi (but implemented by vpnike.dll in the RasMan service) are only available for use when a VPN establishment has been initiated by RasDial or similar. rasphone.pbk does not directly define the keying module GUID to be used; instead, it indicates (via VpnStrategy) which type of VPN is preferred and RasMan chooses the appropriate keying module GUID. So while it would be easy to register a parallel/alternative IKEv2 keying module, it would be difficult to route any work to it. To make the situation even clearer, vpnike contains an explicit test that its client is IKEEXT (via the IKEEXT service SID), so even if one did register a new keying module, managed to persuade RasMan to use it and initialized the data structures needed for vpnikeapi use, the new keying module would fail the security tests performed by the vpnike DLL.
Short aside: VPN Connection IPsec Configuration
By default, the Windows IKEv2 VPN client makes 18 Main Mode Security Association (SA) Proposals and 6 Quick Mode SA proposals. This default list of proposals can be strengthened and shortened by registry entries (such as NegotiateDH2048_AES256) or set (and limited) to one proposal by use of the Set-VpnConnectionIPsecConfiguration PowerShell cmdlet.
The flexibility of SA Proposals and Transforms in IKEv2 means that two proposals (one with GCM/CCM mode cipher algorithms and one with CBC mode cipher algorithms) are often adequate. However, the WFP data structures used to communicate this information between RasMan and IKEEXT were probably designed before IKEv2 was standardized and do not support its flexibility – they are better suited to the IKE proposal model.
The format of proposals stored in rasphone.pbk exhibits the same problem (one transform of each type per proposal). They are stored as a sequence of serialized ROUTER_CUSTOM_IKEv2_POLICY_0 structures with the name “CustomIPSecPolicies”; “NumCustomPolicy” records how many proposals there are. Set-VpnConnectionIPsecConfiguration always creates a single custom IPsec policy, but manual editing of rasphone.pbk can be used to add more.
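The cost of the “one transform of each type per proposal” model can be illustrated with a short Python sketch. It takes one IKEv2-style proposal that offers several transforms per type and expands it into the equivalent list of single-transform proposals. The algorithm names and the function flatten_proposal are illustrative assumptions; this is not the Windows default proposal list, nor the serialized format used in rasphone.pbk.

```python
from itertools import product


def flatten_proposal(proposal: dict[str, list[str]]) -> list[dict[str, str]]:
    """Expand one multi-transform proposal into single-transform proposals.

    `proposal` maps a transform type (e.g. "encryption") to the list of
    acceptable transforms of that type. The result contains one entry for
    every combination, which is what a model limited to one transform of
    each type per proposal needs in order to express the same offer.
    """
    types = list(proposal)
    combinations = product(*(proposal[t] for t in types))
    return [dict(zip(types, combo)) for combo in combinations]


if __name__ == "__main__":
    # A single IKEv2 proposal offering a choice for every transform type.
    ikev2_style = {
        "encryption": ["AES-CBC-256", "AES-CBC-128"],
        "integrity": ["HMAC-SHA2-256", "HMAC-SHA1"],
        "dh_group": ["14", "2"],
    }
    flattened = flatten_proposal(ikev2_style)
    print(len(flattened), "single-transform proposals")
    for entry in flattened:
        print(entry)
```

Two choices for each of three transform types already give eight flattened proposals, which shows how quickly a list of this kind grows.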
How to create a Network Interface?
This picture, taken from the RAS Architecture Overview, shows the search area for a solution to the problem of creating a network interface to “front up” the ESP secured (VPN) connection that I hoped to establish (that is, to present the connection to the operating system as an interface):
The Windows 10 IKEv2 client is sometimes known as the “Agile VPN” client and there is an agilevpn.sys WAN Miniport Driver which will probably be needed by any solution built from existing components. The diagram shows two potential user-mode interfaces that might provide the required functionality: RAS and TAPI (the Telephony API). However, I could not find any set of documented RAS or TAPI routines that could make any effective contribution to solving the problem; the routines that RasMan uses are not exported from any DLL (they are internal to vpnike.dll, rasmans.dll, etc.).
The next step was to assume that the interface between the user-mode abstractions and the kernel-mode functionality was implemented with Device I/O Controls (IOCTLs). Tracing control and data flows across this boundary during VPN connection establishment suggested that IOCTLs to four different device drivers would be necessary: NDProxy, AgileVPN, WANARP and NdisWan.
This turned out to be almost correct, but one thing further was needed to complete the task: a call to the undocumented routine “NsiSetAllParameters”.
The security descriptors on the NDProxy and AgileVPN devices permit only access by SYSTEM; NdisWan grants access to SYSTEM and NETWORK SERVICE and WANARP grants some access to “Authenticated Users”.
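When reading IOCTL traces of this kind, it helps to split each control code into its fields. The sketch below implements the standard Windows CTL_CODE layout (device type in bits 16 to 31, required access in bits 14 to 15, function in bits 2 to 13, transfer method in bits 0 to 1) in Python. The helper names are my own, and the example value is a generic one: it is not one of the control codes used by NDProxy, AgileVPN, WANARP or NdisWan.

```python
from typing import NamedTuple

METHODS = {
    0: "METHOD_BUFFERED",
    1: "METHOD_IN_DIRECT",
    2: "METHOD_OUT_DIRECT",
    3: "METHOD_NEITHER",
}
ACCESS = {
    0: "FILE_ANY_ACCESS",
    1: "FILE_READ_ACCESS",
    2: "FILE_WRITE_ACCESS",
    3: "FILE_READ_ACCESS | FILE_WRITE_ACCESS",
}


class IoctlFields(NamedTuple):
    device_type: int
    function: int
    method: str
    access: str


def ctl_code(device_type: int, function: int, method: int, access: int) -> int:
    """Python equivalent of the CTL_CODE macro from the Windows headers."""
    return (device_type << 16) | (access << 14) | (function << 2) | method


def decode_ioctl(code: int) -> IoctlFields:
    """Split a 32-bit I/O control code into its four fields."""
    return IoctlFields(
        device_type=(code >> 16) & 0xFFFF,
        function=(code >> 2) & 0xFFF,
        method=METHODS[code & 0x3],
        access=ACCESS[(code >> 14) & 0x3],
    )


if __name__ == "__main__":
    # FILE_DEVICE_UNKNOWN (0x22), function 0x801, buffered, read/write access.
    code = ctl_code(0x22, 0x801, 0, 3)
    print(hex(code), decode_ioctl(code))
```

Grouping traced codes by device type and function number makes it much easier to see which driver a request is addressed to and which requests recur.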
NDProxy
As suggested by the architecture overview diagram, and in practice, NDProxy is the first point of contact.
After opening the device, one can issue a “connect” IOCTL, which returns the number of devices (numbered from zero to N - 1) which can be used via NDProxy; this is typically 5: the WAN Miniports SSTP, IKEv2, L2TP, PPTP and PPPOE. One can then iterate over these miniports, looking for one with an NDIS_WAN_MEDIUM_SUBTYPE of NdisWanMediumAgileVPN (i.e. IKEv2).
Having identified and opened the miniport, the next step is to make a “call” on it via a “query info” IOCTL with OID_TAPI_MAKE_CALL. The type of information needed to make the call includes the IP address of the VPN server, the network interface index via which the VPN server is reachable (the GetBestInterfaceEx API routine for the VPN server address returns the needed value) and the “tunnel ID”.
The “tunnel ID” is a 64-bit number that can be freely chosen (it should uniquely identify the tunnel, but normally there will be only one tunnel). It is used to link the network interface to the BFE SA Contexts (using the IPSEC_VIRTUAL_IF_TUNNEL_INFO0 structure).
In general, TAPI call establishment can take some time and RasMan uses asynchronous I/O to perform the call. Synchronous I/O works too and is OK for test purposes. Even with synchronous I/O, the call is not necessarily established when the I/O completes. There is an event reporting mechanism (line events) which reports changes in call state, but polling the call state (waiting for LINECALLSTATE_CONNECTED) works too.
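The polling approach can be sketched as follows in Python. The get_call_state callable is hypothetical: it stands in for whatever IOCTL wrapper returns the current call state, which is not shown here. The LINECALLSTATE_ values are the ones defined in the TAPI headers. This illustrates the loop only and is not the code from my implementation.

```python
import time
from typing import Callable

# Call state bit flags as defined in the TAPI headers.
LINECALLSTATE_CONNECTED = 0x00000100
LINECALLSTATE_DISCONNECTED = 0x00004000


def wait_for_connected(
    get_call_state: Callable[[], int],
    timeout: float = 10.0,
    interval: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the call state until it is connected.

    `get_call_state` is a hypothetical callable returning the current
    LINECALLSTATE_ value. Returns True once the call is connected and
    False if it is disconnected or the timeout expires first.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = get_call_state()
        if state & LINECALLSTATE_CONNECTED:
            return True
        if state & LINECALLSTATE_DISCONNECTED:
            return False
        if time.monotonic() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    # Simulated state sequence: dialing, proceeding, connected.
    states = iter([0x00000010, 0x00000200, LINECALLSTATE_CONNECTED])
    print(wait_for_connected(lambda: next(states), interval=0.01))
```

Treating a disconnected state as an immediate failure avoids waiting out the full timeout when the call set-up has already been rejected.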
Once the call has been established, an IOCTL with OID_TAPI_GET_ID can be used to retrieve the (connection) ID which is needed for a later step. This completes the initial interaction with NDProxy; nothing more needs to be done until the VPN connection is terminated, when one should drop (OID_TAPI_DROP) and close (OID_TAPI_CLOSE_CALL) the call and close the device (OID_TAPI_CLOSE).
AgileVPN
The call initiated via NDProxy triggers call set-up in AgileVPN, so AgileVPN is then ready for additional IKEv2 specific set-up – the first of which is informing AgileVPN of the IKEv2 Traffic Selectors.
The second, and final, IOCTL directly to AgileVPN is the command to create the VPN tunnel.
WANARP
A new network interface is still not visible at this point. One first has to issue an IOCTL to WANARP to obtain a network interface LUID for an interface of type IF_TYPE_PPP.
When the VPN connection is terminated and the network interface taken down, the LUID allocation should be freed (via another IOCTL).
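A network interface LUID is a 64-bit value whose layout is documented in the NET_LUID structure: 24 reserved bits, a 24-bit index and a 16-bit interface type, in that order from the least significant bit. The following Python sketch (with helper names of my own choosing) splits a LUID into those fields, which is a convenient way to check that the value returned by WANARP really does describe an IF_TYPE_PPP interface.

```python
from typing import NamedTuple

IF_TYPE_PPP = 23  # interface type constant from the Windows headers


class LuidFields(NamedTuple):
    if_type: int
    net_luid_index: int
    reserved: int


def split_luid(value: int) -> LuidFields:
    """Split a 64-bit NET_LUID value into its documented bit fields."""
    return LuidFields(
        if_type=(value >> 48) & 0xFFFF,
        net_luid_index=(value >> 24) & 0xFFFFFF,
        reserved=value & 0xFFFFFF,
    )


def make_luid(if_type: int, net_luid_index: int) -> int:
    """Build a LUID value from an interface type and index (reserved = 0)."""
    return ((if_type & 0xFFFF) << 48) | ((net_luid_index & 0xFFFFFF) << 24)


if __name__ == "__main__":
    # The index 0x8001 is an arbitrary illustrative value.
    luid = make_luid(IF_TYPE_PPP, 0x8001)
    fields = split_luid(luid)
    print(hex(luid), fields, fields.if_type == IF_TYPE_PPP)
```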
NdisWan
Before performing the “activate route” IOCTL on NdisWan, it is necessary to choose a GUID for the network interface and associate this with a network compartment – this can be done via the undocumented NsiSetAllParameters routine.
The first IOCTL to NdisWan is used to map the connection ID obtained from NDProxy to a “bundle handle”. Once one has the bundle handle, one can activate the network interface and route to the network interface. In addition to the bundle handle, the IOCTL that activates the route requires the tunnel ID, the LUID obtained from WANARP, a name for the network interface, the GUID for the network interface and the IP address assigned by the VPN server to this client.
When the VPN connection is terminated and the network interface taken down, the route should be deactivated (via another IOCTL).
Finishing touches
Despite providing the assigned IP address to NdisWan, the network interface comes up with a link-local IP address (169.254.X.X). The correct address can be assigned via a call to the IP Helper API routine CreateUnicastIpAddressEntry.
The network interface comes up with a few routes preassigned (the VPN server, broadcast and multicast addresses), but it is useful to manually add either a route to the network reachable via the VPN or a default route via the VPN. This can be done with the IP Helper API routine CreateIpForwardEntry2.
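Both finishing touches can be reasoned about with Python's standard ipaddress module. The sketch below detects the link-local address that needs replacing and shows, by longest-prefix match, which interface a destination would use once a VPN network route or a default route has been added. The interface names and prefixes are illustrative, and route metrics (which Windows also takes into account) are deliberately ignored.

```python
import ipaddress


def needs_reassignment(address: str) -> bool:
    """True if the interface still has a link-local (169.254.0.0/16) address."""
    return ipaddress.ip_address(address).is_link_local


def pick_route(destination: str, routes: list[tuple[str, str]]) -> str:
    """Return the interface of the most specific route matching `destination`.

    `routes` is a list of (prefix, interface name) pairs. Only prefix
    length is considered; metrics are ignored in this sketch.
    """
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), interface)
        for prefix, interface in routes
        if dest in ipaddress.ip_network(prefix)
    ]
    if not matches:
        raise LookupError(f"no route to {destination}")
    return max(matches, key=lambda match: match[0].prefixlen)[1]


if __name__ == "__main__":
    print(needs_reassignment("169.254.17.4"))
    routes = [("0.0.0.0/0", "ethernet"), ("10.0.0.0/8", "vpn")]
    print(pick_route("10.1.2.3", routes))
    print(pick_route("192.0.2.1", routes))
```

With only the network route added, traffic for other destinations continues to use the original interface; adding a default route via the VPN instead sends everything through the tunnel.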
Windows Filtering Platform / Base Filtering Engine
The FwpmIPsecTunnelAdd routine is just a convenience – its functionality can be mimicked by several calls to other WFP routines. For the purposes of this application, most of its parameters can be null/zero; only the engineHandle, flags and tunnelPolicy need specific values: a valid engineHandle, a flags value of FWPM_TUNNEL_FLAG_ENABLE_VIRTUAL_IF_TUNNELING and a dummy IKEv2 Quick Mode Tunnel provider context. The provider context must include (at least) one IPsec proposal – although this is not used (it need not correspond to any actually offered or negotiated proposal).
Only a few other calls to WFP routines are needed. Calls to IPsecSaContextCreate1 and IPsecSaContextGetSpi1 create a Security Association (SA) context and retrieve the SPI (Security Parameter Index). Once keying material has been negotiated/derived, calls to IPsecSaContextAddInbound1 and IPsecSaContextAddOutbound1 make this information available for use by ESP (Encapsulating Security Payload).
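The SPI obtained here is the value that identifies the SA on the wire: every ESP packet begins with the 32-bit SPI followed by a 32-bit sequence number (RFC 4303). As a minimal Python sketch, with function names of my own, a captured ESP packet can be matched to an SA as shown below. Note that when ESP is encapsulated in UDP on port 4500, the same header follows the UDP header directly.

```python
import struct

ESP_HEADER = struct.Struct("!II")  # SPI, sequence number (RFC 4303)


def parse_esp_header(packet: bytes) -> tuple[int, int]:
    """Return (spi, sequence_number) from the start of an ESP packet."""
    if len(packet) < ESP_HEADER.size:
        raise ValueError("packet too short for an ESP header")
    spi, sequence = ESP_HEADER.unpack_from(packet)
    if spi < 256:
        # SPI values 0 to 255 are reserved and never identify a real SA.
        raise ValueError(f"reserved SPI value {spi}")
    return spi, sequence


def belongs_to_sa(packet: bytes, inbound_spi: int) -> bool:
    """True if the packet carries the SPI allocated for our inbound SA."""
    spi, _ = parse_esp_header(packet)
    return spi == inbound_spi


if __name__ == "__main__":
    # Synthetic packet: SPI 0xC0FFEE01, sequence number 1, opaque payload.
    packet = ESP_HEADER.pack(0xC0FFEE01, 1) + b"\x00" * 16
    print(parse_esp_header(packet), belongs_to_sa(packet, 0xC0FFEE01))
```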
Implementation statistics
The entire implementation of a “working” VPN is less than 7000 lines of C#, of which around 1700 lines are just definitions of WFP structures and over 300 lines are just definitions of Diffie-Hellman primes and curves. The IKEv2 implementation is very “bare bones”, with plenty of missing functionality (such as rekeying).