Monday 11 June 2018

Tracing HTTPS traffic on Microsoft Windows


Capturing HTTPS traffic is becoming an increasingly necessary troubleshooting technique (as HTTPS continues to replace plain HTTP), but is also becoming a more difficult undertaking. Assuming that one has control of the client end of the HTTPS channel, here are a few techniques that might be able to capture the traffic.

Network Sniffing

Because the network traffic is encrypted, a plain network trace will not show information about the HTTP protocol activity but it can nonetheless be interesting and/or useful.

Web browsers are keen users of experimental TCP mechanisms such as TCP Fast Open (TFO) and a network trace is useful for examining the initial TLS handshake steps – one can see which cipher suites are offered/accepted, the Server Name Indication (SNI) and Application-Layer Protocol Negotiation (ALPN) Client Hello extensions (if present) and the general shape/health of the TCP data flow.
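To make the Client Hello fields mentioned above concrete, here is a minimal Python sketch that pulls the offered cipher suites, the SNI host name and the ALPN protocol list out of a captured TLS record. The function name parse_client_hello is hypothetical, and the sketch assumes that the whole Client Hello arrives in a single, unfragmented TLS record and does no bounds checking of malformed input.

```python
def parse_client_hello(record: bytes) -> dict:
    """Extract cipher suites, SNI and ALPN from one TLS ClientHello record.

    Assumption: the ClientHello fits in a single TLS record.
    """
    # Record header: type(1) version(2) length(2); handshake header: type(1) length(3)
    if len(record) < 9 or record[0] != 22 or record[5] != 1:
        raise ValueError("not a TLS ClientHello record")
    pos = 5 + 4                      # skip record and handshake headers
    pos += 2 + 32                    # skip client version and random
    pos += 1 + record[pos]           # skip session id
    n = int.from_bytes(record[pos:pos + 2], "big")
    suites = [int.from_bytes(record[i:i + 2], "big")
              for i in range(pos + 2, pos + 2 + n, 2)]
    pos += 2 + n
    pos += 1 + record[pos]           # skip compression methods
    end = pos + 2 + int.from_bytes(record[pos:pos + 2], "big")
    pos += 2
    sni, alpn = None, []
    while pos + 4 <= end:
        ext_type = int.from_bytes(record[pos:pos + 2], "big")
        ext_len = int.from_bytes(record[pos + 2:pos + 4], "big")
        body = record[pos + 4:pos + 4 + ext_len]
        if ext_type == 0:            # server_name: list_len(2) type(1) name_len(2) name
            name_len = int.from_bytes(body[3:5], "big")
            sni = body[5:5 + name_len].decode("ascii")
        elif ext_type == 16:         # ALPN: list_len(2) then len(1)+name entries
            i = 2
            while i < len(body):
                alpn.append(body[i + 1:i + 1 + body[i]].decode("ascii"))
                i += 1 + body[i]
        pos += 4 + ext_len
    return {"cipher_suites": suites, "sni": sni, "alpn": alpn}
```

The input would be the TCP payload of the first client packet of a connection, as copied out of a network trace; the cipher suites are returned as the 16-bit code points registered with IANA.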

Network Sniffing and Decryption (via Server Certificate Private Key)

The necessary ingredients for successfully capturing plaintext with this approach are: 
·         Access to the private key of the HTTPS server.

·         Ability to ensure that the client does not offer a cipher suite with ephemeral keys (“forward secrecy”).

·         Ability to ensure that TLS Session Resumption is not used.

·         A high probability of capturing all relevant packets; otherwise the state information needed to generate Initialization Vectors (IVs) for subsequent TLS (Transport Layer Security) records, and to verify their message authentication codes (MACs), may be lost.

The first condition can rarely be met; even if the authority responsible for the HTTPS server is willing, there may be technical obstacles to exporting the private key. The second condition is increasingly difficult to meet – HTTP/2 blacklists all cipher suites that do not use ephemeral keys.

Network Sniffing and Decryption (via Export Session Keys)

Some HTTPS clients offer the ability to export the TLS session keys (e.g. the SSLKEYLOGFILE environment variable honoured by the Chrome and Firefox browsers); this finesses the first three problems mentioned above. Some network trace analysis tools (such as Wireshark) can import the exported session keys and decrypt the captured data. The ability to use a network trace analysis tool is especially useful when HTTP/2 is in use because the binary encoding of HTTP/2 can easily be decoded and nicely presented by such tools.
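As a rough illustration of the export mechanism, the following Python sketch prepares an environment with SSLKEYLOGFILE set (for launching a browser) and reads back the resulting key log. The function names are hypothetical, and the parser assumes the usual three-field "label client-random secret" line layout of the NSS key log format, with "#" introducing comments.

```python
import os
from collections import defaultdict


def browser_environment(keylog_path: str) -> dict:
    """Return a copy of the current environment with SSLKEYLOGFILE set."""
    env = dict(os.environ)
    env["SSLKEYLOGFILE"] = keylog_path
    return env


def parse_keylog(text: str) -> dict:
    """Map each client random (hex) to its {label: secret} entries.

    Assumption: each entry is '<label> <client_random> <secret>' on one line.
    """
    sessions = defaultdict(dict)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                 # skip blank lines and comments
        parts = line.split()
        if len(parts) != 3:
            continue                 # ignore anything that is not a key entry
        label, client_random, secret = parts
        sessions[client_random.lower()][label] = secret.lower()
    return dict(sessions)
```

The environment could be passed to subprocess.Popen when starting the browser; comparing the client randoms in the log with the Random field of the captured Client Hello messages is a quick way to check that keys exist for the connection of interest before loading both into an analysis tool.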

Network Sniffing and Null Cipher Suite

The necessary ingredients for successfully capturing plaintext with this approach are: 

·         Ability to enable null cipher suites on the HTTPS server.

·         Ability to ensure that the client only offers null cipher suites.

The modifications to both server and client can be difficult (the null cipher suites are blacklisted by HTTP/2) and unless the problem being troubleshot is tied to HTTPS (such as token binding, TLS record encapsulation, etc.), it would be easier to just use a plain HTTP connection.

Debugging Proxy Server

Debugging Proxy Servers, such as Fiddler, are a common and general purpose method of capturing HTTPS traffic. FiddlerCore is included with Microsoft’s Message Analyzer and is the mechanism used when choosing the “Pre-encryption for HTTPS” scenario in that tool.

There are mechanisms that try to protect against “man-in-the-middle” interventions in HTTPS communications, such as “Public Key Pinning Extension for HTTP” (RFC 7469) and “Certificate Transparency” (RFC 6962). If an HTTPS client using these mechanisms cannot be configured to accept the proxy server certificate (hierarchy) then this technique cannot be used. There is often a way to configure additional certificates, since this is needed in the case of enterprises that mandate TLS interception proxies at their boundaries, but it needs to be found on a case-by-case basis.

Built-in Tracing in the Client

A major class of HTTPS clients, namely web browsers, often have built-in debugging and tracing facilities, intended for developers (Internet Explorer and Edge call them “(F12) Developer Tools”).

Unlike the previous techniques, these tools typically don’t provide a byte-by-byte record of HTTPS traffic because, for their typical audience, this information is too low-level – especially HTTP/2 binary encoded, framed and interleaved traffic.

Microsoft-Windows-WinINet and Microsoft-Windows-WinINet-Capture

WinINet (Windows Internet) is an API for accessing the Internet and it is used by both Edge and Internet Explorer, as well as many other applications. Two ETW (Event Tracing for Windows) providers give particular insight into the behaviour of the API: Microsoft-Windows-WinINet and Microsoft-Windows-WinINet-Capture.

Microsoft-Windows-WinINet-Capture is the simplest provider with just four events: the request/response headers/payloads. This “captures” all of the “data” exchanged, albeit that HTTP/2 data is mapped into an HTTP/1.1 style format (plain text rather than binary) and compressed content-encoding is expanded.

Microsoft-Windows-WinINet provides insight into the processing stages of an HTTP interaction and includes captured request/response headers and POST data. This provider also maps HTTP/2 binary encoded headers into HTTP/1.1 style plain text headers.

Microsoft-Windows-WebIO and Microsoft-Windows-WinHttp

WinHttp (Windows HTTP Services) is another API, similar to WinINet but intended for use in server/service scenarios. There are also two ETW providers associated with this API: Microsoft-Windows-WebIO and Microsoft-Windows-WinHttp.

Microsoft-Windows-WinHttp events are mostly related to proxy server discovery and use, and don’t give much insight into wider aspects of an HTTP interaction.

Microsoft-Windows-WebIO provides a similar level of detail to the WinINet provider. This provider mostly maps HTTP/2 binary encoded headers into HTTP/1.1 style plain text headers, but the sent headers are currently provided in some “intermediate” form (neither HTTP/2 binary encoded nor pure plain text).

Debugging of Schannel (Secure Channel) Interface

Intercepting the API calls that perform the encryption and decryption for TLS is another way of capturing the plain text of HTTPS communications. WinINet, WinHttp and the .NET Framework all use the Secure Channel (Schannel) security support provider via the Security Support Provider Interface (SSPI).

Tracing the input into EncryptMessage and the output from DecryptMessage captures all of the HTTPS content. One can also trace the input to and output from InitializeSecurityContext to capture the TLS connection establishment traffic.

.NET Framework .exe.config Tracing

The .NET Framework class library uses managed code to implement the HTTP/1.1 protocol and so its traffic is not observed by the ETW providers mentioned earlier (.NET Core does use the WinHttp API). There is however tracing built into the managed code implementation of HTTP that can be enabled and logged by appropriate settings in the application’s .config file.

Java Tracing

Java applications use Java implementations of the HTTP and TLS protocols. Like the .NET Framework, the Java implementation includes built-in debugging/tracing capabilities that can be enabled by setting the system property javax.net.debug.


Wednesday 6 June 2018

Network Sniffing on Microsoft Windows

There are a number of approaches to “tapping into” the network traffic of a Microsoft Windows (desktop/server) operating system. I would like to share some practical experience of using the various approaches.

 

NDIS Filter

 

Using an NDIS (Network Driver Interface Specification) filter driver is probably the most common technique – it is the technique used by Microsoft’s “Network Monitor” and one of the options for packet capture in its successor (Microsoft’s “Message Analyzer”). Typically, Wireshark (perhaps the best known third party sniffer used under Windows) also uses this technique (via use of the WinPcap or Npcap NDIS filter drivers).

 

An NDIS filter can observe and capture all of the activity at the data link layer (which can be divided into the logical link control (LLC) and medium access control (MAC) sublayers) – making it network (layer) protocol independent; it is the only technique that I shall mention which has this capability. If one wishes to capture MAC frame headers or network protocols other than IPv4/IPv6, then this is the technique to use.

 

There are at least two problems with the NDIS filter approach:

 

·         Traffic over loopback interfaces cannot be captured (since these “software” network interfaces do not use NDIS).

·         Network traffic can be interrupted when starting and stopping a capture session, since the NDIS filter needs to be “bound” into the driver stack. The “binding” process “pauses” traffic through the stack and (very occasionally) this “pause” can continue for an extended period of time – potentially causing network connections to be closed.

 

On one occasion (or possibly two – it is the first that remains in memory), I started a network trace on a production server whilst logged in via Remote Desktop (RDP) and the RDP connection was immediately broken and could not be re-established for several minutes (almost certainly due to a problem draining packets from the driver stack during a pause/bind operation).

 

Recent versions of Windows include such an NDIS filter driver – the driver/service NdisCap. This driver exposes the captured traffic via the Event Tracing for Windows (ETW) mechanism as the provider “Microsoft-Windows-NDIS-PacketCapture”. A limited filtering capability is also exposed via this ETW provider.

 

The “Microsoft-Windows-NDIS-PacketCapture” provider is used by Message Analyzer, the “netsh trace” command and the “NetEventPacketCapture” PowerShell cmdlets (in particular, the “Add-NetEventPacketCaptureProvider” cmdlet).

 

WFP Callouts

 

The Windows Filtering Platform (WFP) allows developers to “intervene” at several stages in the processing that takes place as a packet flows through the IPv4/IPv6 network stack – including the capability of capturing the network traffic (from the network layer upwards).

 

MAC frame headers and network protocols other than IPv4/IPv6 are not included in the captured data, but packets sent via loopback can be captured; IPsec traffic in its unencrypted state (i.e. before/after encryption/decryption) can be captured too.

 

Adding and removing WFP callouts does not “pause” the network stack and is less likely to cause a network interruption than binding/unbinding an NDIS filter driver (reconfiguring WFP is a normal/common activity).

 

Recent versions of Windows include such a WFP callout driver – the driver/service WfpCapture. This driver exposes the captured traffic via the Event Tracing for Windows (ETW) mechanism as the provider “Microsoft-Pef-WFP-MessageProvider”. A limited filtering capability is also exposed via this ETW provider.

 

The “Microsoft-Pef-WFP-MessageProvider” provider is used by Message Analyzer and the “NetEventPacketCapture” PowerShell cmdlets (in particular, the “Add-NetEventWFPCaptureProvider” cmdlet).

 

At the time of writing, the current version of WfpCapture does not pass the Driver Signing Policy enforced by Windows 10, version 1607 and later. Unless one or more of the exception conditions apply (i.e. Secure Boot is disabled, or the installed Windows version was upgraded from an earlier release of Windows rather than being a “clean” install), this WFP callout driver cannot be used. This is the most recent message from Microsoft that I could find on this topic:

 

Yes, we were able to repro with SecureBoot enabled.  We are looking at this now and post a new build when we have this fixed.  But I don't have a time frame. 

 

Paul

 

Paul E Long Microsoft (MSFT)                                     Thursday, October 13, 2016 1:08 PM

 

The changes to the Driver Signing Policy were discussed at a 2016 Filter Plugfest (video available on Channel 9) and Scott Anderson from Microsoft mentioned four exceptions to the policy – the three currently documented exceptions and a “test reg key to allow cross-signed certificates to work”. Peter Viscarola (founder of OSR) later wrote, in response to a discussion of this topic:

 

I hate to say this, but since you asked: The registry key information is only available under NDA.

 

The “upgraded” system exception to the driver signing policy is signalled by a registry value in the Code Integrity (CI) Policy key. The driver signing policy is slowly being tightened; future releases will first only allow exceptions for “boot start” drivers before finally removing all exceptions.

 

HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\CI\Policy
    BootUpgradedSystem    REG_DWORD    0x1
    UpgradedSystem        REG_DWORD    0x1

 

Raw Sockets

 

The “functionality” of raw sockets under Windows (i.e. which packets they are capable of receiving, such as outbound packets) has changed over the years. Windows 10 raw sockets can receive all IPv4 packets (both inbound and outbound) including their IPv4 headers and all IPv6 packets – but only from the transport layer upwards (i.e. excluding their IPv6 headers). The receipt of inbound packets is subject to the Windows Defender Firewall rules in force – it is normally necessary to add a rule to grant access.
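For readers who want to experiment, here is a minimal Python sketch of an IPv4 raw socket sniffer, following the pattern of the example in the Python socket module documentation. It is only a sketch: open_ipv4_sniffer and parse_ipv4_header are hypothetical names, the socket.SIO_RCVALL and socket.RCVALL_ON constants only exist in Python on Windows, and the process is assumed to be running with Administrator rights.

```python
import ipaddress
import socket
import struct


def open_ipv4_sniffer(local_address: str) -> socket.socket:
    """Open a raw socket that receives all IPv4 packets on one interface.

    Assumptions: Windows, Administrator rights, and local_address is the
    IPv4 address of the interface to monitor.
    """
    sniffer = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_IP)
    sniffer.bind((local_address, 0))
    sniffer.setsockopt(socket.IPPROTO_IP, socket.IP_HDRINCL, 1)  # include IPv4 headers
    sniffer.ioctl(socket.SIO_RCVALL, socket.RCVALL_ON)           # receive all packets
    return sniffer


def parse_ipv4_header(packet: bytes) -> dict:
    """Decode the fixed 20-byte part of the IPv4 header of a received packet."""
    (ver_ihl, _tos, total_length, _ident, _frag, ttl, protocol, _checksum,
     src, dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    return {
        "version": ver_ihl >> 4,
        "header_length": (ver_ihl & 0x0F) * 4,
        "total_length": total_length,
        "ttl": ttl,
        "protocol": protocol,
        "source": str(ipaddress.IPv4Address(src)),
        "destination": str(ipaddress.IPv4Address(dst)),
    }
```

A capture loop would repeatedly call recvfrom(65535) on the returned socket and pass the received bytes to parse_ipv4_header; the socket should be switched back with ioctl(socket.SIO_RCVALL, socket.RCVALL_OFF) before it is closed.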

 

Since raw sockets are built into the kernel TCP/IP implementation, there is no need for additional kernel-mode code (such as NDIS filter drivers or WFP callout drivers). There are however a number of drawbacks compared to the first two techniques:

 

·         No filtering in kernel-mode is possible – all packets are delivered to the user-mode application (which has performance implications).

·         There is no visibility of how many packets are lost/dropped as a result of insufficient buffering.

·         The packets are first time-stamped when processed by a user-mode application, which might be some time after they “could have been” time-stamped by filter/callout driver kernel-mode code running in a DPC (Deferred Procedure Call).

·         There is no guarantee of the order in which the kernel adds packets to the raw socket. Monitoring the kernel activity with the “Microsoft-Windows-TCPIP” and “Microsoft-Windows-Winsock-AFD” providers indicates that the outbound response to an inbound packet is often copied to the raw socket before the inbound packet.

 

Using multiple outstanding read requests and I/O Completion Ports reduces the risk of dropping packets but further increases the risk of out-of-order time-stamping of packets (because the I/O completion port thread pool scheduling determines how quickly a time-stamp can be associated with a packet).

 

If captured data is loaded into Message Analyzer for analysis, the out-of-order time-stamping causes many spurious diagnosis messages. A “premature” packet is flagged with diagnosis messages like:

 

Lost TCP segments, sequence range 1234 ~ 2345.
This data segment was acknowledged before it arrived, which infers an out-of-order capturing issue.

 

The corresponding “delayed” packet is flagged with diagnosis messages like:

 

Retransmitted, original message is missing.

 

One always has to be aware that artefacts of the capture process can misrepresent what actually happened “on the wire” (overly aggressive capture filtering being perhaps the biggest problem), but it is nonetheless unfortunate that the value of the automated diagnosis is substantially reduced when using this capture technique.

 

The biggest problem with raw socket network sniffing is the handling of IPv6 packets. The documentation (accurately) states:

 

For IPv6 (address family of AF_INET6), an application receives everything after the last IPv6 header in each received datagram regardless of the IPV6_HDRINCL socket option. The application does not receive any IPv6 headers using a raw socket.

 

The basic IPv6 header (RFC 8200), and therefore the missing information in the received data, looks like this:

 

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Version| Traffic Class |           Flow Label                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         Payload Length        |  Next Header  |   Hop Limit   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                                                               +
   |                                                               |
   +                         Source Address                        +
   |                                                               |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                                                               +
   |                                                               |
   +                      Destination Address                      +
   |                                                               |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

 

The Version field can be inferred since one needs to create separate raw sockets per network interface for IPv4 and IPv6 packets. The Payload Length is implicit in the length of the captured data. The Source Address can be obtained by using socket functions such as recvfrom/WSARecvFrom/WSARecvMsg (which can return the source address via a separate output parameter). Traffic Class, Flow Label and Hop Limit are often not “interesting” in common troubleshooting scenarios involving network sniffing.
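Once values for the missing fields have been settled on, a stand-in header can be put in front of the received data so that analysis tools can decode it. The following Python sketch shows one way this might be done; synthesize_ipv6_header is a hypothetical name, the Next Header value and Destination Address are simply parameters supplied by the caller, Traffic Class and Flow Label are set to zero, and the default Hop Limit of 64 is an arbitrary placeholder rather than a captured value.

```python
import ipaddress
import struct


def synthesize_ipv6_header(src: str, dst: str, next_header: int,
                           payload: bytes, hop_limit: int = 64) -> bytes:
    """Build a 40-byte basic IPv6 header for data received from a raw socket.

    Assumptions: Traffic Class and Flow Label are unknown and set to zero;
    hop_limit is a placeholder; the payload is at most 65535 bytes long.
    """
    first_word = 6 << 28             # Version 6, Traffic Class 0, Flow Label 0
    return struct.pack(
        "!IHBB16s16s",
        first_word,
        len(payload),                # Payload Length is implicit in the data length
        next_header,
        hop_limit,
        ipaddress.IPv6Address(src).packed,
        ipaddress.IPv6Address(dst).packed,
    )
```

The reconstructed packet is then synthesize_ipv6_header(...) + payload, which can be written to a capture file as if it had been received with its header intact.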

 

The most important missing information is the final Next Header value since this determines the transport protocol and how the captured data should be interpreted. The Internet Assigned Numbers Authority (IANA) documents the registered values for this field; not all of these values are acceptable as the final Next Header value (e.g. HOPOPT and AH) and some make the interpretation/decoding of subsequent data “difficult” (e.g. ESP). The Next Header values that I find most useful to identify are TCP, UDP and ICMPv6 and one can use heuristics to infer which, if any, of these values was probably present.

 

The basic structure of the UDP, ICMPv6 and TCP headers is shown here (taken directly from the plain text versions of the RFCs):

 

UDP

 

   +--------+--------+--------+--------+
   |     Source      |   Destination   |
   |      Port       |      Port       |
   +--------+--------+--------+--------+
   |                 |                 |
   |     Length      |    Checksum     |
   +--------+--------+--------+--------+

 

ICMPv6

 

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |     Code      |          Checksum             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

 

TCP

 

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Source Port          |       Destination Port        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Sequence Number                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Acknowledgment Number                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Data |           |U|A|P|R|S|F|                               |
   | Offset| Reserved  |R|C|S|S|Y|I|            Window             |
   |       |           |G|K|H|T|N|N|                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Checksum            |         Urgent Pointer        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

 

The UDP header is the only header that contains a field (Length) that can be directly compared with information that we know about the received packet. All three types of headers include a Checksum field, albeit at different offsets.

 

The heuristics that I use to infer the Next Header value are:

 

·         If the received data length matches the UDP Length and the UDP Checksum is good, then set Next Header to UDP.

·         If the TCP Checksum is good, then set Next Header to TCP.

·         If the ICMPv6 Checksum is good, then set Next Header to ICMPv6.

·         If the received data length matches the UDP Length, then set Next Header to UDP.

·         If the first 4 bits of the received data equal 4 and the IPv4 checksum is good, then set Next Header to IPv4 (IPv4 packet encapsulated in IPv6).

·         If the first 4 bits of the received data equal 6 and the IPv6 length is consistent with the length of the received data, then set Next Header to IPv6 (IPv6 packet encapsulated in IPv6).

·         If the first byte of the received data equals IPPROTO_UDP (17) and the second byte is zero, then set Next Header to IPv6FragmentHeader.

·         Otherwise, set the Next Header to Reserved (255/0xFF). These packets are then easy to spot in trace analysis tools such as Message Analyzer and Wireshark.
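The list of heuristics above can be sketched in Python roughly as follows. This is an illustration rather than the code that I actually use: the names ones_sum, checksum_ok and infer_next_header are hypothetical, the Source and Destination Addresses are assumed to be already known (as 16-byte values), and the minimum-length guards are additions of the sketch. The sketch relies on the fact that summing the pseudo-header and the received data, with the transmitted checksum left in place, gives 0xFFFF for a good checksum wherever the checksum field sits, so the three checksum tests differ only in the Next Header value placed in the pseudo-header.

```python
import struct

UDP, TCP, ICMPV6, IPV4, IPV6, FRAGMENT, RESERVED = 17, 6, 58, 4, 41, 44, 255


def ones_sum(data: bytes) -> int:
    """16-bit one's-complement sum of data, without the final complement."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)   # fold the carries back in
    return total


def checksum_ok(src: bytes, dst: bytes, next_header: int, data: bytes) -> bool:
    """Verify a transport checksum against the IPv6 pseudo-header."""
    pseudo = src + dst + struct.pack("!II", len(data), next_header)
    return ones_sum(pseudo + data) == 0xFFFF


def infer_next_header(src: bytes, dst: bytes, data: bytes) -> int:
    """Guess the final Next Header value for data read from an IPv6 raw socket."""
    udp_len_ok = (len(data) >= 8
                  and struct.unpack("!H", data[4:6])[0] == len(data))
    if udp_len_ok and checksum_ok(src, dst, UDP, data):
        return UDP
    if len(data) >= 20 and checksum_ok(src, dst, TCP, data):
        return TCP
    if len(data) >= 4 and checksum_ok(src, dst, ICMPV6, data):
        return ICMPV6
    if udp_len_ok:
        return UDP
    version = data[0] >> 4 if data else 0
    if version == 4 and len(data) >= 20:
        ihl = (data[0] & 0x0F) * 4                 # IPv4 header length in bytes
        if 20 <= ihl <= len(data) and ones_sum(data[:ihl]) == 0xFFFF:
            return IPV4
    if version == 6 and len(data) >= 40:
        if struct.unpack("!H", data[4:6])[0] + 40 == len(data):
            return IPV6
    if len(data) >= 8 and data[0] == UDP and data[1] == 0:
        return FRAGMENT
    return RESERVED
```

The order of the tests matters: the strongest evidence (a length match plus a good checksum) is tried first, and the weaker structural checks are only consulted when no checksum verifies.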

 

If a checksum is good, recomputing the checksum over the received data with the transmitted checksum value included should deliver a result of 0 or 0xFFFF. In addition to the transport data, the checksum also covers an IPv6 pseudo-header:

 

IPv6 pseudo-header

 

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                                                               +
   |                                                               |
   +                         Source Address                        +
   |                                                               |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                                                               +
   |                                                               |
   +                      Destination Address                      +
   |                                                               |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Upper-Layer Packet Length                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      zero                     |  Next Header  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

 

We know the Upper-Layer Packet Length and the Source Address and we are guessing the Next Header value, but we are still missing the Destination Address. The Destination Address is available if WSARecvMsg is used to receive messages from the raw socket (via the Control field of a WSAMSG struct (WSACMSGHDR cmsg_type = IPV6_PKTINFO)). An alternative approach is to create an initial set of possible addresses by examining various networking tables: the TCP connections table, the destination cache table, the neighbours table and the local addresses of all network interfaces; all received Source Addresses are also merged into the set. Now try to verify the checksum using each of these addresses.

 

Because a “partial” checksum of the received data and known values from the pseudo-header can be calculated once (and partial checksums for each of the possible Destination Addresses can be cached), verifying the complete checksum just involves adding two values and folding back in any carry – which can be done very quickly.
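The partial checksum idea can be sketched in Python like this. The names fold, ones_sum, address_sum and find_destination are hypothetical, and the use of functools.lru_cache as the per-address cache is just one convenient choice; candidate addresses are assumed to be 16-byte values.

```python
import struct
from functools import lru_cache


def fold(total: int) -> int:
    """Fold any carries above 16 bits back into the low 16 bits."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def ones_sum(data: bytes) -> int:
    """16-bit one's-complement sum of data, without the final complement."""
    if len(data) % 2:
        data += b"\x00"
    return fold(sum(struct.unpack(f"!{len(data) // 2}H", data)))


@lru_cache(maxsize=None)
def address_sum(address: bytes) -> int:
    """Partial sum of one candidate Destination Address (cached)."""
    return ones_sum(address)


def find_destination(src: bytes, data: bytes, next_header: int, candidates):
    """Return the first candidate address for which the checksum verifies."""
    # Everything except the Destination Address is summed exactly once.
    partial = ones_sum(src + struct.pack("!II", len(data), next_header) + data)
    for dst in candidates:
        # One addition and one fold per candidate address.
        if fold(partial + address_sum(dst)) == 0xFFFF:
            return dst
    return None
```

Because the one's-complement sum is commutative, it does not matter that the Destination Address is added last rather than in its pseudo-header position. Two candidate addresses with the same 16-bit sum cannot be told apart by this test, which is one source of the false matches mentioned in the text.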

 

False matches (of Next Header and Destination Address against the Checksum) are possible, but I have been happy with the results.

 

PktMon

 

PktMon is a relatively new packet capture technique. Microsoft has introduced hooks into NDIS.sys to support this type of logging. Some typical stack traces at the point that captured data is passed to the PktMon.sys driver show how and where the hooks are integrated into NDIS.sys:

 

PktMon!PktMonPacketLogCallback+0x19
ndis!PktMonClientNblLog+0xbd
ndis!PktMonClientNblLogNdis+0x2b
ndis!ndisCallSendHandler+0x3ca4b
ndis!ndisInvokeNextSendHandler+0x10e
ndis!NdisSendNetBufferLists+0x17d

 

PktMon!PktMonPacketLogCallback+0x19
ndis!PktMonClientNblLog+0xbd
ndis!PktMonClientNblLogNdis+0x2b
ndis!ndisMIndicateNetBufferListsToOpen+0x3e95c
ndis!ndisMTopReceiveNetBufferLists+0x1bd
ndis!ndisCallReceiveHandler+0x61
ndis!ndisInvokeNextReceiveHandler+0x1df
ndis!ndisFilterIndicateReceiveNetBufferLists+0x3be91
ndis!NdisFIndicateReceiveNetBufferLists+0x6e

 

This technique allows the data to be captured at many points in the NDIS protocol stack (the same packet can be captured and recorded at more than one point in the stack), but simple configuration allows packets to be captured just once.

 

Enabling and disabling tracing does not involve rebinding the NDIS protocol stack, which is an improvement over the NDIS filter approach to tracing.

 

This capture technique does not capture “loopback” traffic (for the same reasons that NDIS filters are unable to capture such traffic).

 

Unlike Microsoft’s NdisCap NDIS filter driver and Microsoft-Windows-NDIS-PacketCapture ETW provider, the ETW provider associated with PktMon (Microsoft-Windows-PktMon) does record the original payload size of packets that are truncated. NdisCap captures large TCP sends with an IP “pseudo” header containing an IP length of zero; since there is no record of the original payload size and the size cannot be deduced from the pseudo IP header, analysis tools (such as Wireshark) are unable to determine whether IP packets are missing from the captured data.

 

PktMon provides better filtering options than those supported by NdisCap/Microsoft-Windows-NDIS-PacketCapture, but the filters are not set via ETW but rather by IOCTLs to the PktMon driver.