Saturday, 12 June 2021

Mapped network drive reconnection failures

As a regular reader of forums discussing technical problems with Windows components, I have been interested in the number of problems reported with connections to SMB file shares. I did not have the problem myself and I could not think of a way to reproduce and troubleshoot the problem(s).

There is a Microsoft article entitled “Mapped network drive may fail to reconnect in Windows 10, version 1809” which says:

Microsoft is working on a resolution and estimates a solution will be available by the end of November 2018. Monitor the mapped drive topic in the Windows 10 1809 Update History KB 4464619.

The referenced KB article does not contain any relevant information about progress on the problem.

It may seem irrelevant at this stage, but there was a discussion on the answers.microsoft.com forum in 2020 about the “UseOptions” value in the registry key HKCU\Network\<DRIVELETTER>: it seemed to be causing problems with persistent connections, and the value seemed to have been introduced in Windows 10, version 2004.

The authoritative source of information about SMB is the Microsoft specification "[MS-SMB2]: Server Message Block (SMB) Protocol Versions 2 and 3".

Section "3.2.4.2.2 Negotiating the Protocol" of this document says:

When a new connection is established, the client MUST negotiate capabilities with the server. The client MAY<111> use either of two possible methods for negotiation.

The first is a multi-protocol negotiation that involves sending an SMB message to negotiate the use of SMB2. If the server does not implement the SMB 2 Protocol, this method allows the negotiation to fall back to older SMB dialects, as specified in [MS-SMB].

The second method is to send an SMB2-Only negotiate message. This method will result in successful negotiation only for servers that implement the SMB 2 Protocol.

The reference <111> says:

The Windows-based client will initiate a multi-protocol negotiation unless it has previously negotiated with this server and the negotiated server's DialectRevision is equal to 0x0202, 0x0210, 0x0300, 0x0302, or 0x0311. In the latter case, it will initiate an SMB2-Only negotiate.
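The rule in note <111> can be written down as a small predicate. This is only a sketch of the documented behaviour, not code from mrxsmb.sys; the function name and its parameters are my own invention for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* DialectRevision values listed in note <111> of [MS-SMB2]. */
static const uint16_t smb2_dialects[] = { 0x0202, 0x0210, 0x0300, 0x0302, 0x0311 };

/* Hypothetical helper: returns true when, according to note <111>, the
 * client would start a new connection with an SMB2-Only negotiate instead
 * of a multi-protocol negotiation. */
bool would_use_smb2_only_negotiate(bool previously_negotiated,
                                   uint16_t remembered_dialect)
{
    if (!previously_negotiated)
        return false;               /* no history: multi-protocol negotiation */

    for (size_t i = 0; i < sizeof smb2_dialects / sizeof smb2_dialects[0]; ++i) {
        if (smb2_dialects[i] == remembered_dialect)
            return true;            /* remembered SMB2/3 dialect: SMB2-Only */
    }
    return false;                   /* any other remembered dialect */
}
```

The point to notice is that the decision depends only on what the client remembers about the server, not on what the server can cope with at the time of the new connection.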

It seems that older SMB servers (NAS devices and Windows Server 2003) don’t expect a new connection to start with an SMB2-Only negotiate. They can behave in various incorrect ways, such as returning an error message, not responding at all, breaking the connection, etc., and this results in different error messages being shown to the user.

It is often mentioned that there are anomalies when referring to the file share by server name or by server IP address. This happens because the behaviour depends on which form of the path to the share (name or IP address) has “remembered” SMB2 capabilities.

There are many reasons why it might be necessary to “reconnect” to a file share: the transport connections have an “idle timeout”, the client may move between different networks, or some other type of network interruption may break the connection to the file share. This means that problems can occur at unpredictable times.

Another reason to “reconnect” a share is to restore persistent file shares when a user logs in. The change in Windows 10, version 2004 seems to have been to “persist” the knowledge of the server capabilities to the registry (rather than just in-memory data structures in mrxsmb.sys). When persisting a share, Windows now queries the attributes of the share with a call to NtQueryInformationFile, with a FILE_INFORMATION_CLASS of FileRemoteProtocolInformation and stores this information in the UseOptions value of the key HKCU\Network\<DRIVELETTER>.

The information returned by NtQueryInformationFile is a FILE_REMOTE_PROTOCOL_INFORMATION structure and the ProtocolMajorVersion member contains the negotiated SMB major version number. This enables Windows to decide whether it can use SMB2-Only negotiation.

struct FILE_REMOTE_PROTOCOL_INFORMATION
{
    USHORT StructureVersion;     // 1 for Win7, 2 for Win8 SMB3, 3 for Blue SMB3, 4 for RS5
    USHORT StructureSize;        // sizeof(FILE_REMOTE_PROTOCOL_INFORMATION)
    ULONG  Protocol;             // Protocol (WNNC_NET_*) defined in winnetwk.h or ntifs.h.
    USHORT ProtocolMajorVersion;
    USHORT ProtocolMinorVersion;
    USHORT ProtocolRevision;
    USHORT Reserved;
    ULONG  Flags;
    struct {
        ULONG Reserved[8];
    } GenericReserved;
    union {
        struct {
            struct {
                ULONG Capabilities;
            } Server;
            struct {
                ULONG Capabilities;
                ULONG CachingFlags;
                UCHAR ShareType;
                UCHAR Reserved0[3];
                ULONG Reserved1;
            } Share;
        } Smb2;
        ULONG Reserved[16];
    } ProtocolSpecific;
};
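For experimenting with saved copies of this structure away from a Windows build environment, a fixed-width mirror of it can be handy. The sketch below uses my own names (RemoteProtocolInfo, SKETCH_WNNC_NET_SMB, remembers_smb2_or_later); the check "Protocol is SMB and major version is at least 2" is my reading of how the remembered information could be used, not a confirmed description of what Windows does.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_WNNC_NET_SMB 0x00020000u  /* value of WNNC_NET_SMB in wnnc.h */

/* Fixed-width mirror of the structure shown above (hypothetical name). */
typedef struct RemoteProtocolInfo {
    uint16_t StructureVersion;
    uint16_t StructureSize;
    uint32_t Protocol;
    uint16_t ProtocolMajorVersion;
    uint16_t ProtocolMinorVersion;
    uint16_t ProtocolRevision;
    uint16_t Reserved;
    uint32_t Flags;
    struct { uint32_t Reserved[8]; } GenericReserved;
    union {
        struct {
            struct { uint32_t Capabilities; } Server;
            struct {
                uint32_t Capabilities;
                uint32_t CachingFlags;
                uint8_t  ShareType;
                uint8_t  Reserved0[3];
                uint32_t Reserved1;
            } Share;
        } Smb2;
        uint32_t Reserved[16];
    } ProtocolSpecific;
} RemoteProtocolInfo;

/* Every member is naturally aligned, so the layout has no padding. */
_Static_assert(sizeof(RemoteProtocolInfo) == 116, "unexpected structure size");
_Static_assert(offsetof(RemoteProtocolInfo, ProtocolMajorVersion) == 8,
               "unexpected offset of ProtocolMajorVersion");

/* ASSUMPTION: "SMB2 or later was negotiated" is what matters for the
 * SMB2-Only decision; the real logic inside Windows is not documented here. */
bool remembers_smb2_or_later(const RemoteProtocolInfo *info)
{
    return info->Protocol == SKETCH_WNNC_NET_SMB &&
           info->ProtocolMajorVersion >= 2;
}
```

The static assertions document the layout: 116 bytes in total, with ProtocolMajorVersion at offset 8.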

Registry storage of remembered mapped network drives

The “remembered” mapped network drives are stored in the registry under the key HKCU\Network. For each “drive letter” subkey under this key, the following information can be stored:

ConnectFlags: a REG_DWORD value containing a bit mask of values constructed from some of the CONNECT_* definitions in winnetwk.h; in particular CONNECT_REQUIRE_INTEGRITY, CONNECT_REQUIRE_PRIVACY and CONNECT_WRITE_THROUGH_SEMANTICS.

ConnectionType: a REG_DWORD value containing a value taken from the RESOURCETYPE_* definitions in winnetwk.h; in particular RESOURCETYPE_DISK.

DeferFlags: a REG_DWORD value indicating whether interaction with the user is needed to restore the connection (e.g. to obtain a password); the value 1 means that a password from the user is needed, a value of 2 means that a password from the user might be needed and a value of 4 means that default/stored credentials can be used (i.e. no need to ask the user for a password).

ProviderFlags: a REG_DWORD value representation of a Boolean value (0/1), indicating whether the RemotePath refers to a DFS root. If the RemotePath is not a DFS root, this value is normally omitted.

ProviderName: a REG_SZ value containing the provider name; in particular “Microsoft Windows Network”.

ProviderType: a REG_DWORD value containing a value taken from the WNNC_NET_* definitions in wnnc.h; in particular WNNC_NET_SMB.

RemotePath: a REG_SZ value containing the UNC path of the mapped network drive.

UseOptions: a REG_BINARY value containing a sequence of Tag/Length/Value elements. The only “Tag” that I have observed is “DefC”, the value of which is a FILE_REMOTE_PROTOCOL_INFORMATION structure.

UserName: when needed, a REG_SZ value containing the username; when not needed, a REG_DWORD value containing 0.
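To illustrate the Tag/Length/Value idea behind UseOptions, here is a minimal walker over such a buffer. The exact encoding is an assumption on my part: I assume a 4-byte ASCII tag, followed by a 4-byte little-endian length, followed by that many bytes of value. The function name find_use_option is hypothetical, and the sketch should be checked against a real registry export before being relied upon.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ASSUMED element layout (not confirmed): 4-byte ASCII tag, 4-byte
 * little-endian length, then `length` bytes of value.
 *
 * Searches buf for the element whose tag equals the first four characters
 * of `tag`. On success, stores a pointer to the value and its length and
 * returns true. Returns false if the tag is absent or an element is
 * truncated. */
bool find_use_option(const uint8_t *buf, size_t len, const char *tag,
                     const uint8_t **value, uint32_t *value_len)
{
    size_t pos = 0;

    while (len - pos >= 8) {                      /* room for tag + length */
        uint32_t n = (uint32_t)buf[pos + 4]
                   | (uint32_t)buf[pos + 5] << 8
                   | (uint32_t)buf[pos + 6] << 16
                   | (uint32_t)buf[pos + 7] << 24;

        if (n > len - pos - 8)
            return false;                         /* value runs past the end */

        if (memcmp(buf + pos, tag, 4) == 0) {
            *value = buf + pos + 8;
            *value_len = n;
            return true;
        }
        pos += 8 + (size_t)n;                     /* skip to the next element */
    }
    return false;
}
```

Under these assumptions, calling find_use_option with the tag "DefC" on the raw UseOptions bytes would yield a pointer to the stored FILE_REMOTE_PROTOCOL_INFORMATION data.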

Setting ProviderFlags as a partial workaround

Many reports can be found on the Internet that setting ProviderFlags to 1 for a remembered mapped network drive can help, and this appears to be true. When ProviderFlags is set to 1, indicating that the RemotePath refers to a DFS root, more DFS operations take place. The DFS driver initially rewrites the RemotePath, replacing the share name with “IPC$”, and then asks the SMB driver to connect to this path so that a FSCTL_DFS_GET_REFERRALS request can be sent to the server. The “remembered” SMB capabilities of the server are not made available to the SMB driver for this call, so the SMB driver performs a “multi-protocol negotiation”. The FSCTL_DFS_GET_REFERRALS request fails with STATUS_FS_DRIVER_REQUIRED (if the RemotePath is not a DFS root) and the SMB connection process continues; by now, however, the protocol has been negotiated (via multi-protocol negotiation) and the network drive is successfully mapped.
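The path rewrite performed by the DFS driver is easy to picture with a small user-mode sketch. This is not the driver's code; the function name and the server name "nas01" used in the note below are made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: rewrites \\server\share[\...] into \\server\IPC$,
 * mirroring the rewrite described above. Returns 0 on success and -1 if
 * remote_path is not a UNC path with a share name or if out is too small. */
int rewrite_to_ipc_share(const char *remote_path, char *out, size_t out_size)
{
    if (strncmp(remote_path, "\\\\", 2) != 0)
        return -1;                                /* not a UNC path */

    const char *server = remote_path + 2;
    const char *sep = strchr(server, '\\');       /* end of the server name */
    if (sep == NULL || sep == server || sep[1] == '\0')
        return -1;                                /* missing server or share */

    int server_len = (int)(sep - server);
    int n = snprintf(out, out_size, "\\\\%.*s\\IPC$", server_len, server);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

For example, \\nas01\media\films would become \\nas01\IPC$. Because the connection to this rewritten path does not benefit from the remembered capabilities, it starts with a multi-protocol negotiation.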

Most of the registry values for remembered mapped network drives are updated when used – except the ProviderFlags value: it is only created/set if the RemotePath is a DFS root. This allows misleading “workaround” information (ProviderFlags = 1) to persist in the registry.

Invisible references to an SMB server

Unfortunately, deleting all references to an SMB server from user mode is not guaranteed to remove all recollection of the server from mrxsmb.sys. Under these circumstances, attempts to unload mrxsmb.sys also fail/hang (such unload attempts normally succeed). The unload attempt gets stuck here:

nt!KeWaitForSingleObject+0x233
rdbss!RxSpinDownOutstandingAsynchronousRequests+0x9d
rdbss!RxUnregisterMinirdr+0x1fc
mrxsmb!MRxSmbInitUnwind+0x123
mrxsmb!MRxSmbUnload+0x4e
nt!IopLoadUnloadDriver+0xdc065

The WPP ETW provider for mrxsmb.sys allows the reference count for the SrvEntry for the server to be tracked, so it is possible (if difficult) to check whether “hanging” references to a SrvEntry are preventing the known workarounds from being effective.

Verifying whether this issue is active

One way of verifying whether this issue is active is to try the following:

Issue the command: logman start why -ets -p Microsoft-Windows-SMBClient Smb_Info -o why.etl

Try to access the mapped network drive.

Issue the command: logman stop why -ets

Issue the command: wevtutil qe /f:text /lf:true why.etl | findstr "SMB.send SMB.receive"

The final command will show selected items from the trace data; if there is a repeated sequence of “SMB send[0]: [NEGOTIAT]” items, then the client is repeatedly trying SMB2-Only negotiation, failing and retrying – this is the main characteristic of this problem.
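If the filtered output is saved to a file, counting the negotiate sends can be automated. The sketch below simply counts lines containing the marker quoted above; the function names are hypothetical, and I am assuming that the marker text appears verbatim in each relevant line, which may not hold for every Windows build.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if needle occurs within the first len bytes of line. */
static int line_contains(const char *line, size_t len, const char *needle)
{
    size_t n = strlen(needle);
    for (size_t i = 0; i + n <= len; ++i) {
        if (memcmp(line + i, needle, n) == 0)
            return 1;
    }
    return 0;
}

/* Hypothetical helper: counts the lines of text that contain the marker
 * "SMB send[0]: [NEGOTIAT]". ASSUMPTION: the marker appears verbatim. */
size_t count_negotiate_sends(const char *text)
{
    size_t count = 0;

    while (*text != '\0') {
        const char *eol = strchr(text, '\n');
        size_t len = eol ? (size_t)(eol - text) : strlen(text);

        if (line_contains(text, len, "SMB send[0]: [NEGOTIAT]"))
            ++count;

        text += len;                 /* move to the end of this line */
        if (*text == '\n')
            ++text;                  /* and step over the newline */
    }
    return count;
}
```

A count well above one for a single access attempt would be consistent with the retry pattern described above, although the output still needs to be read by eye to confirm that the sends are repeated failures and not separate connections.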

Prospects

This problem affects Windows Server 2003 (among other SMB servers) and there is no doubt in my mind that (parts of) Microsoft is fully aware of the problem, its causes and its potential remedies. There is, however, a dearth of authoritative information on this topic that is easy to find on the Internet.

The common workarounds for these problems include deleting and recreating shares when problems occur, setting ProviderFlags to 1 and deleting the UseOptions value for persistent shares (whenever it is (re-)created). Without an option to disable the “SMB2-Only” optimization, there are no ideal solutions.