In the System event log you may find an event similar to the following:

Event ID 1001

Source: Microsoft-Windows-WER-SystemErrorReporting

Description: The computer has rebooted from a bugcheck. The bugcheck was: 0x0000009e (0x0000000000000000, 0x0000000000000000, 0x0000000000000000, 0x0000000000000000). A dump was saved in: C:\Windows\MEMORY.DMP.

Let's start by discussing what a STOP 0x9e is. Failover Clustering actively monitors the health of many components at different layers of a server. One attribute of a highly available system is having health detection mechanisms in place to detect when something goes wrong and to react. Under some conditions, when an extreme failure occurs, the cluster service may intentionally bugcheck the server in an attempt to recover. The bugcheck will be a USER_MODE_HEALTH_MONITOR (9e), invoked by the Failover Cluster kernel mode driver NetFT.sys.

The first and most important thing to understand is that this is normal cluster health detection and recovery; it is intended recovery behavior. It is not a “bug” in clustering, nor is it a bug in NetFT.sys... it is a feature, not a flaw. I say this because the most common first troubleshooting step I see is customers applying the latest hotfix for NetFT.sys… and that won’t help.

By far the most common reason for a 0x9e is a failure of the health monitoring that Failover Clustering conducts between the NetFT kernel mode driver and the user mode service. If NetFT stops receiving heartbeats, user mode is considered to be non-responsive and clustering will bugcheck the box in an effort to force a recovery.

So the next question is: what caused user mode to become unresponsive? In general, you can troubleshoot this like any other user mode hang… you can set up perfmon and look for memory leaks, etc. The most valuable diagnostic tool is the dump captured when clustering bugchecks the box, which you can analyze to reach root cause. This will involve a call to Microsoft support to help debug the dump.

There are, however, a couple of different conditions which can invoke a bugcheck 0x9e. In this blog I will discuss the different parameters logged in the Event ID 1001 and what they mean.

Decoding STOP 0x0000009E

The bugcheck code will have the following format with the following parameters.

Stop 0x0000009E ( Parameter1 , Parameter2 , Parameter3 , Parameter4 )
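If you want to pull these values out of the Event ID 1001 description text programmatically, something like the following would work. This is only a minimal sketch: the function name `parse_bugcheck` and the regular expression are my own illustration, and it assumes the description contains the code followed by four hex parameters in parentheses, as in the event shown earlier.

```python
import re

# Matches text such as "0x0000009e (0x..., 0x..., 0x..., 0x...)".
# Whitespace around the parentheses and commas is optional.
HEX = r"(0x[0-9a-fA-F]+)"
BUGCHECK_RE = re.compile(
    HEX + r"\s*\(\s*" + HEX + r"\s*,\s*" + HEX + r"\s*,\s*" + HEX + r"\s*,\s*" + HEX + r"\s*\)"
)


def parse_bugcheck(description):
    """Return (code, [p1, p2, p3, p4]) as integers, or None if no match.

    Hypothetical helper for illustration; not part of any Windows tooling.
    """
    match = BUGCHECK_RE.search(description)
    if match is None:
        return None
    values = [int(group, 16) for group in match.groups()]
    return values[0], values[1:]


if __name__ == "__main__":
    text = (
        "The computer has rebooted from a bugcheck. The bugcheck was: "
        "0x0000009e (0x0000000000000000, 0x0000000000000000, "
        "0x0000000000000000, 0x0000000000000000). "
        "A dump was saved in: C:\\Windows\\MEMORY.DMP."
    )
    print(parse_bugcheck(text))
```

The helper returns plain integers, so the individual parameters can be compared or formatted however you like.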

Parameter1 value meaning:

Process that failed to satisfy a health check within the configured timeout

Parameter2 value meaning:

Hex value giving the timeout, in seconds, that was hit. This shows how long it took for the bugcheck to be invoked.
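Since Parameter2 is a hex count of seconds, converting it to something readable is simple arithmetic. Here is a small sketch; the function name `parameter2_to_timeout` is hypothetical, and the sample value 0x4B0 is made up for illustration (it happens to equal 1200 seconds, or 20 minutes) rather than taken from a real dump.

```python
from datetime import timedelta


def parameter2_to_timeout(parameter2):
    """Convert the Parameter2 hex string (a number of seconds) to a timedelta."""
    # int() with base 16 accepts an optional "0x" prefix.
    return timedelta(seconds=int(parameter2, 16))


if __name__ == "__main__":
    # Illustrative value only: 0x4B0 = 1200 seconds = 20 minutes.
    print(parameter2_to_timeout("0x00000000000004B0"))  # prints 0:20:00
```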

Parameter3 value meaning:

Value	Description
0x0000000000000000	The source of the reason for the bugcheck was not specified.
0x0000000000000001	The node has been bugchecked because the RHS process was attempting to gracefully close and did not complete successfully.
0x0000000000000002	The node has been bugchecked because a resource did not respond to a resource entry point call within the configured 'DeadlockTimeout' timeout. The node was configured to bugcheck by the 'DebugBreakOnDeadlock' registry key being set to a value of 3.
0x0000000000000003	The node has been bugchecked because of an unhandled exception with one of the cluster resources and, when attempting to recover, the RHS process did not terminate successfully within 20 minutes.
0x0000000000000004	The node has been bugchecked because of an unhandled exception with the Resource Hosting Subsystem (RHS) and, when attempting to recover, the RHS process did not terminate successfully within 20 minutes.
0x0000000000000005	The node has been bugchecked because a resource did not respond to a resource entry point call within the 'DeadlockTimeout' timeout (5 minutes by default) and an attempt was made to terminate the RHS process to recover. However, the RHS process did not terminate successfully within the timeout, which is four times the 'DeadlockTimeout' timeout (20 minutes by default).
0x0000000000000006	The node has been bugchecked because a resource type did not respond to a resource entry point call within the 'DeadlockTimeout' timeout and an attempt was made to terminate the RHS process to recover. However, the RHS process did not terminate successfully.
0x0000000000000007	The node has been bugchecked because of an unhandled exception with the Cluster Service (ClusSvc) and, when attempting to recover, the ClusSvc process did not terminate successfully within 20 minutes.
0x0000000000000008	The node has been bugchecked at the request of another node in the Failover Cluster.
0x0000000000000009	The node has been bugchecked because the cluster service detected that an internal subcomponent of the cluster service was unresponsive. The system was configured to bugcheck by the 'HangRecoveryAction' setting being set to a value of 4.
0x000000000000000A	The node has been bugchecked because the kernel mode NetFT driver did not receive a heartbeat from the user mode Cluster Service within the configured 'ClusSvcHangTimeout' timeout. The recovery action was configured to bugcheck by the 'HangRecoveryAction' cluster common property being set to a value of 3 (default) or 4.

Note: Parameter3 is a new value introduced in Windows Server 2012 R2 and will always be 0x0000000000000000 in previous releases.
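If you are triaging a number of these events, the Parameter3 table can be turned into a simple lookup. The sketch below is only an illustration: the names `PARAMETER3_REASONS` and `describe_parameter3` are hypothetical, and the strings are my shortened summaries of the table rows above, not official text.

```python
# Shortened summaries of the Parameter3 table; wording is illustrative.
PARAMETER3_REASONS = {
    0x0: "Source of the bugcheck was not specified",
    0x1: "RHS process did not complete a graceful close",
    0x2: "Resource missed 'DeadlockTimeout'; 'DebugBreakOnDeadlock' set to 3",
    0x3: "Unhandled exception in a cluster resource; RHS did not terminate within 20 minutes",
    0x4: "Unhandled exception in RHS; RHS did not terminate within 20 minutes",
    0x5: "Resource missed 'DeadlockTimeout'; RHS did not terminate within four times that timeout",
    0x6: "Resource type missed 'DeadlockTimeout'; RHS did not terminate",
    0x7: "Unhandled exception in ClusSvc; ClusSvc did not terminate within 20 minutes",
    0x8: "Bugcheck requested by another node in the cluster",
    0x9: "Cluster service subcomponent unresponsive; 'HangRecoveryAction' set to 4",
    0xA: "NetFT got no heartbeat from ClusSvc within 'ClusSvcHangTimeout'; 'HangRecoveryAction' set to 3 or 4",
}


def describe_parameter3(value):
    """Return a short reason for an integer Parameter3 value."""
    return PARAMETER3_REASONS.get(
        value, f"Unrecognized Parameter3 value 0x{value:X}"
    )


if __name__ == "__main__":
    print(describe_parameter3(0xA))
```

Keep in mind that on releases before Windows Server 2012 R2 the value is always zero, so a result of "not specified" there tells you nothing about the cause.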

Parameter4 value meaning:

Currently unused and reserved for future use; it will always be 0x0000000000000000.

Thanks!
Elden Christensen
Principal Program Manager Lead
Clustering & High-Availability
Microsoft
