Monitoring Configuration Manager, a system that is itself used to manage other systems, has its dark side. Of course, the administrator has insight into many aspects of its operation through the Monitoring section of the administrative console, but an outside view from a monitoring system can also be useful.
In practice, it turns out that the monitoring system persistently flags components as being in an error state even though, having logged an error, they have since returned to normal operation. This gives the impression that some of the monitoring is quietly done with rules instead of monitors.
Closer inspection shows that these are in fact monitors, but built in a very particular way. For example, one of the monitors, whose configuration is shown below, should close its alert when it returns to a healthy state, that is, once the cause of the error has gone away:
Unfortunately, the Health tab shows that the monitor can enter the "bad" (Critical) state automatically, but we cannot count on it to heal itself, because the only way back to healthy is "Manual Reset":
So we are indeed dealing with a monitor that in practice behaves more like a rule. This is not a desirable situation. Configuration Manager resets its error counters at midnight, so a temporary loss of communication with some system component that triggers a log entry is "forgiven" after a day at most. Operations Manager, however, holds this state for many days after the incident is over, and so reports a false state.
Are there many such monitors in the Management Pack for Configuration Manager? We can check with the following command:
PS C:\> (Get-SCOMManagementPack -Name "*ConfigurationManager.Monitoring" | Get-SCOMMonitor | Where {$_.OperationalStateCollection -like "*ManualReset*"}).count
16
So it turns out that there are as many as 16 such monitors that behave like rules. Interestingly, their descriptions often use the phrase "This rule":
DisplayName | Description |
Component manager fails to access site system | This rules generates alert if the compoenent manager on site server cannot access site system. |
WSUS version mismatch | This rule generates alert if the WSUS server version is not the required version |
File Dispatch Manager Connection Monitor | This monitor checks that the file dispatch manager can connect to and the site server. |
Site component manager fails to update Active Directory objects | This rule generates alerts if the site component manager fails to update objects in Active Directory. |
Sender fails to connect to a remote site over LAN advanced security | The rule generates alert when a sender fails to connect to a remote site over the LAN under advanced security. |
Fail to execute system summary task | This rule generates alert if system summary task fails |
Management Point WINS unregistration monitor | This monitor checks if the management point successfully unregisters with the local WINS server |
Distribution manager fails to access network | This rule generates alert if the distribution manager on site server fails to access network abstraction layer |
Site component manager fails to read Active Directory objects | This rule generates alerts if the site component manager fails to read objects in Active Directory. |
Fail to configure proxy setting on WSUS server | This rule generates alert if the WSUS control manager fails to configure proxy setting on WSUS server |
Site server fails to execute a maintenance task | This rule generates alert if site server fails to execute a maintenance task. |
State Migration Point HTTP Response Monitor | This monitor checks if the state migration responds to HTTP requests, using the SMP_CONTROL_MANAGER. |
This rule generates alert when the WSUS configuration manager fails to publish client to the WSUS server | This rule generates an alert when the WSUS configuration manager fails to publish client to the WSUS server. |
Auto-started component stopped unexpectedly | This rule generates alert if the SMSExec detects an auto-started component is stopped unexpectedly. |
Management Point WINS registration monitor | This monitor checks if the management point successfully registers with the local WINS server |
Fail to subscribe to or get update categories and classification | This rule generates alerts if the WSUS configuration manager failed to subscribe to or get update categories and classification on a WSUS server. |
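A listing like the one above can be produced by extending the same pipeline to select the two columns instead of counting the results. This is only a sketch: it assumes the OperationsManager module is loaded and the session is already connected to a management group, and the `Format-Table` options are a presentation choice, not something the original command used.

```powershell
# Assumption: the OperationsManager module is imported and a
# management group connection is already established.
Get-SCOMManagementPack -Name "*ConfigurationManager.Monitoring" |
    Get-SCOMMonitor |
    # Keep only monitors whose operational states include a manual reset
    Where-Object { $_.OperationalStateCollection -like "*ManualReset*" } |
    # Show the same two columns as in the table
    Select-Object DisplayName, Description |
    Format-Table -Wrap -AutoSize
```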
If you find this "rule-like" behavior of some monitors hard to accept, you can disable them with an override, or wait for the next article, in which I will describe how to change the behavior of these monitors without giving up monitoring of the events they catch.
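For the override route, a minimal sketch might look like the following. It assumes an unsealed management pack for storing overrides already exists; the display name "ConfigMgr Overrides" is hypothetical and should be replaced with your own. Each monitor is disabled for the class it targets, which may be broader than you want, so it is worth trying this in a test management group first.

```powershell
# Assumption: "ConfigMgr Overrides" is a hypothetical, already existing,
# unsealed management pack in which the overrides will be stored.
$overrideMp = Get-SCOMManagementPack -DisplayName "ConfigMgr Overrides" |
    Where-Object { -not $_.Sealed }

if (-not $overrideMp) {
    throw "No unsealed override management pack found."
}

# The same selection as before: monitors that only recover by manual reset
$monitors = Get-SCOMManagementPack -Name "*ConfigurationManager.Monitoring" |
    Get-SCOMMonitor |
    Where-Object { $_.OperationalStateCollection -like "*ManualReset*" }

foreach ($monitor in $monitors) {
    # Scope the override to the class the monitor targets
    $class = Get-SCOMClass -Id $monitor.Target.Id
    Disable-SCOMMonitor -Class $class -ManagementPack $overrideMp -Monitor $monitor
}
```

Disabling a monitor this way also stops alerting on the underlying events, which is the trade-off mentioned above.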