[SCOrch] Automatically Reset Unhealthy Unit Monitors (when alert closed in error by a human)

Here is an automated resolution for a common issue that I think has a place in many System Center 2012 Operations Manager (OpsMgr) environments. I put it together as a sample on a customer site some time ago and thought it worth sharing.

The Issue: A user (OpsMgr Admin, Operator, etc.) closes an alert associated with a monitor. Monitors are state-aware and are almost always configured to automatically close the alert when the error condition is resolved and the monitored entity returns to a healthy state. The problem comes when an alert generated by a monitor is closed without the error condition itself being resolved. You now have a blind spot in your monitoring deployment: a monitor in an unhealthy state that will not raise another alert to warn you of the underlying error condition until it returns to a healthy state and then becomes unhealthy again. That may never happen without human intervention, and intervention becomes unlikely once the alert is no longer present as a reminder of the error condition.
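To make the blind spot concrete, here is a minimal toy model in Python. It is not OpsMgr code: the UnitMonitor class and its methods are hypothetical, and it assumes the simplest case, where an alert is raised only when the monitor changes from healthy to unhealthy.

```python
class UnitMonitor:
    """Toy model: an alert is raised only on a healthy -> unhealthy transition."""

    def __init__(self):
        self.healthy = True
        self.open_alert = False

    def observe(self, condition_ok):
        """Feed one health sample; return True if a new alert was raised."""
        raised = False
        if self.healthy and not condition_ok:
            # State change to unhealthy: this is the only point an alert is raised
            self.open_alert = True
            raised = True
        elif not self.healthy and condition_ok:
            # Back to healthy: the alert is closed automatically
            self.open_alert = False
        self.healthy = condition_ok
        return raised

    def close_alert_manually(self):
        """A human closes the alert; the monitor state stays unhealthy."""
        self.open_alert = False

    def reset(self):
        """Equivalent of resetting the monitor to healthy."""
        self.healthy = True


monitor = UnitMonitor()
print(monitor.observe(False))   # condition fails, alert raised
monitor.close_alert_manually()
print(monitor.observe(False))   # still failing, but no new alert: the blind spot
monitor.reset()
print(monitor.observe(False))   # after a reset, the failure raises an alert again
```

In this model the second sample raises nothing because the state never changed, which is exactly why resetting the monitor is the fix.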

Automated Resolution: The fix itself is easy: reset the unit monitor to healthy. Detecting the situation by hand is operationally unlikely, though, so automation is the only tenable solution. Using a runbook in System Center 2012 Orchestrator (SCOrch), you can monitor for closed alerts generated by a unit monitor that were not closed by OpsMgr itself.

The runbook looks like this:


The runbook performs the following actions:

1. Monitors for alerts entering the Closed resolution state where:

    • IsMonitorAlert equals True
    • LastModifiedBy does not equal System*

*This last criterion helps reduce processing overhead by ignoring alerts closed by OpsMgr itself, as these are presumably correct. At minimum, it will save a lot of PowerShell instantiations on your runbook server, even if your script checks whether the monitor needs to be reset before resetting it.

Runbook Configuration

The third parameter (LastModifiedBy) is important to ensure we don’t try to reset the monitor behind every closed monitor-based alert, just those closed by something other than “System”. LastModifiedBy = System would generally indicate closure by OpsMgr or other programmatic means (for which we assume appropriate precautions are taken).

In short, we want to reset unit monitors for which the alert was likely closed by a human.
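The three filter conditions can be read as a single predicate. The Monitor Alert activity is configured in the Runbook Designer rather than in code, so the Python sketch below is only an illustration: the Alert class and should_trigger function are hypothetical names, and the comparison with "System" is assumed to be an exact, case-sensitive match. The value 255 is the built-in Closed resolution state in OpsMgr.

```python
from dataclasses import dataclass

CLOSED = 255  # built-in "Closed" resolution state in OpsMgr


@dataclass
class Alert:
    resolution_state: int
    is_monitor_alert: bool
    last_modified_by: str


def should_trigger(alert):
    """Return True when the runbook should go on to reset the monitor."""
    return (
        alert.resolution_state == CLOSED        # the alert has just been closed
        and alert.is_monitor_alert              # it was raised by a monitor, not a rule
        and alert.last_modified_by != "System"  # and was not closed by OpsMgr itself
    )


# A monitor alert closed by an operator triggers the runbook
print(should_trigger(Alert(CLOSED, True, r"CONTOSO\jsmith")))
# A monitor alert closed by OpsMgr does not
print(should_trigger(Alert(CLOSED, True, "System")))
```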

This doesn’t cover every possible situation where something is closing alerts that it shouldn’t, but it will catch what is by far the most common cause: inappropriate human intervention. You’ll want to review the LastModifiedBy property on other automatically closed alerts just to be sure, but it worked well in my tests.


Sample Script (for the Run .Net Script activity)

This is the sample script to paste into the Run .Net Script activity. Replace the following values with your own:

  • ms1.contoso.com – This is the name of one of your OpsMgr management servers
  • {ID from “Monitor Alert”} – This is Orchestrator published data that passes in the ID of the alert from the Monitor Alert activity

#---------------- Begin Sample Script ----------------#

# Import the Operations Manager module and create a connection
Import-Module OperationsManager
New-SCOMManagementGroupConnection -ComputerName "ms1.contoso.com"

# Retrieve the alert matching the ID passed in by Orchestrator
# (the placeholder below is replaced with published data before the script runs)
$AlertId = "{ID from “Monitor Alert”}"
$alerts = Get-SCOMAlert | where { $_.Id -eq $AlertId }

# Loop through the trigger alerts (should really only be one)
foreach ($alert in $alerts)
{
    # Retrieve IDs of the monitor, target class and instance
    $MonitorID = $alert.MonitoringRuleId
    $TargetClassID = $alert.MonitoringClassId
    $ObjectID = $alert.MonitoringObjectId

    # Retrieve the monitor, target class and instance
    $monitor = Get-SCOMMonitor | where { $_.Id -eq $MonitorID }
    $monitoringclass = Get-SCOMClass | where { $_.Id -eq $TargetClassID }
    $monitoringobject = Get-SCOMMonitoringObject -Class $monitoringclass | where { $_.Id -eq $ObjectID }

    # Reset the monitor
    $monitoringobject | foreach { $_.ResetMonitoringState($monitor) }
}

#---------------- End Sample Script ----------------#

After you get the Monitor Alert activity in your runbook connected to your OpsMgr environment, check the runbook in and start it up to begin monitoring and automatically correcting this issue.

Additional Resources

On a related note, here are a few sample scripts for the OpsMgr 2012 Command Shell

OpsMgr 2012: See Who is Currently Connected to Your Management Group via PowerShell

Cloning Notification Subscriptions in Operations Manager 2012 using PowerShell [sample script]

OpsMgr 2012: All my Java apps require manual discovery – where do I get BeanSpy and PowerShell install scripts?

OpsMgr 2012: Reset Unit Monitors in Bulk with PowerShell

OpsMgr 2012: Find Computers without the Active Directory Helper Object (OOMADS) with PowerShell

OpsMgr 2012: Disabling Rules and Monitors in Bulk in PowerShell

OpsMgr 2012: Group Maintenance Mode via PowerShell (the way it should be)

OpsMgr 2012: Running a Task in Bulk Using PowerShell

OpsMgr 2012: Automating Agent Discovery and Deployment with PowerShell [sample script]

OpsMgr 2012: Identifying Computers in Active Directory without an OpsMgr Agent Installed [sample script]

OpsMgr 2012 Quick Tip: Finding servers experiencing the most heartbeat failures with PowerShell

10 thoughts on “[SCOrch] Automatically Reset Unhealthy Unit Monitors (when alert closed in error by a human)”

  1. Pingback: SystemCenterCentral: Automatically Reset Unhealthy Unit Monitors (when alert closed in error by a human) | Cloud Administrator

  2. Erik Andersen

    Kevin: I got the same error..

    I created the runbook, added a email feature, started it and tried closing a warning.

    This starts an endless loop of instances on the runbook job, all on the same alert.

    Has anybody fixed that?

    Either it updates the alert when resetting the health monitor thus resulting in another instance called

    or it has a loop in the monitoring feature of the IP, where it forgets to set some “read”-variable..

  3. Erik Andersen

    Okay, I think I fixed it, but it is in no way pretty.

    In the Monitor Alert, I set the LastModifiedBy clause to be “Does not match pattern” entering ^(System|Auto-resolve|DOMAIN\\Orchestratoruser)$ (replace DOMAIN and Orchestratoruser with whatever you need).
    When everything has run, I added Update Alert and close it (yep.. close a closed alert..), just so the LastModifiedBy gets set to the Orchestrator user.

    It at least stopped the loop for me, though I can’t quickly see who closed it anymore.
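The exclusion pattern above can be sanity-checked offline. The sketch below uses Python's re module rather than Orchestrator's own pattern matching, so treat it as an approximation only. CONTOSO\svc-orch is a hypothetical account name standing in for DOMAIN\Orchestratoruser, and closed_by_human is a hypothetical helper.

```python
import re

# Same shape as the pattern in the comment, with a hypothetical service account.
# The doubled backslash matches the single literal backslash in DOMAIN\user.
EXCLUDED = re.compile(r"^(System|Auto-resolve|CONTOSO\\svc-orch)$")


def closed_by_human(last_modified_by):
    """True when LastModifiedBy does NOT match the exclusion pattern."""
    return EXCLUDED.match(last_modified_by) is None


for name in ["System", "Auto-resolve", r"CONTOSO\svc-orch", r"CONTOSO\jsmith"]:
    print(name, closed_by_human(name))
```

The ^ and $ anchors matter here: without them, any account name that merely contains "System" would also be excluded.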

  4. Arthur Silvany


    There is a case where the alert is closed by “maintenance mode” or “auto-resolve”, and the monitor is then reset automatically. Is it possible to add more conditions in the Monitor Alert properties?



  5. Jayson

    So this is where I get confused.

    $AlertId=”{ID from “Monitor Alert”}”

    Won’t this change for every single alert? How can I input a specific ID if that will be different for each and every monitor within SCOM?

    I’ve tried using specific IDs. However, when I try to run the runbook it errors out on me and all it states is “Monitor failed”.

  6. Josh Anderson

    So in addition to changes listed by Erik:

    LastModifiedBy clause to be “Does not match pattern” entering ^(System|Auto-resolve|DOMAIN\\Orchestratoruser)$


    Adding an Update Alert step (in mine I added a custom field instead to stop the looping)

    I also recommend changing the line in the script from this:

    $alerts = Get-SCOMAlert | where {$_.id -eq $AlertId}

    To this:

    $alerts = Get-SCOMAlert -Criteria "Id = '$AlertId'"

    The -Criteria parameter is much more efficient (takes up fewer resources and runs a lot faster). This is similar to using -Filter for the Get-WMIObject cmdlet.

  7. Josh Anderson

    The comment stripped out the escape characters on me. For the modified Get-SCOMAlert, the syntax surrounding the $AlertId variable is surrounded by escaped single quotes.

    Backtick Single Quote $AlertID Backtick Single Quote

    I hope that is clear enough.


  8. Andy

    Hello, the easier way is to work with the Monitor Alert and Update Alert activities from the SCOM IP, so you need no PowerShell script. Between the two activities you can filter on MonitorObjectHealthState on the link, so the Update Alert activity starts only when the HealthState is Critical or Warning.

    (SCOM 2012 R2)
