Event Rules with Flooding Prevention

Index:

  • Important notice
  • Preface
  • Functionality

Using this Management Pack

  • Creating Rules in the Management Pack
  • Functionality Check
  • Questions and Answers

Technical Details

 Important notice

I will not take responsibility for any damage the provided software might cause.

Preface

The idea for Flooding Prevention for Event Rules came from the fact that our company collects a lot of events for its in-house applications. At times, thousands of events were written to the NT Event Log and collected by SCOM within a few seconds, which caused the Root Management Server (RMS) to crash.

To prevent this, I first used a Repeated Event Log Monitor that turns unhealthy when too many events are written and potentially collected.

I also added a recovery task that created an override to disable the corresponding rule. Once the monitor was healthy again, a second recovery task deleted that override.

However, we couldn’t use this in a production environment because the recovery task would need access to the SDK. That is possible, of course, but it’s not the way we wanted it to work.

Conclusion: that version was impractical. So a new version was developed that performs all SDK tasks on the Root Management Server.

Basically, it watches the states of the Repeated Event Log Monitors that run on the machines. When one turns unhealthy, an override is created to disable the corresponding rule. When it turns healthy again through a manual reset, the override is deleted.

Things are a bit different in our environment so I had to work out a new version, the one you will see here, which is a lot more self-aware and user-friendly.

 

Functionality

There are two Management Packs: EventRules.With.FloodingPrevention.Library, which contains the required classes, discoveries and rules, and EventRules.With.FloodingPrevention.Rules, in which you create the rules that are going to be monitored.

One discovery runs on the RMS. It reads all Event Rules from the Rules Management Pack and discovers them as instances of the EventRuleWithFloodingPrevention class.

Additionally, a Repeated Event Log Monitor is created for each rule, using the same event criteria as the rule.

One rule runs on the RMS and checks the states of these monitors.

If a monitor is in an unhealthy state, an override is created that disables the corresponding rule.

If the monitor turns healthy again, the override is deleted.

Using this Management Pack

Management Pack Configuration

There are three parameters you may want to override.

Repeat Count Threshold: How many times a rule must collect an event within 60 seconds before it gets disabled.

Default: 5 (chosen for demonstration purposes)

Monitor Creation Frequency: How often the Management Pack is checked for new rules.

Default: 300

Recommended: >60

Flooding Prevention Frequency: How often the monitor states will be checked and overrides will be created.

Default: 60

Recommended: >30
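To make the three parameters and their recommended ranges easier to reason about, here is a minimal sketch in Python. It is not part of the Management Pack: the class and field names are hypothetical, only the defaults and the recommended lower bounds come from the list above, and the check that the threshold is at least 1 is my own assumption.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class FloodingPreventionSettings:
    """Hypothetical container for the three overridable parameters."""

    repeat_count_threshold: int = 5          # events within 60 seconds before the rule is disabled
    monitor_creation_frequency: int = 300    # how often the Rules MP is checked for new rules
    flooding_prevention_frequency: int = 60  # how often monitor states are checked

    def warnings(self) -> list[str]:
        """Return one message for every value outside the recommended range."""
        problems = []
        if self.repeat_count_threshold < 1:
            # Assumption: a threshold below 1 makes no sense for a repeat counter.
            problems.append("Repeat Count Threshold should be at least 1")
        if self.monitor_creation_frequency <= 60:
            problems.append("Monitor Creation Frequency should be greater than 60")
        if self.flooding_prevention_frequency <= 30:
            problems.append("Flooding Prevention Frequency should be greater than 30")
        return problems
```

Calling `FloodingPreventionSettings().warnings()` on the defaults returns an empty list, since 5, 300 and 60 all sit inside the recommended ranges.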

Creating overrides in the Authoring Console is not very user-friendly. Therefore, import both Management Packs (the sealed Library MP and the unsealed Rules MP) into your environment, preferably a lab environment, and create the overrides in the Operations Console.

 Adjusting Repeat Count Threshold and Monitor Creation Frequency:

In the Operations Console, open the Authoring Pane\Management Pack Objects\Object Discoveries and search for: Rule Discovery

 Right-Click it\Overrides\Override the Object Discovery\For all objects of class: Root Management Server

Adjust the Interval and the MonitorRepeatCountThreshold.

Select destination management pack: Event Rules with Flooding Prevention Rules

 Hit Apply – OK.

Adjusting the Flooding Prevention Frequency:

In the Operations Console, open the Authoring Pane\Management Pack Objects\Rules and search for: Handle Overrides

Right-Click it\Overrides\Override the Rule\For all objects of class: Root Management Server

 Adjust the Interval Seconds.

Select destination management pack: Event Rules with Flooding Prevention Rules

 Hit Apply – OK.

 Creating Rules in the Management Pack

Basically, all you have to do is create an Event Collection rule and select Windows Computer, or anything hosted by it, as the target. Here’s a short walkthrough anyway.

 Import the Management Pack and in the Operations Console, go to the Authoring Pane.

Go to Management Pack Objects – right-click Rules and select “Create a new rule…”

 Open Collection Rules\Event Based and select NT Event Log. (Also works with Alert Generating Rules but there’s no real purpose for that because of the Alert Suppression functionality.)

Select destination Management Pack: Event Rules with Flooding Prevention Rules.

Click Next.

 

Give the rule a name (names with special characters have been tested and work fine). Set the Category and select a Rule Target.

IMPORTANT: The rule’s target has to be something that is hosted by Windows Computer.

 

Click Next. The remaining steps need no further explanation: configure the Event Log Type, build the Event Expression and click Create.

Functionality Check

Once the Discovery runs again, it’ll create new Instances of the Flooding Prevention Class.

To verify this, in the Operations Console open the Monitoring Pane, open the folder “Event Rules with Flooding Prevention” and open the States view.

You should now see a discovered instance for all “Windows Operating System” Computers.

Next, we create a few Events to get the Monitor to turn into a Warning state.

Launch the Command prompt from a machine where the Monitor runs and create new events that match the criteria you configured for your rule. Since I use a RepeatCountThreshold of 5, I create the Event 5 times.

In my case the command looks like this:

eventcreate /T Information /ID 514 /L Application /D "Test"
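If you would rather not type the command five times, the repetition can be scripted. The following is a small Python sketch, not something shipped with the Management Pack; the function names are hypothetical, and the only real command used is eventcreate with the /T, /ID, /L and /D switches shown above. By default it only builds the commands; pass run=True on a Windows machine to actually execute them.

```python
from __future__ import annotations

import subprocess


def build_eventcreate_command(event_id: int,
                              event_type: str = "Information",
                              log_name: str = "Application",
                              description: str = "Test") -> list[str]:
    """Build the argument list for a single eventcreate call."""
    return ["eventcreate", "/T", event_type, "/ID", str(event_id),
            "/L", log_name, "/D", description]


def create_test_events(event_id: int, count: int, run: bool = False) -> list[list[str]]:
    """Build `count` identical eventcreate commands and optionally run them."""
    commands = [build_eventcreate_command(event_id) for _ in range(count)]
    if run:
        for command in commands:
            # check=True raises CalledProcessError if eventcreate reports a failure.
            subprocess.run(command, check=True)
    return commands
```

With a RepeatCountThreshold of 5, `create_test_events(514, 5, run=True)` would write the five events needed to trip the monitor in my example.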

After a few seconds, the monitor should change to the Warning state and raise an alert, which you can find in the Alerts view.
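Why five events within a short time trip the monitor can be illustrated with a simple counter over a 60-second window. This is only a simplified Python model of the idea, assuming events arrive in chronological order; the class name is hypothetical, and the real Repeated Event Log Monitor in SCOM may count and reset differently.

```python
from collections import deque


class RepeatedEventDetector:
    """Simplified model: report flooding once `threshold` events fall within the window."""

    def __init__(self, threshold: int = 5, window_seconds: float = 60.0):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record(self, timestamp: float) -> bool:
        """Record one matching event; return True if the threshold is reached."""
        self._timestamps.append(timestamp)
        # Forget events that are older than the window (assumes increasing timestamps).
        while timestamp - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps) >= self.threshold
```

Five events one second apart reach the threshold on the fifth call, while five events spread 20 seconds apart never have more than three inside the window.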

Since the creation of the Override is done by a separate rule, it may take a while (depending on the Interval you chose for the HandleOverrides rule) until the Override is created.

Go to the Authoring pane\Management Pack Objects\Overrides and search for the rule name (in my case Dust 514).

You should find 2 overrides, one for the monitor and one for the rule (we are interested in the rule override).

And, as you can see, the Override was created specifically for the machine that caused the error.

Let’s check if it resolves properly by going to the State View again and resetting the state of the Monitor.

The next time the HandleOverrides rule runs, the Override will be deleted again.

Verify this by going to the Authoring pane and searching for the rule again. The override should be gone.

Questions and Answers

What if the rule criteria change?

The script checks whether the criteria have changed for existing monitors, so if you change the rule criteria, the monitor is updated as well.

The scripts create a lot of objects automatically, so what if a rule is gone?

A recycling job runs during the discovery and gets rid of unused groups, monitors and overrides.

Doesn’t this cause a lot of load on the RMS?

I couldn’t see much change in performance on the RMS; besides, the SDK is very fast.

I have tested this with a few rules in our production environment targeting 250+ computers and the scripts took about 10 seconds.

What takes some time is the SDK connection setup.

Technical Details

Terminology:

PartialMonitoringObject: A discovered class instance, such as a Windows Computer or a Logical Disk, that has a health state which is changed by the monitor that targets it.

 Discovery – FloodingPreventionV3.Script.CreateMonitors.PS1

Checking Rules:

First, the script iterates through all rules and checks whether the data source is either Microsoft.Windows.EventCollector or Microsoft.Windows.EventProvider, using string.contains("Microsoft.Windows.Event").
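The filter can be pictured as a plain substring test on the data source type of each rule. The sketch below is a Python model, not the PowerShell from the discovery script; the Rule class, its field names and the non-event data source ID in the usage note are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical stand-in for a rule read from the Rules Management Pack."""

    name: str
    data_source_type_id: str


def is_event_rule(rule: Rule) -> bool:
    """True if the data source ID contains the shared prefix of both event modules."""
    return "Microsoft.Windows.Event" in rule.data_source_type_id


def select_event_rules(rules: list[Rule]) -> list[Rule]:
    """Keep only the rules that pass the substring check."""
    return [rule for rule in rules if is_event_rule(rule)]
```

A rule with the data source Microsoft.Windows.EventProvider passes, while one with a made-up ID such as Example.Performance.DataSource does not. Note that a substring test also accepts any other module whose ID happens to contain Microsoft.Windows.Event; comparing against the two exact IDs would be stricter.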

For each rule that passes this check, 3 sub-functions will be called:

#1 Discover-ClassInstances

In this function, we assume that the rule’s target is, or is hosted by, Windows Computer.

We then get all PartialMonitoringObjects of the rule’s target class and create a new class instance of type EventRules.With.FloodingPrevention.Class.EventRuleWithFloodingPrevention for each of them.

Class Properties:

DisplayName: The Display Name of the rule.

PrincipalName: The name of the computer hosting the current PartialMonitoringObject.

MonitoringRuleID: The GUID of the rule. [This is the key property for the class]

MonitoringRuleName: The XML ID of the rule.

Note: Only one instance per computer will and can be discovered. If there are no instances found for the rule’s target class, the rule will be skipped.
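The shape of the discovered instances, including the one-instance-per-computer rule, can be modelled as follows. This is a Python sketch under assumptions: the function and class names are hypothetical, the host names are passed in as plain strings instead of being read from PartialMonitoringObjects, and comparing host names case-insensitively is my own choice.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class FloodingPreventionInstance:
    """Mirrors the four class properties described above."""

    display_name: str          # DisplayName
    principal_name: str        # PrincipalName
    monitoring_rule_id: str    # MonitoringRuleID (key property)
    monitoring_rule_name: str  # MonitoringRuleName


def discover_class_instances(rule_guid: str, rule_xml_id: str, rule_display_name: str,
                             hosting_computers: list[str]) -> list[FloodingPreventionInstance]:
    """Create at most one instance per hosting computer for the given rule."""
    seen = set()
    instances = []
    for computer in hosting_computers:
        key = computer.lower()  # assumption: host names are compared case-insensitively
        if key in seen:
            continue  # a second object on the same computer does not add an instance
        seen.add(key)
        instances.append(FloodingPreventionInstance(
            rule_display_name, computer, rule_guid, rule_xml_id))
    return instances
```

An empty result corresponds to the case in the note above where the rule is skipped because its target class has no instances.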

#2 Create-FloodingPreventionMonitor

Checks whether the monitor has already been created. If it hasn’t, a new one is created using the event criteria from the rule.

If a monitor for this rule already exists, the function checks whether the event criteria have changed. If so, the monitor’s event criteria are updated.

Information on the Monitor:

Target: The FloodingPrevention class (explained in #1).

Enabled: false; we’ll enable it in #3.

Configuration: As mentioned already, it uses the event criteria from the rule. However, the schema is a bit different on the monitor, so the criteria are run through a translation sub-function to get a proper configuration for the monitor. The repeat count threshold is taken from $monitorRepeatCountThreshold, which can be configured on the Discovery.
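The create-or-update behaviour of this step can be sketched as an idempotent function. The Python below is a model only: monitors are kept in a plain dictionary keyed by rule ID, all names are hypothetical, and translate_criteria is a placeholder because the real schema translation is not shown in this article.

```python
def translate_criteria(rule_criteria: dict) -> dict:
    """Placeholder for the schema translation; here it only copies the criteria."""
    return dict(rule_criteria)


def ensure_monitor(monitors: dict, rule_id: str, rule_criteria: dict,
                   repeat_count_threshold: int) -> str:
    """Create the monitor if it is missing, update it if the rule criteria changed."""
    translated = translate_criteria(rule_criteria)
    existing = monitors.get(rule_id)
    if existing is None:
        monitors[rule_id] = {
            "enabled": False,  # enabled later through a group override
            "criteria": translated,
            "repeat_count_threshold": repeat_count_threshold,
        }
        return "created"
    if existing["criteria"] != translated:
        existing["criteria"] = translated
        return "updated"
    return "unchanged"
```

Running the function twice with the same criteria returns "created" and then "unchanged", which is why the discovery can call it on every run without producing duplicate monitors.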

#3 Create-Group

Creates a new group for each rule, containing the class instances whose MonitoringRuleID matches the current rule’s GUID.

Target: The FloodingPrevention class.

Membership: The MonitoringRuleID property has to match the rule’s GUID.

Additionally, a new Override will be created that enables the current rule’s monitor for this group.
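Group membership and the enabling override can be modelled in a few lines. This Python sketch uses plain dictionaries instead of SDK objects; the function name and the dictionary keys other than MonitoringRuleID and PrincipalName are hypothetical.

```python
from __future__ import annotations


def create_group(rule_guid: str, instances: list[dict]) -> dict:
    """Collect the instances of one rule and describe the override that enables its monitor."""
    members = [inst for inst in instances if inst["MonitoringRuleID"] == rule_guid]
    return {
        "rule_guid": rule_guid,
        "members": members,
        # The monitor is created disabled; this override enables it for the group only.
        "monitor_override": {"context": "group", "property": "Enabled", "value": True},
    }
```

Because membership is derived purely from the MonitoringRuleID property, instances that belong to other rules never end up in the group, and the monitor stays disabled for them.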

Recycling Rules:

The recycling process runs every time the discovery runs, provided no changes have been made to the Management Pack so far in that run.

Basically, it checks the member count of the rule’s group. If the member count is 0, all associated objects in the Management Pack are removed by 3 sub-functions.

#1 Delete-Override

Deletes the override for the monitor. If there are still overrides for the rule, they are removed as well (normally they are removed automatically).

#2 Delete-Group

Deletes the group as well as its discovery.

#3 Delete-FloodingPreventionMonitor

Deletes the Monitor.
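The recycling pass can be summarised as: find groups with no members, then remove the override, the group and the monitor in that order. The Python below is a model of that sequence, assuming the Management Pack content is represented as three dictionaries keyed by rule ID; the function name and data layout are hypothetical.

```python
from __future__ import annotations


def recycle(management_pack: dict) -> list[str]:
    """Remove all objects that belong to rules whose group has no members left."""
    removed = []
    # Iterate over a copy because groups are deleted inside the loop.
    for rule_id, members in list(management_pack["groups"].items()):
        if len(members) > 0:
            continue
        management_pack["overrides"].pop(rule_id, None)  # 1. Delete-Override
        del management_pack["groups"][rule_id]           # 2. Delete-Group
        management_pack["monitors"].pop(rule_id, None)   # 3. Delete-FloodingPreventionMonitor
        removed.append(rule_id)
    return removed
```

Rules whose group still has at least one member are left untouched, so a pass over a healthy Management Pack changes nothing.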

Flooding Prevention – FloodingPrevention.Script.HandleOverrides.ps1

Check Health States:

The script gets all PartialMonitoringObjects of the Flooding Prevention class and checks their state.

# Warning State

Calls the Create-Override function which checks if there is already an override created for this instance. If not, an override will be created that disables the corresponding rule for the server which collected too many events.

# Healthy State

Calls the Delete-Override function which checks if there is an override created for this instance. If yes, the override will be deleted.
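Taken together, the two branches form a small reconciliation loop: create the override when an instance is in Warning and none exists, delete it when the instance is Healthy and one exists. The following Python sketch models that loop with a set of (rule ID, computer) pairs standing in for the overrides; the function name and data layout are hypothetical and no SDK calls are made.

```python
from __future__ import annotations


def handle_overrides(instance_states: dict, overrides: set) -> list[tuple]:
    """Bring the rule-disabling overrides in line with the current monitor states."""
    actions = []
    for key, state in instance_states.items():
        if state == "Warning" and key not in overrides:
            overrides.add(key)  # disable the rule for this computer only
            actions.append(("create", key))
        elif state == "Healthy" and key in overrides:
            overrides.discard(key)  # rule may collect events again
            actions.append(("delete", key))
    return actions
```

Since existing overrides are checked first, running the loop again without a state change produces no actions, which matches the behaviour of a rule that fires on a fixed interval.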

Comments on “Event Rules with Flooding Prevention”

  1. Pete Zerger

    Just finished initial read through of this one….This has some great potential in the real world.

  2. Andreas Zuckerhut

    I’ve attached the documentation in Word format now.

    I know that the format is quite messed up here. I thought I fixed it already…

  3. Raphael Burri

    Congratulations! Great MP and authoring example indeed. I have a feeling that I’ll be using it or its design principles quite often in the future.

    It’s interesting to see that all 3 medals went to MPs working in a similar way, using PowerShell code run by the RMS to automate things. So in a way we’re all winners of the grand prize 🙂
