I’ve been on a Perry Mason kick lately, so I could not help naming this blog post in that format.
I was given the task of figuring out why alerts were not being sent out. The management group was on a secure network. The two management servers in the management group, MS01 and MS02, had access to the SMTP resources. Several gateway servers were also part of this management group, including GWY01 and GWY02, and they could not access the SMTP resources.
By default, the Notifications Resource Pool includes all management servers and gateway servers as members, because the pool is configured for automatic membership. In most environments this is fine. It becomes a problem when the Operations Manager infrastructure is in a highly secure network where only a select set of machines can reach the SMTP host that sends out alerts, or when the gateway servers are in a remote DMZ network where they cannot reach the SMTP host. In either scenario the Notifications Resource Pool needs to be reconfigured to use manual membership, and then the membership has to be modified to remove the gateway servers or management servers that cannot access the SMTP resources (or SIP resources).
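Before deciding which servers to keep in the pool, it can help to confirm which ones can actually open a connection to the SMTP host. The following is a minimal sketch, not part of Operations Manager: it assumes Python is available on the server being tested, and the function name `can_reach_smtp` and the default port 25 are my own choices for illustration. It only checks that a TCP connection can be opened; it does not send mail or authenticate.

```python
import socket


def can_reach_smtp(host: str, port: int = 25, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    This only proves network reachability from the machine running the check.
    It says nothing about whether the SMTP host will accept mail from it.
    """
    try:
        # create_connection resolves the name and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Run it on each candidate server, for example `can_reach_smtp("smtp.example.com")`, where `smtp.example.com` is a placeholder for your own SMTP host. Any server that returns False is a candidate for removal from the pool.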
The default set of resource pools can be managed through the Operations Manager 2012 console or through the Operations Manager PowerShell console. To change the Notifications Resource Pool to manual membership, open the Operations Manager 2012 console, go to the Administration workspace, and select Resource Pools. Right-click the Notifications Resource Pool and select Manual Membership. You can also right-click a resource pool to view its members.
I modified the resource pool on MS01 so that the only pool members were MS01 and MS02, then stopped the web server on one of the management servers and expected to see some alerts flowing. While waiting, I checked the membership of the Notifications Resource Pool in the OpsMgr console on MS02. MS02 did not show the changes that I could still see on MS01, which was very odd, as those changes are stored in the OpsMgr database. Alerts were still not flowing.
Perhaps the console cache was just stuck? I cleared the OpsMgr console cache on MS02, but there was no change. Perhaps the SDK service was having a momentary lapse of reason? I restarted the three OpsMgr services on MS02 (a few times, actually, just to make sure), but the Notifications Resource Pool was still showing gateway servers. Perhaps the OpsMgr server queue was stuck on MS02? I stopped the OpsMgr services on MS02, cleared the Health Service State directory, and started the three OpsMgr services back up. Then I opened the OpsMgr console on MS02. Now I could see the changes that had been made on MS01 30 minutes earlier, with MS01 and MS02 as the only members of the Notifications Resource Pool. Now alerts are flowing like a champ.
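The mismatch in this story, where two management servers reported different members for the same pool, is easy to describe as a set comparison. The sketch below is illustrative only: it does not query Operations Manager, and the function name `membership_drift` and the hand-entered member lists are assumptions. The lists would be whatever you read off each server's console.

```python
from typing import Dict, Set


def membership_drift(
    expected: Set[str], views: Dict[str, Set[str]]
) -> Dict[str, Dict[str, Set[str]]]:
    """Report, per server, how its view of the pool differs from `expected`.

    `views` maps a management server name to the set of pool members
    that server's console shows. Servers that agree are left out.
    """
    drift = {}
    for server, members in views.items():
        missing = expected - members   # expected members the server does not show
        extra = members - expected     # members the server shows but should not
        if missing or extra:
            drift[server] = {"missing": missing, "extra": extra}
    return drift


# Member lists as seen in this post before the fix, entered by hand.
expected = {"MS01", "MS02"}
views = {
    "MS01": {"MS01", "MS02"},
    "MS02": {"MS01", "MS02", "GWY01", "GWY02"},
}
drift = membership_drift(expected, views)
```

With the data above, `drift` contains a single entry for MS02 whose "extra" set holds GWY01 and GWY02, which is the stale view I was looking at.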
If your management group has multiple management servers and your changes are not working as you would expect them to, it is a good idea to verify that the changes are seen by the other management servers in the management group. Hopefully this helps someone in the near future.