Re: Requirements for a SCOM 2007 Health Check Script?
Posted: Sun, Jun 21, 2009 12:11 PM :: Rank: 51
Points: 950
Level: System Center Hero
* locate "grey" hosts
* verify fixes for OpsMgr / Agents
Re: Requirements for a SCOM 2007 Health Check Script?
Posted: Sun, Jun 21, 2009 3:24 PM :: Rank: 56
Points: 1183
Level: System Center Specialist
Re: Requirements for a SCOM 2007 Health Check Script?
Posted: Sun, Jun 21, 2009 11:38 PM :: Rank: 38
Points: 65622
Level: System Center Expert
Daniele, I have no doubt the MS-delivered OpsMgr Health Check is a great service for Premier customers. I think we were envisioning something a bit less complex that would be available to other customers (often in smaller environments) as an ad-hoc tool for assessing health.
This would be great for community support scenarios and for customers outside the Premier support plans who have no budget for pro services engagements to assess the health of their SCOM deployment.
Re: Requirements for a SCOM 2007 Health Check Script?
Posted: Mon, Jun 22, 2009 5:53 AM :: Rank: 68
Points: 1183
Level: System Center Specialist
True. I suppose I felt sorry that I cannot share the internals of what we do in the MS "health check" beyond a certain extent... but, as you mentioned, a lot of the information is already out there in blogs... :-)
Re: Requirements for a SCOM 2007 Health Check Script?
Posted: Mon, Jun 22, 2009 7:23 AM :: Rank: 89
Points: 42748
Level: System Center Expert
An OpsMgr Health Check Script should perhaps also include a check to find 1) any targeting mistakes (groups targeted instead of classes) and 2) if possible, any object discoveries created with a dangerously low interval.
Re: Requirements for a SCOM 2007 Health Check Script?
Posted: Mon, Jun 22, 2009 10:42 AM :: Rank: 89
Points: 27734
Level: System Center Expert
Query for all scripts, collect them, run them, catch the errors. The event log doesn't give you much information about script errors.
Re: Requirements for a SCOM 2007 Health Check Script?
Posted: Mon, Jun 22, 2009 11:04 AM :: Rank: 86
Points: 6866
Level: System Center Specialist
- Could check that the MPs are getting regularly updated from DB to RMS and from RMS to Management Servers.
Re: Requirements for a SCOM 2007 Health Check Script?
Posted: Mon, Jun 22, 2009 1:01 PM :: Rank: 68
Points: 40804
Level: System Center Expert
My view is that people on this site are in the process of purchasing, or have in fact already purchased, a System Center product. The hope is that communities like this one remove as many obstacles as possible, so that the System Center experience stays focused on its primary function.
We welcome as much talent sharing as each individual wants to offer, knowing there is no reward (other than a warm glow). The input from MS personnel so far has been great and welcome, and I hope it will continue.
Everybody has something to offer.
Re: Requirements for a SCOM 2007 Health Check Script?
Posted: Mon, Jun 22, 2009 7:06 PM :: Rank: 44
Points: 40804
Level: System Center Expert
I have been giving more thought to the Health Check tool and have the following to add:
• We can check for a failed backup as well as a successful backup, but we never monitor whether a backup is even being attempted.
• We could also check whether the transaction logs are getting close to full.
• We could check that the SQL Server is set up correctly, e.g. that Auto Grow is False, etc.
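A minimal sketch of the "was a backup even attempted" idea from the first bullet, in Python. The function name, the 24-hour window and the way the two timestamps are obtained are all assumptions for illustration; nothing here comes from an OpsMgr or SQL Server API.

```python
from datetime import datetime, timedelta

def backup_status(last_attempt, last_success, now, max_age=timedelta(hours=24)):
    """Classify backup health from two timestamps (None means "never").

    last_attempt: start time of the most recent backup attempt.
    last_success: start time of the most recent backup that succeeded.
    max_age is a hypothetical window; pick one that matches your schedule.
    """
    # No attempt at all, or none inside the window: nothing failed,
    # but nothing ran either, which is the case that usually goes unnoticed.
    if last_attempt is None or now - last_attempt > max_age:
        return "not attempted"
    # An attempt exists, but the latest success is older than it.
    if last_success is None or last_success < last_attempt:
        return "failed"
    return "ok"

if __name__ == "__main__":
    now = datetime(2009, 6, 23, 9, 0)
    print(backup_status(datetime(2009, 6, 20, 1, 0), datetime(2009, 6, 20, 1, 0), now))
```

The point of the three-way result is that "not attempted" is reported separately from "failed", so a backup job that silently stopped being scheduled is not mistaken for a healthy one.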
Re: Requirements for a SCOM 2007 Health Check Script?
Posted: Tue, Jun 23, 2009 9:17 AM :: Rank: 12
Points: 151
Level: System Center Hero
Could check whether the deployed MPs are up to date.
Re: Requirements for a SCOM 2007 Health Check Script?
Posted: Wed, Jun 24, 2009 10:02 PM :: Rank: 13
Points: 6007
Level: System Center Specialist
"An OpsMgr Health Check Script should perhaps also include a check to find 1) any targeting mistakes (groups targeted instead of classes) and 2) if possible, any object discoveries created with a dangerously low interval."
The second suggestion would really help. Again today, I found yet another 3rd party MP running discoveries every 5 minutes on every server we manage. It would actually be a great monitor or something.
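As a rough illustration of the low-interval check, here is a Python sketch that flags discoveries running more often than a threshold. The input shape (dicts with 'mp', 'name' and 'interval_seconds' keys), the sample names and the one-hour threshold are all made up for the example; how the discovery settings get exported is left open.

```python
# Hypothetical threshold: treat anything more frequent than hourly as suspect.
MIN_INTERVAL_SECONDS = 3600

def find_low_interval_discoveries(discoveries, min_interval=MIN_INTERVAL_SECONDS):
    """Return (management pack, discovery, interval) tuples below the threshold.

    discoveries: iterable of dicts with 'mp', 'name' and 'interval_seconds'
    keys (an assumed shape; adapt it to your own export).
    Results are sorted with the most frequent discovery first.
    """
    flagged = [
        (d["mp"], d["name"], d["interval_seconds"])
        for d in discoveries
        if d["interval_seconds"] < min_interval
    ]
    return sorted(flagged, key=lambda item: item[2])

if __name__ == "__main__":
    sample = [
        {"mp": "Vendor.Example.MP", "name": "Discover widgets", "interval_seconds": 300},
        {"mp": "InHouse.Example.MP", "name": "Discover apps", "interval_seconds": 86400},
    ]
    for mp, name, seconds in find_low_interval_discoveries(sample):
        print(f"{mp}: '{name}' runs every {seconds} seconds")
```

A 5-minute discovery like the one described above shows up as 300 seconds and is flagged; a daily discovery is not.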
Re: Requirements for a SCOM 2007 Health Check Script?
Posted: Wed, Jun 24, 2009 10:25 PM :: Rank: 38
Points: 65622
Level: System Center Expert
Funny you should mention these. I am actually sitting here right now working on the PowerShell to loop through and find rules targeting singleton classes, which I think will catch the bulk of those. 2) is a big one for me too. I was thinking of in-house MPs, as the MPBPA does not catch those, but I suppose it's equally viable for 3rd party MPs as well.
Nice suggestions on the v10 Wish List too.
Re: Requirements for a SCOM 2007 Health Check Script?
Posted: Fri, Jun 26, 2009 5:37 AM :: Rank: 59
Points: 1183
Level: System Center Specialist
Re: Requirements for a SCOM 2007 Health Check Script?
Posted: Fri, Jun 26, 2009 11:13 AM :: Rank: 86
Points: 295
Level: System Center Hero
Re: Requirements for a SCOM 2007 Health Check Script?
Posted: Fri, Jun 26, 2009 12:55 PM :: Rank: 114
Points: 295
Level: System Center Hero
We have had a similar need in our environment, so I created an "OpsMgr DB Health Report" that we get daily. It is a collection of queries that I wrote or grabbed from other sites.
Below are some of the items that we check. These are not in Pete's original list, but each of them has been used in troubleshooting issues in production environments.
Check Ops DB Grooming History
Remember all of the grooming fun with MOM 2005? OpsMgr is much cleaner and essentially does a Truncate to remove old data instead of having to transfer it to the Warehouse. If grooming isn't run, too much data can accumulate in the OpsDB and affect performance. There are two easy ways to spot this:
1. Query the internal job history table. You should see a successful job for each day. Note that the times are in UTC.
SELECT InternalJobHistoryId, TimeStarted, TimeFinished, StatusCode, Command, Comment
FROM InternalJobHistory
WHERE (DATEDIFF(day, TimeStarted, GETDATE()) < 7) AND (NOT (Command = N'Exec dbo.p_DataPurging'))
ORDER BY TimeStarted DESC
2. In PartitionTables, check the Partition Start and End columns and make sure they are current, with each partition spanning one day.
select top 10 * from partitiontables Order by partitionendtime desc
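A minimal sketch of the "successful job for each day" check from step 1, written in Python so it can run against rows exported from the first query. It assumes that StatusCode 1 means success, which is not stated in this thread, so confirm it against your own job history before relying on it. The function name and sample rows are illustrative only.

```python
from datetime import date, datetime, timedelta

# Assumption: StatusCode 1 means the job succeeded; verify in your own data.
SUCCESS_STATUS = 1

def days_without_successful_job(rows, today, lookback_days=7):
    """Return the UTC dates in the lookback window that have no successful job.

    rows: iterable of (time_started, status_code) pairs, e.g. the TimeStarted
    and StatusCode columns returned by the job history query.
    today: the current UTC date (the table stores UTC times).
    """
    succeeded = {started.date() for started, status in rows if status == SUCCESS_STATUS}
    window = [today - timedelta(days=n) for n in range(lookback_days)]
    return sorted(day for day in window if day not in succeeded)

if __name__ == "__main__":
    sample = [
        (datetime(2009, 6, 26, 0, 30), 1),
        (datetime(2009, 6, 25, 0, 30), 1),
        (datetime(2009, 6, 24, 0, 30), 2),  # ran, but did not report success
    ]
    for missing in days_without_successful_job(sample, date(2009, 6, 26), lookback_days=4):
        print("No successful grooming job on", missing.isoformat())
```

Any date this returns is a day where grooming either failed or never ran, which is the condition the two manual checks above are looking for.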
For more info on grooming check out a couple of Steve Rachui's blogs:
Large Table Query
This is useful for spotting a myriad of issues, including localized text problems (these can still happen in R2; see Steve's post), grooming problems and Perf/Event storms. I recommend creating a report from this query and saving it daily for comparison.
SELECT TOP 15 so.name, si.rowcnt AS row_count,
  8 * SUM(CASE WHEN si.indid IN (0, 1) THEN si.reserved END) AS data_kb,
  COALESCE(8 * SUM(CASE WHEN si.indid NOT IN (0, 1, 255) THEN si.reserved END), 0) AS index_kb,
  COALESCE(8 * SUM(CASE WHEN si.indid IN (255) THEN si.reserved END), 0) AS blob_kb
FROM dbo.sysobjects AS so
JOIN dbo.sysindexes AS si ON (si.id = so.id)
WHERE 'U' = so.type
GROUP BY so.name, si.rowcnt
ORDER BY data_kb DESC
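To illustrate the "save it daily for comparison" suggestion, here is a small Python sketch that compares two saved runs of the query and lists the tables that grew the most. The snapshot format (a dict of table name to row count), the growth threshold and the table names in the sample are assumptions for the example, not real OpsMgr data.

```python
def table_growth(previous, current, min_growth_rows=100_000):
    """Return (table, previous_rows, current_rows) for fast-growing tables.

    previous / current: dicts mapping table name -> row_count, taken from
    two saved runs of the large table query.
    Tables missing from the previous snapshot are treated as having 0 rows.
    Results are sorted with the largest growth first.
    """
    grown = []
    for table, rows_now in current.items():
        rows_before = previous.get(table, 0)
        if rows_now - rows_before >= min_growth_rows:
            grown.append((table, rows_before, rows_now))
    return sorted(grown, key=lambda item: item[2] - item[1], reverse=True)

if __name__ == "__main__":
    # Made-up table names and counts, purely for illustration.
    yesterday = {"ExamplePerfTable": 1_000_000, "ExampleEventTable": 200_000}
    today = {"ExamplePerfTable": 1_050_000, "ExampleEventTable": 900_000}
    for table, before, after in table_growth(yesterday, today):
        print(f"{table}: {before} -> {after} rows")
```

A sudden jump in one table between two daily snapshots is the kind of signal that points at an event or performance data storm, or at grooming not keeping up.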
Grey Agents Report
Agents that are Grey in the console:
SELECT ManagedEntityGenericView.DisplayName, ManagedEntityGenericView.AvailabilityLastModified
FROM ManagedEntityGenericView
INNER JOIN ManagedTypeView ON ManagedEntityGenericView.MonitoringClassId = ManagedTypeView.Id
WHERE (ManagedTypeView.Name = 'microsoft.systemCenter.agent') AND (ManagedEntityGenericView.IsAvailable = 0)
ORDER BY ManagedEntityGenericView.DisplayName
Check for Overrides in Default MP
select aov.name, parenttype, overrideableparametername, value, overridetype, aov.lastmodified from AllOverrideView aov
inner join ManagementPackView mpv on aov.managementpackID = mpv.Id
where mpv.name = 'Microsoft.SystemCenter.OperationsManager.DefaultUser'
order by aov.lastmodified DESC
SQL broker Enabled
The most obvious sign of SQL Broker not being enabled is Discovery failing. In R2 it will actually display a warning message during discovery, but the fix often requires stopping the services on the RMS and taking the database into single-user mode to enable SQL Broker. It would be nice to get a critical alert when this occurs, but it can be spotted easily enough by running this query:
SELECT is_broker_enabled FROM sys.databases WHERE name = 'OperationsManager'
I also have some queries I developed that take the typical Perf and Event count reports and add a breakdown by management pack, but I think that is out of scope for this discussion. I will post some more in the near future and hopefully get some time to create a supplemental report set.
RE: Requirements for a SCOM 2007 Health Check Script?
Posted: Mon, Jun 29, 2009 10:03 PM :: Rank: 59
Points: 65622
Level: System Center Expert
Here is some T-SQL for identifying Rules and Unit Monitors targeted to a GROUP in error. This doesn't get us to an absolute list of culprits, but does return a very small result set (including name, target and MP) very quickly in which we can easily identify mistakes. You will find a couple of singleton classes which are not groups in some internal MPs.
If we were to query a level deeper to identify base class, we could be 100% accurate (assuming we correctly identified all base classes for a group).
I tried to identify the same information with PowerShell, but found it took much longer to execute, hence the move to T-SQL to minimize the impact on MG resources (PoSh took a couple of minutes; T-SQL took only a couple of seconds).
For Unit Monitors
SELECT TOP (100) PERCENT dbo.ManagedTypeView.DisplayName AS Target, dbo.MonitorView.DisplayName AS MonitorName,
dbo.ManagementPackView.DisplayName AS MP, dbo.ManagedTypeView.Singleton
FROM dbo.MonitorView INNER JOIN
dbo.ManagementPackView ON dbo.MonitorView.ManagementPackId = dbo.ManagementPackView.Id INNER JOIN
dbo.ManagedTypeView ON dbo.MonitorView.TargetMonitoringClassId = dbo.ManagedTypeView.Id
WHERE (dbo.ManagedTypeView.Singleton = 1) and dbo.MonitorView.IsUnitMonitor = 1
ORDER BY Target
For Rules
SELECT TOP (100) PERCENT dbo.ManagedTypeView.DisplayName AS Target, dbo.RuleView.DisplayName AS RuleName,
dbo.ManagementPackView.DisplayName AS MP, dbo.ManagedTypeView.Singleton
FROM dbo.RuleView INNER JOIN
dbo.ManagementPackView ON dbo.RuleView.ManagementPackId = dbo.ManagementPackView.Id INNER JOIN
dbo.ManagedTypeView ON dbo.RuleView.TargetMonitoringClassId = dbo.ManagedTypeView.Id
WHERE (dbo.ManagedTypeView.Singleton = 1)
ORDER BY Target
Let me know if you have another path to this info.
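The "query a level deeper to identify base class" idea can be sketched in Python as a walk up the class hierarchy. This assumes that groups ultimately derive from a base class named System.Group and that you can export a map of each class to its base class; the hierarchy in the sample is illustrative and the function name is hypothetical.

```python
# Assumption: groups ultimately derive from a base class named "System.Group".
GROUP_BASE_CLASS = "System.Group"

def derives_from(class_name, base_of, ancestor=GROUP_BASE_CLASS):
    """Walk the base-class chain and report whether class_name descends from ancestor.

    base_of: dict mapping class name -> base class name (None at the root).
    Unknown classes and cyclic data both end the walk and return False.
    """
    seen = set()
    current = class_name
    while current is not None and current not in seen:
        if current == ancestor:
            return True
        seen.add(current)
        current = base_of.get(current)
    return False

if __name__ == "__main__":
    # Illustrative hierarchy; replace it with an export from your own MG.
    base_of = {
        "Example.Custom.Group": "System.Group",
        "System.Group": "System.Entity",
        "Example.Singleton.NotAGroup": "System.Entity",
        "System.Entity": None,
    }
    for target in ("Example.Custom.Group", "Example.Singleton.NotAGroup"):
        print(target, "is a group:", derives_from(target, base_of))
```

Applied to the targets returned by the two queries above, this would separate singleton classes that really are groups from the few singletons that are not.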
RE: Requirements for a SCOM 2007 Health Check Script?
Posted: Thu, Jul 09, 2009 3:38 AM :: Rank: 99
Points: 65622
Level: System Center Expert
OpsMgr Health Check Requirements Summary and Next Steps
Here's a categorized summary of the suggestions you mentioned as important elements of an OpsMgr Health Check. Many of you already included quite a few scripts and queries (or links to them) in your posts. I've left the pointers out, as we have an additional round of discussion on the "what to check" before we talk about the "how to check".
I've put our names by your suggestions so we can ask one another for clarification or further justification if needed.
Next Steps
As a next step, I would suggest we:
- Take a look at the list below and identify any critical checks we think are missing.
- Ask the owner for clarification / justification if necessary.
- Throw out ideas on how this should be delivered (PowerShell script, Management Pack, etc.).
The result should be a refined list for which we can begin to review data collection options and discuss how best to deliver a report in an easily consumable format. PowerShell seems a likely choice, but I don't want to make any assumptions at this stage.
Respond to this thread with your additional thoughts, questions and suggestions.
Categorized List of Your Suggestions
1. RMS / MS / Mgmt Group Health, Configuration and Connectivity
   1. Look for database latency - High count of event 2115 in the RMS / MS OpsMgr Event Log (Pete)
   2. Collect agent count reporting to each MS & RMS (Pete)
   3. Retrieve recent warning and critical events from RMS OpsMgr Event Log (Pete)
   4. Verify patch levels on RMS / MS (Ziemek)
   5. Check that the MPs are getting regularly updated from DB to RMS and from RMS to Management Servers (Sameer)
   6. Check if deployed MPs are up to date (Holger)
   7. Check for overrides in the Default MP (Matthew)
2. Management Pack Configuration and Versioning
   1. Check for targeting mistakes (rules and monitors targeted to groups) (Tommy)
   2. Check for object discoveries created with dangerously low intervals (Tommy)
   3. Query for all scripts, collect them, run them, catch the errors. The event log doesn't give you much information about script errors. (Tenchuu)
3. SQL Database Health, Configuration, Maintenance and Grooming
   Configuration
   1. Check proper database sizing based on monitored server / device count (Pete)
   2. Check SQL configuration against supported configurations / best practices (Pete)
   3. Verify SQL Broker is enabled (Matthew)
   4. Check that the SQL Server is set up correctly, e.g. Auto Grow is False, etc. (Simon)
   Operational Health
   1. Check for a failed backup as well as a successful backup, and also whether a backup is even being attempted (Simon)
   2. Check whether the transaction logs are getting close to full (Simon)
   3. Check Operational DB grooming history (Matthew)
   4. Large Table Query (Matthew)
4. Agent Health and Configuration
   1. Locate "grey" hosts (Ziemek, Matthew)
   2. Verify patch levels on agents (Ziemek)
Re: Requirements for a SCOM 2007 Health Check Script?
Posted: Fri, Jul 10, 2009 9:37 AM :: Rank: 63
Points: 40804
Level: System Center Expert
This is too much effort not to move forward with production, so let's get going on this. I'm ready to take on my bit and whatever else.
So here's the roll call: Pete and I will put the bulk of this together. Would anyone else like to help?
RE: Requirements for a SCOM 2007 Health Check Script?
Posted: Fri, Jul 10, 2009 10:00 AM :: Rank: 66
Points: 482
Level: System Center Hero
I can help you guys out for sure. Let me know.
Re: Requirements for a SCOM 2007 Health Check Script?
Posted: Fri, Jul 10, 2009 11:20 AM :: Rank: 64
Points: 6866
Level: System Center Specialist
Sure, I am in. Let me know how I can help on this.