As promised, in this installment we'll look at a number of queries useful for daily operations and their significance in keeping your environment running smoothly. The queries we'll be looking at in this installment are among the core pieces of information you would do well to check every day over your morning coffee to see how your environment is doing.
- Agents Currently Down (grey agents)
- Total Database Size (OpsMgr and DW)
- Top Performance Data Insertions (per object / counter / instance)
- Number of Alerts Per Day
- Computers Generating Most Events
One thing all these queries have in common is that they are much more useful when used routinely as part of a daily process. And once you see a few of these in single click reports, I think you'll appreciate their convenience in maintaining a tidy OpsMgr environment.
Previous Installments
SQL Queries
Below are the queries with a brief description of their significance.
Agents Currently Down (grey agents)
Run this query for a list of agents are not reporting into their management server. Andreas Zuckerhut wrote a PowerShell query that does the same thing HERE.
SELECT bme.DisplayName,s.LastModified as 'LastModified (UTC)',
dateadd(hh,-5,s.LastModified) as 'LastModified (Local Time)'
FROM state AS s, BaseManagedEntity AS bme
WHERE s.basemanagedentityid = bme.basemanagedentityid
AND s.monitorid
IN (SELECT MonitorId FROM Monitor WHERE MonitorName = 'Microsoft.SystemCenter.HealthService.ComputerDown')
AND s.Healthstate = '3' AND bme.IsDeleted = '0'
ORDER BY s.Lastmodified DESC
Total Database Size (OpsMgr and DW)
Operations Manager requires 40% free space in its databases for nightly maintenance to run successfully. Retrieving SQL database size AND free space takes some work. There is a brilliant query that provides everything you need to check your database size and free space. Get it at http://www.mssqltips.com/tip.asp?tip=1426
Top 100 Performance Data Insertions (per object / counter / instance)
Performance data generally the top consumer of space in the Operational database. It's a good practice to have a look at the big consumers of database space and consider disabling collection rules for counters of no interest to your support teams. A must to get some input from your team on these decisions.
select top 100 pcv.ObjectName, pcv.CounterName, count (pcv.countername) as Total
from performancedataallview as pdv, performancecounterview as pcv
where (pdv.performancesourceinternalid = pcv.performancesourceinternalid)
group by pcv.objectname, pcv.countername
order by count (pcv.countername) desc
Number of Alerts Per Day
This is a good indicator of how well-tuned your OpsMgr management packs are tuned. I believe 50 alerts per day per 1,000 agents was the goal I saw mentioned by someone from Microsoft. Bottom line is tune tune tune until you have alert traffic down to mostly actionable alerts.
SELECT CONVERT(VARCHAR(20), TimeAdded, 102) AS DayAdded, COUNT(*) AS AlertsPerDay
FROM Alert WITH (NOLOCK)
WHERE TimeRaised is not NULL
GROUP BY CONVERT(VARCHAR(20), TimeAdded, 102)
ORDER BY DayAdded DESC
Computers Generating Most Events
This query quite simply shows us which computers are consuming the most database space with collected events. Run this one against either the Operational database (OperationsManager).
SELECT top 20 LoggingComputer as ComputerName, COUNT(*) AS EventTotals
FROM EventallView with (NOLOCK)
GROUP BY LoggingComputer
ORDER BY TotalEvents DESC
Remaining Installments
At this point, we're down to processes and reports where these queries come in handy...
- Next time, we will look at a troubleshooting flowchart to help define a process for leveraging this information
- We'll then wrap up the series with a couple of sample reports to demonstrate how to more conveniently collect and present this information