Warning – This process is a tangled web, as such processes with a data warehouse often are, and is something I dig through for my own education from time to time. To that end, I thought I’d document as I explore so we can all learn together. I’m going to call out some interesting data I see here, and perhaps try to work with some of my peers to map database maintenance processes further in future installments.
Today I am looking around database maintenance processes for the Data Warehouse.
DISCLAIMER: Before you read on, I make no warranties or promises here, and do NOT endorse any modification of internal maintenance processes of OpsMgr databases. If you do so, you do so at your own peril.
As I mentioned a long time ago in a blog post (and on this page in the System Center WIKI), the OperationsManager database is designed to be self-maintaining. The links mentioned actually contain names of the rules and stored procedures that are called by them. The maintenance rules for the OperationsManager database are contained in the Microsoft System Center Internal management pack (Microsoft.SystemCenter.Internal.mp)
As has likely been mentioned briefly elsewhere, the Data Warehouse is also designed to be a self-maintaining database. Maintenance processes for the Data Warehouse is defined in the Data Warehouse Internal Library (Microsoft.SystemCenter.DataWarehouse.Internal).
The entire maintenance process (for aggregation, grooming and optimization anyway) appears to stem from a single rule that executes every 60 seconds.
The Maintenance Rule for the Data Warehouse
Data warehouse maintenance is defined in a rule called Standard Data Warehouse Data Set maintenance rule. It is targeted to an object class named Data Warehouse standard Data Set. This class is defined in the Data Warehouse Library management pack. The rule is executed every 60 seconds (based on frequency parameter), and it calls a number of other stored procedures to perform a number of maintenance functions…more on this in a minute.
|Figure 1. Data Warehouse Standard Data Set Maintenance Rule
||Figure 2. Data Source used for the Warehouse Standard Data Set Maintenance Rule
The Stored Procedure(s)
The stored procedure called to kick maintenance off is StandardDatasetMaintenance. If you have a look at this stored procedure, you’ll notice the following:
The procedure expects a dataset ID is passed as an argument (this is also apparent in the rule itself). You can find the dataset IDs in the StandardDataSetAggregation table in the OpsMgr data warehouse.There are several data sets to deal with.
You’ll see data set for Performance and AEM data. You’ll see three entries for each data type, each with their own aggregation type. You’ll also see in this table the names of the stored procedures responsible for data aggregation, grooming and deletion. The aggregation types (shown in the ‘AggregationTypeID’ column of StandardDataSetAggregation table) can be of the following values 0-Raw, 10-subhourly, 20-hourly, 30-daily. You’ll notice there are no sub-hourly data sets.
Figure 3. Snapshot of the StandardDataSetAggregation table (click to enlarge)
The Staging Area
The DW has a staging process that inserts (stages) new data in smaller tables and then is inserted into their final place through background processes. This allows data insertion to complete quickly and the consolidation happens in the background. Staging in this fashion is a data warehousing concept and not at all unique to OpsMgr. You can spot staging tables by their names.
Stored Procedures Called from StandardDatasetMaintenance
We can see numerous other stored procedures called from StandardDataSetMaintenance that handle numerous database maintenance processes
Staging – About line 35, we see the call to the procedure to process data in the staging area. Notice the @datasetID, which indicates each call will be to process a specific type of data (e.g. – event, alert, performance, state). Which staging table processed depends on the
Which type of data is processed by this stored procedure depends on the datasetID passed to it. There are stored procedures for all types of data, including the following of interest in this discussion:
- dbo.AemProcessStaging and dbo.AemEventProcessStaging
For purposes of getting to the point, I’ll skip down to around line 75 in StandardDataSetMaintenance where we see some other stored procedures being called into action.
- StandardDataSetAllocateStorage – This one deals with table size and creates new tables and indexes as necessary when tables holding aggregated data reach their maximum allotted row count.
- StandardDataSetOptimize – This one check indexes and determines what needs to be optimized / rebuilt.
- StandardDataSetGroom – Grooms aged data.
- StandardDataSetAggregate – Performs the rollup of raw data into hourly and daily aggregate data sets. Data aggregation happens on an hourly basis, which is why your reporting only has about an hour of lag from the present time.
NOTE: This is by no means anywhere near a complete list of stored procedures participating in the maintenance process. Each of the four stored procedures mentioned above call other stored procedures. There are other stored procedures called to delete and add indexes, create covering views to include data in new tables created in the aggregation process, etc.
This is all for this installment. Next time I’m stuck on a long conference call, I’ll work on detailing some of the sub-processes mentioned above to see what useful information can be discovered.