SCOM Standard Dataset maintenance workflow

As I’m going through my preparations for TechEd, I’m refreshing my knowledge about System Center Operations Manager. A while ago I went through all the DW SPs involved in standard dataset maintenance and made some short notes about the implementation of this workflow. Here they are.

What’s a dataset?

A dataset is a set of tables, stored procedures, and corresponding SCOM Write Actions. A dataset should be defined in a Management Pack using the DataWarehouseDataSet element. SCOM datasets are used to store historical monitoring data of a certain structure. The datasets available out of the box:

Dataset           | Database Schema | Is standard?
------------------|-----------------|-------------
Performance       | Perf            | Yes
State             | State           | Yes
Event             | Event           | Yes
Alerts            | Alert           | Yes
Client Monitoring | CM              | Yes
APM               | apm             | No
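
By the way, to see which datasets in your DW participate in the standard framework, a query along these lines works (a sketch against OperationsManagerDW; I'm assuming the Dataset and StandardDataset tables described below):

    -- Sketch: list registered datasets and whether they use the standard framework
    SELECT d.DatasetDefaultName,
           sd.SchemaName,
           CASE WHEN sd.DatasetId IS NOT NULL THEN 'Yes' ELSE 'No' END AS IsStandard
    FROM dbo.Dataset AS d
         LEFT JOIN dbo.StandardDataset AS sd ON sd.DatasetId = d.DatasetId
    ORDER BY d.DatasetDefaultName;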

What’s a STANDARD dataset?

SCOM provides a framework that implements a standardized approach to routine maintenance tasks such as staging area processing, aggregation, partitioning, grooming, and index maintenance. A dataset's creation script should add the necessary information to the standard dataset configuration tables so that the maintenance workflow can function correctly.
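
For example, the per-aggregation retention settings that the workflow reads are stored in StandardDatasetAggregation. A query like this shows them (a sketch; MaxDataAgeDays is the retention knob, the other columns appear in the walkthrough below):

    -- Sketch: the grooming/retention configuration the maintenance workflow reads
    SELECT sd.SchemaName,
           sda.AggregationTypeId,
           sda.MaxDataAgeDays,           -- retention per aggregation type
           sda.GroomingIntervalMinutes,  -- how often grooming becomes due
           sda.LastGroomingDateTime
    FROM dbo.StandardDataset AS sd
         JOIN dbo.StandardDatasetAggregation AS sda ON sda.DatasetId = sd.DatasetId
    ORDER BY sd.SchemaName, sda.AggregationTypeId;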

How does standard dataset processing work?

Standard dataset processing is implemented as a set of SQL Server stored procedures which are executed by the “Standard Data Warehouse Data Set maintenance rule”. The maintenance frequency is 60 seconds by default.

The maintenance rule is targeted at the “Standard Data Set” class and takes the dataset ID as a parameter. The rule calls the StandardDatasetMaintenance stored procedure.
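
In other words, triggering one maintenance cycle by hand looks roughly like this (the 'Perf' schema name is just an example):

    -- Sketch: what the maintenance rule effectively executes for one dataset
    DECLARE @DatasetId uniqueidentifier;

    SELECT @DatasetId = DatasetId
    FROM dbo.StandardDataset
    WHERE SchemaName = 'Perf';   -- example: the Performance dataset

    EXEC dbo.StandardDatasetMaintenance @DatasetId;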

The workflow.

Note #1: this is a rough description of what happens inside the StandardDatasetMaintenance stored procedure. Some details may be omitted.

Note #2: table names and field names are italic, stored procedure names are red, and calls to stored procedures defined by the datasets themselves are green.

StandardDatasetMaintenance

  • StandardDatasetProcessStaging
  • Optimization
    • Set lock – ‘Dataset_Maintenance’
    • If running for less than 30 sec – StandardDatasetAllocateStorage
    • If still running for less than 30 sec – StandardDatasetOptimize
    • If still running for less than 30 sec – StandardDatasetAggregate
    • Update LastOptimizationActionSuccessfulCompletionDateTime
    • Release lock
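
Put together, the top of the workflow behaves roughly like the sketch below. This is not the actual SP body: the locking and the completion-timestamp update are left out, and the child SPs may take more parameters than shown.

    -- Rough sketch of StandardDatasetMaintenance's control flow (simplified)
    DECLARE @DatasetId       uniqueidentifier,
            @StartedDateTime datetime;

    SELECT @DatasetId = DatasetId
    FROM dbo.StandardDataset
    WHERE SchemaName = 'Perf';   -- example dataset

    SET @StartedDateTime = GETUTCDATE();

    EXEC dbo.StandardDatasetProcessStaging @DatasetId;

    -- each later step starts only if the 30-second budget is not yet spent
    IF DATEDIFF(second, @StartedDateTime, GETUTCDATE()) < 30
        EXEC dbo.StandardDatasetAllocateStorage @DatasetId;

    IF DATEDIFF(second, @StartedDateTime, GETUTCDATE()) < 30
        EXEC dbo.StandardDatasetOptimize @DatasetId;

    IF DATEDIFF(second, @StartedDateTime, GETUTCDATE()) < 30
        EXEC dbo.StandardDatasetAggregate @DatasetId;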

StandardDatasetProcessStaging

  • Get StandardDataset.StagingProcessorStoredProcedureName
  • Set lock – @DatasetId + ‘_Staging’
  • Execute the dataset-specific SP (see the sketch after this list)
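
The dynamic dispatch in the last step can be pictured like this (a sketch; only the StagingProcessorStoredProcedureName column is taken from the notes above, the parameter list is simplified):

    -- Sketch: resolve and invoke the dataset-specific staging processor
    DECLARE @DatasetId   uniqueidentifier,
            @StagingProc sysname,
            @Statement   nvarchar(max);

    SELECT @DatasetId   = DatasetId,
           @StagingProc = StagingProcessorStoredProcedureName
    FROM dbo.StandardDataset
    WHERE SchemaName = 'Perf';   -- example dataset

    SET @Statement = N'EXEC dbo.' + QUOTENAME(@StagingProc) + N' @DatasetId';
    EXEC sp_executesql @Statement, N'@DatasetId uniqueidentifier', @DatasetId = @DatasetId;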

StandardDatasetGroom

  • Set lock – @DatasetId + ‘_Groom’
  • Get StandardDataset.SchemaName, StandardDataset.RawInsertTableCount
  • Get the 1 aggregation that requires grooming (based on GroomingIntervalMinutes and LastGroomingDateTime; the aggregation with the smallest AggregationTypeId wins; see the sketch after this list)
  • Calculate CutOff time
  • Groom insert tables
    • Get StandardDatasetAggregation.GroomStoredProcedureName
    • For each insert table:
      • Get TableGuid and TableNameSuffix from StandardDatasetTableMap
      • Execute groom SP for TableGuid
  • Groom optimized non-insert tables:
    • Get StandardDatasetTableMap.TableNameSuffix for the 1 table that is eligible for grooming
    • Drop dependent tables, if any. For each dependent table:
      • Get StandardDatasetAggregationStorage.BaseTableName
      • Drop table
    • Drop main table:
      • Get StandardDatasetAggregationStorage.BaseTableName
      • Drop table
    • Delete table map record
  • Update StandardDatasetAggregation.LastGroomingDateTime
  • Rebuild cover views – StandardDatasetBuildCoverView
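
The “which aggregation is due” selection and the cutoff calculation can be sketched like this (my reconstruction from the columns named above plus MaxDataAgeDays; the real SP may differ in details):

    -- Sketch: pick the single aggregation due for grooming and its cutoff
    DECLARE @DatasetId         uniqueidentifier,  -- assumed set earlier
            @AggregationTypeId int,
            @CutoffDateTime    datetime;

    SELECT TOP 1
           @AggregationTypeId = AggregationTypeId,
           -- retention-based cutoff: rows older than this are groomed away
           @CutoffDateTime    = DATEADD(day, -MaxDataAgeDays, GETUTCDATE())
    FROM dbo.StandardDatasetAggregation
    WHERE DatasetId = @DatasetId
          AND DATEADD(minute, GroomingIntervalMinutes, LastGroomingDateTime) < GETUTCDATE()
    ORDER BY AggregationTypeId;   -- smallest AggregationTypeId wins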

StandardDatasetAllocateStorage

  • Set lock @DatasetId + ‘_TableMap’
  • Get StandardDataset.SchemaName
  • Calculate @ExpectedInsertTableCount
  • Get number of insert tables from StandardDatasetTableMap
  • For each insert table:
    • Get TableGuid and TableNameSuffix from StandardDatasetTableMap
    • Get usage stats (sp_spaceused) – RowCount and Data size – for each table from the current table set (main table + dependent tables)
  • Check if any of the tables is over the limit(s) (MaxTableRowCount, MaxTableSizeKb; see the sketch after this list). If yes:
    • Raise lock to Exclusive
    • For each oversized table, or while the actual number of insert tables is less than expected:
      • Update StandardDatasetTableMap.InsertInd to zero (to restrict insertions)
      • Insert new record into StandardDatasetTableMap
      • For each table from the current table set (main table + dependent tables):
        • Get BaseTableName and TableTemplate from StandardDatasetAggregationStorage
        • Set correct file groups
        • Set table name Guid
        • Create table
        • Create indexes
          • Ensure PK is registered for optimization
          • For each index from StandardDatasetAggregationStorageIndex: Get index definition, Build CREATE INDEX statement, Create index, Indicate no optimization currently needed for this index
    • Rebuild cover views – StandardDatasetBuildCoverView
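
The size check can be imagined like this (a sketch: the temp-table shape matches sp_spaceused's result set, the partition table name is a placeholder):

    -- Sketch: measure one partition table the way the SP does (sp_spaceused)
    CREATE TABLE #SpaceUsed
    (
        name       sysname,
        rows       varchar(20),
        reserved   varchar(20),
        data       varchar(20),
        index_size varchar(20),
        unused     varchar(20)
    );

    INSERT INTO #SpaceUsed
    EXEC sp_spaceused 'Perf.PerfRaw_00000000000000000000000000000000';  -- placeholder name

    -- 'data' comes back as e.g. '123456 KB'; strip the unit before comparing
    -- the values against MaxTableRowCount / MaxTableSizeKb
    SELECT CAST(rows AS bigint)                      AS TableRowCount,
           CAST(REPLACE(data, ' KB', '') AS bigint)  AS DataSizeKb
    FROM #SpaceUsed;

    DROP TABLE #SpaceUsed;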

StandardDatasetOptimize

  • Set lock – @DatasetId + ‘_Optimize’
  • Optimize domain tables – DomainTableIndexOptimize
    • Same as below for dataset tables
  • Groom optimization history (StandardDatasetOptimizationHistory) – delete records that are older than 7 days
  • Add indexes for optimization (not optimized or defined for insert tables)
  • Check whether non-optimal non-insert tables are really non-optimal:
    • For each record in the table map:
      • Scan through all registered indexes defined for all storage tables:
        • Check if the index needs a reorganize or rebuild
        • Check if the index wasn’t optimized during the last 2 optimization sessions
        • Stop scanning if a non-optimal index is found
        • Stop scanning if outdated stats are found
      • If all indexes are optimal, for non-insert tables:
        • Update StandardDatasetTableMap – set OptimizedInd, StartDateTime and EndDateTime (important: dates are based on [DateTime] of the main table!)
        • Create a [DateTime] check constraint on the main table to ensure the query optimizer can use it
  • Optimize next index in queue (only 1 lengthy optimization at a time)
    • Decide method: online rebuild >> offline rebuild (if within the blocking maintenance interval) >> reorganize (insert tables only); see the sketch after this list
    • Execute
    • If optimization was successful – update history
    • Update stats if necessary
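
The reorg-versus-rebuild decision is the classic fragmentation test. A sketch with commonly used thresholds (the thresholds and the ONLINE option here are my assumptions, not values taken from the SP):

    -- Sketch: fragmentation-driven choice between reorganize and rebuild
    DECLARE @TableName     sysname = N'Perf.PerfRaw_00000000000000000000000000000000',  -- placeholder
            @Fragmentation float;

    SELECT TOP 1 @Fragmentation = ips.avg_fragmentation_in_percent
    FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(@TableName), NULL, NULL, 'LIMITED') AS ips
    WHERE ips.index_id > 0   -- skip the heap entry, if any
    ORDER BY ips.avg_fragmentation_in_percent DESC;

    IF @Fragmentation > 30
        PRINT 'ALTER INDEX ALL ON ' + @TableName + ' REBUILD WITH (ONLINE = ON);'  -- offline inside the blocking window
    ELSE IF @Fragmentation > 10
        PRINT 'ALTER INDEX ALL ON ' + @TableName + ' REORGANIZE;'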

StandardDatasetAggregate

  • Set lock – @DatasetId + ‘_Aggregate’
  • Get the lowest type of aggregation
  • Generate aggregation interval (based on StandardDataset.DefaultAggregationIntervalCount, StandardDatasetAggregation.AggregationIntervalDurationMinutes and StandardDatasetAggregation.AggregationStartDelayMinutes)
  • Generate aggregation intervals for larger types of aggregation
  • Add new records to the StandardDatasetAggregationHistory (daylight saving is respected)
  • Select the interval to work on: oldest never-aggregated >> oldest dirty (see the sketch after this list)
  • If interval exists:
    • Find and delete old aggregates (they can only be in one table)
      • Get table params from StandardDatasetTableMap
      • Get StandardDatasetAggregation.DeleteAggregationStoredProcedureName
      • Execute dataset- and aggregation-specific SP
    • Get GUID for insert table (StandardDatasetGetInsertTableGuid)
    • Get StandardDatasetAggregation.BuildAggregationStoredProcedureName
    • Execute dataset- and aggregation-specific SP
    • Reset StandardDatasetTableMap.OptimizedInd if aggregation existed
    • Reallocate storage (StandardDatasetAllocateStorage, see above), if required
    • Update aggregation history
  • Release lock
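
The interval-selection rule (“oldest never-aggregated, then oldest dirty”) might look like this (a sketch; I'm assuming the AggregationCount and DirtyInd columns of StandardDatasetAggregationHistory behave as their names suggest):

    -- Sketch: pick the interval to (re)aggregate next
    DECLARE @DatasetId           uniqueidentifier,  -- assumed set earlier
            @AggregationTypeId   int,               -- assumed set earlier
            @AggregationDateTime datetime;

    SELECT TOP 1 @AggregationDateTime = AggregationDateTime
    FROM dbo.StandardDatasetAggregationHistory
    WHERE DatasetId = @DatasetId
          AND AggregationTypeId = @AggregationTypeId
          AND (AggregationCount = 0 OR DirtyInd = 1)           -- never aggregated, or dirty
    ORDER BY CASE WHEN AggregationCount = 0 THEN 0 ELSE 1 END, -- never-aggregated first
             AggregationDateTime;                              -- then oldest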

StandardDatasetBuildCoverView

  • Get StandardDataset.SchemaName
  • For each table in a given Dataset-aggregation pair (StandardDatasetAggregationStorage):
    • Get BaseTableName and CoverViewSelectClause
    • Build CREATE VIEW statement
    • Build GRANT statement
    • Build ALTER VIEW statement (includes all tables from the table map)
    • Execute statements
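
The generated view is essentially a UNION ALL over every partition table currently listed in the table map; its shape is something like this (all names below are illustrative, the real statement is assembled from CoverViewSelectClause and the table map):

    -- Illustrative shape of a generated cover view (names are made up)
    ALTER VIEW Perf.vPerfRaw
    AS
    SELECT [DateTime], ManagedEntityRowId, PerformanceRuleInstanceRowId, SampleValue
    FROM Perf.PerfRaw_11111111111111111111111111111111
    UNION ALL
    SELECT [DateTime], ManagedEntityRowId, PerformanceRuleInstanceRowId, SampleValue
    FROM Perf.PerfRaw_22222222222222222222222222222222;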

Conclusion

SCOM standard datasets are self-maintained and provide universal functionality such as aggregation handling, partitioning (even on SQL Server Standard Edition), grooming, and index maintenance. Standard datasets are defined in Management Packs, so if the existing out-of-box datasets for whatever reason do not provide the required functionality, vendors can implement new datasets for different kinds of monitoring data.

One thought on “SCOM Standard Dataset maintenance workflow”

  1. This is a great write-up, thank you very much. One point worth mentioning is that the Exchange 2010 dataset does not fall under the standard dataset maintenance and therefore will keep collecting and building up data in the data warehouse. If one is using the pre-SP version, the problem is compounded because it collects way too much data.

    We are scheduling the standard dataset maintenance job against the Exchange dataset on a regular basis. MS support said that the Exchange 2010 dataset not being part of the grooming and partitioning is an unresolved issue.
