What’s wrong with optimized performance collection rules in SCOM?

I came across an interesting support case last week. The customer asked why optimization had been removed from performance collection rules. Although I knew the answer, I spent some time building a lab to demonstrate the impact of optimization. So, what does “optimization” mean in terms of SCOM performance collection? Here is an excerpt from the MSDN article:

Unoptimized data results in a performance data item being returned once every specified polling interval. Optimized data results in a performance data item being returned only when the value delta has changed by the specified tolerance amount.

It clearly says that an optimized performance collection rule will not return a sample unless its value differs from the previously returned one by more than the tolerance. This means that some samples are filtered out and are written to neither the operational database nor the data warehouse. This kind of “optimization” has two implications, one good and one bad:

  1. Good: some space is saved in both databases.
  2. Bad: there is no way to let the data warehouse know that some samples were collected but not written, so the aggregation logic never knows that a portion of the data is missing. It uses only the recorded values to calculate hourly and daily aggregates, so those aggregates are incorrect whenever optimization actually suppresses samples.

Here is an illustration: two performance collection rules (one optimized and one raw) collect the same counter. The counter holds a constant value of 10 during the first 30 minutes of each hour and a random value during the last 30 minutes. Take a look at how large the difference between the hourly aggregated values is:

[Figure: SCOM performance collection, optimized vs. raw]
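The same effect can be approximated outside SCOM with a small simulation. The sketch below is a simplified model, not SCOM's actual implementation: the function names, the one-sample-per-minute interval, the 0 to 100 range of the random values, and the `tolerance` and `max_separation` parameters are all assumptions made for illustration. It keeps a sample only when it differs from the last kept sample by more than the tolerance (or when too many samples in a row have been suppressed), then compares the hourly average of the raw samples with the average of the samples that survive.

```python
import random
from statistics import mean


def optimized_filter(samples, tolerance, max_separation):
    """Simplified model of optimized collection (an assumption, not SCOM code).

    A sample is kept if it is the first one, if it differs from the last kept
    sample by more than `tolerance`, or if `max_separation` samples in a row
    have already been suppressed.
    """
    kept = []
    last = None
    skipped = 0
    for value in samples:
        if last is None or abs(value - last) > tolerance or skipped >= max_separation:
            kept.append(value)
            last = value
            skipped = 0
        else:
            skipped += 1
    return kept


def one_hour(rng, samples_per_half_hour=30):
    """One hour of data: constant 10 for 30 minutes, then random values."""
    stable = [10.0] * samples_per_half_hour
    volatile = [rng.uniform(0, 100) for _ in range(samples_per_half_hour)]
    return stable + volatile


rng = random.Random(42)  # fixed seed so repeated runs give the same numbers
raw = one_hour(rng)
optimized = optimized_filter(raw, tolerance=1.0, max_separation=12)

print("raw:      ", len(raw), "samples, hourly average", round(mean(raw), 1))
print("optimized:", len(optimized), "samples, hourly average", round(mean(optimized), 1))
```

In this model the stable half of the hour shrinks from 30 samples to a handful, while almost every sample from the volatile half is kept, so a plain average over the recorded samples is pulled toward the volatile values.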

Conclusion: optimization of performance collection rules affects the accuracy of aggregated data in the SCOM data warehouse. This is especially true for metrics that have periods of volatility and periods of stability (due to physical limits, such as CPU Usage % never exceeding 100%, or due to workload seasonality, such as business hours). Also, anyone who uses data stored in the data warehouse for forecasting may get significantly incorrect predictions.
