Cubes Property Sheet: Summarize To tab

Use this tab to identify the method to use for deciding which values are pre-calculated and stored in the analytic workspace, and which values are calculated on demand. You can choose from two types of summarization.

No Presummarization

Calculates all aggregate data at runtime.

Cost-Based Presummarization

Use this method to allow the OLAP engine to identify the most cost-effective areas of the data for presummarization. Use the sliders to identify the percentages, or type a number in the adjacent text box. You may want to adjust the percentages over time to balance runtime performance with maintenance restrictions on time and disk space. To tune your database, you should track both runtime performance and maintenance statistics at various percentages of presummarization.

For cubes with fewer than five dimensions, you may want to presummarize as much as 80 to 100% to get the best run-time performance, because the associated maintenance costs are relatively low. Do not drop below 20%, because querying times will degrade significantly.

For cubes with more than eight dimensions, you may want to precompute 50%. Compared to full materialization, 50% yields significant build-time improvements without significantly reducing query performance. Precomputing only 20% yields significant additional build-time improvements, but can significantly increase querying times.

When the cube is partitioned, you can set the top partition to a different percentage than the rest of the cube. The top partition is typically the largest partition, and it is the one initially queried by many applications. To improve runtime performance, you should increase this percentage if your build window permits it.

Level-Based Presummarization

For each dimension, select the levels that you want to precalculate. The base level should always be selected.

Dimension

Select each dimension to display its levels.

Levels

Select the levels you wish to calculate and store as part of the build process.

The best method for identifying levels for stored data is to determine the ratio of dimension members at each level, and to keep the ratio of members to be calculated on the fly at less than 10:1. This method assures that all answer sets can be returned quickly. Either the data is stored in the analytic workspace, or it can be calculated by rolling up 10 or fewer values into a single number. The time needed to roll up 10 values is trivial, and a well designed application will limit return sets to an amount of data that an analyst can scrutinize easily. You can modify this ratio using your judgment on how frequently a level is accessed.

Slower varying dimensions take longer to aggregate because the data is scattered throughout its storage space. If you are optimizing for data maintenance, then fully aggregate the faster varying dimensions and use skip-level aggregation on the slower varying dimensions.