Analyzing Database Design

While the initial dimension design is still on paper, you should review the design according to a set of guidelines. The guidelines help you fine-tune the database and leverage the multidimensional technology. The guidelines are processes or questions that help you achieve an efficient design and meet consolidation and calculation goals.

The number of members needed to describe a potential data point should determine the number of dimensions. If you are not sure whether you should delete a dimension, keep it and apply more analysis rules until you feel confident about deleting or keeping it.

Use the information in the following topics to analyze and improve your database design.

Dense and Sparse Dimensions

Which dimensions are sparse and which dense affects performance. For an introduction, see Sparse and Dense Dimensions. See Designing an Outline to Optimize Performance.

Standard and Attribute Dimensions

For simplicity, the examples in this topic show alternative arrangements for what initially was designed as two dimensions. You can apply the same logic to all combinations of dimensions.

Consider the design for a company that sells products to multiple customers over multiple markets; the markets are unique to each customer:

             Cust A  Cust B  Cust C
New York     100     N/A     N/A
Illinois     N/A     150     N/A
California   N/A     N/A     30

Cust A is only in New York, Cust B is only in Illinois, and Cust C is only in California. The company can define the data in one standard dimension:

Market
  New York
    Cust A
  Illinois
    Cust B
  California
    Cust C

However, if you look at a larger sampling of data, you may see that many customers can be in each market. Cust A and Cust E are in New York; Cust B, Cust M, and Cust P are in Illinois; Cust C and Cust F are in California. In this situation, the company typically defines the large dimension, Customer, as a standard dimension and the smaller dimension, Market, as an attribute dimension. The company associates the members of the Market dimension as attributes of the members of the Customer dimension. The members of the Market dimension describe locations of the customers.

Customer (Standard dimension)
  Cust A (Attribute:New York)
  Cust B (Attribute:Illinois)
  Cust C (Attribute:California)
  Cust E (Attribute:New York)
  Cust F (Attribute:California)
  Cust M (Attribute:Illinois)
  Cust P (Attribute:Illinois)
Market (Attribute dimension)
  New York
  Illinois
  California

Consider another situation. Again, the company sells products to multiple customers over multiple markets, but the company can ship to a customer that has locations in different markets:

             Cust A  Cust B  Cust C
New York     100      75     N/A
Illinois     N/A     150     N/A
California   150     N/A      30

Cust A is in New York and California. Cust B is in New York and Illinois. Cust C is only in California. Using an attribute dimension does not work in this situation; a customer member cannot have multiple attribute members. Therefore, the company designs the data in two standard dimensions:

Customer
  Cust A
  Cust B
  Cust C
Market
  New York
  Illinois
  California

Dimension Combinations

Break each combination of two dimensions into a two-dimensional matrix. For example, proposed dimensions at TBC (as listed in Table 2, TBC Sample Dimensions) include the following combinations:

  • Year across Measures

  • Year across Product

  • Year across Market

  • Year across Scenario

  • Measures across Product

  • Measures across Market

  • Measures across Scenario

  • Market across Product

  • Market across Scenario

  • Scenario across Product

  • Ounces across Pkg Type

Ounces and Pkg Type, as attribute dimensions associated with the Product dimension, can be considered with the Product dimension.

To help visualize each dimension, draw a matrix and include a few of the first-generation members. Figure 28, Analyzing Dimensional Relationships shows a simplified set of matrixes for three dimensions.

Figure 28. Analyzing Dimensional Relationships

This image shows a simplified set of matrixes for three dimensions, illustrating how to analyze dimensional relationships.

For each combination of dimensions, ask three questions:

  • Does it add analytic value?

  • Does it add utility for reporting?

  • Does it avoid an excess of unused combinations?

For each combination, the answers to the questions help determine whether the combination is valid for the database. Ideally, the answer to each question is yes. If not, consider rearranging the data into more-meaningful dimensions. As you work through this process, discuss information needs with users.

Repetition in Outlines

The repetition of elements in an outline often indicates a need to split dimensions. The following examples show you how to avoid repetition.

In Figure 29, Example of Eliminating Repetition By Creating a Scenario Dimension, the left column, labeled “Repetition,” shows Profit, Margin, Sales, COGS, and Expenses repeated under Budget and Actual in the Accounts dimension. The right column, labeled “No Repetition,” separates Budget and Actual into another dimension (Scenario), leaving just one set of Profit, Margin, Sales, COGS, and Expenses members in the Accounts dimension. This approach simplifies the outline and provides a simpler view of the budget and actual figures of the other dimensions in the database.

Figure 29. Example of Eliminating Repetition By Creating a Scenario Dimension

This image provides one solution to a repetition problem, as described in the text preceding the image.

In Figure 30, Example of Eliminating Repetition By Creating an Attribute Dimension, the left column, labeled “Repetition,” uses shared members in the Diet dimension to analyze diet beverages. Members 100–20, 200–20, and 300–20 are repeated: once under Diet, and once under their respective parents. The right column, labeled “No Repetition,” simplifies the outline by creating a Diet attribute dimension of type Boolean (True or False). All members are shown only once, under their respective parents, and are tagged with the appropriate attribute (“Diet: True” or “Diet: False”).

Figure 30. Example of Eliminating Repetition By Creating an Attribute Dimension

This image provides one solution to a repetition problem, as described in the text preceding the image.

Attribute dimensions also provide additional analytic capabilities. See Designing Attribute Dimensions.

Interdimensional Irrelevance

Interdimensional irrelevance occurs when many members of a dimension are irrelevant across other dimensions. Essbase defines irrelevant data as data that Essbase stores only at the summary (dimension) level. In such a situation, you may be able to remove a dimension from the database and add its members to another dimension or split the model into separate databases.

For example, TBC considered analyzing salaries as a member of the Measures dimension. But salary information often proves irrelevant in the context of a corporate database. Most salaries are confidential and apply to individuals. The individual and the salary typically represent one cell, with no reason to intersect with any other dimension.

TBC considered separating employees into a separate dimension. Table 4 shows an example of how TBC analyzed the proposed Employee dimension for interdimensional irrelevance. Members of the proposed Employee dimension (represented in the table header row) are compared with members of the Measures dimension (represented in the left-most column). The Measures dimension members (such as Revenue) apply to All Employees; only the Salary measure is relevant to individual employees.

Table 4. Example of Interdimensional Irrelevance

 

Joe Smith

Mary Jones

Mike Garcia

All Employees

Revenue

Irrelevance

Irrelevance

Irrelevance

Relevance

Variable Costs

Irrelevance

Irrelevance

Irrelevance

Relevance

COGS

Irrelevance

Irrelevance

Irrelevance

Relevance

Advertising

Irrelevance

Irrelevance

Irrelevance

Relevance

Salaries

Relevance

Relevance

Relevance

Relevance

Fixed Costs

Irrelevance

Irrelevance

Irrelevance

Relevance

Expenses

Irrelevance

Irrelevance

Irrelevance

Relevance

Profit

Irrelevance

Irrelevance

Irrelevance

Relevance

Reasons to Split Databases

Because individual employee information is irrelevant to the other information in the database, and also because adding an Employee dimension would substantially increase database storage needs, TBC created a separate Human Resources (HR) database. The new HR database contains a group of related dimensions and includes salaries, benefits, insurance, and 401(k) plans.

There are many reasons for splitting a database; for example, suppose that a company maintains an organizational database that contains several international subsidiaries in several time zones. Each subsidiary relies on time-sensitive financial calculations. You can split the database for groups of subsidiaries in the same time zone to ensure that financial calculations are timely. You can also use a partitioned application to separate information by subsidiary.

Checklist to Analyze the Database Design

Use the following checklist to analyze the database design:

  • Have you minimized the number of dimensions?

  • For each dimensional combination, did you ask:

    • Does it add analytic value?

    • Does it add utility for reporting?

    • Does it avoid an excess of unused combinations?

  • Did you avoid repetition in the outline?

  • Did you avoid interdimensional irrelevance?

  • Did you split the databases as necessary?