9 SQL for Reporting and Analysis

This section discusses how to produce effective business reports derived from business queries, and includes the following sections:

Use of SQL Analytic Capabilities to Answer Business Queries

Oracle Database has enhanced SQL's analytical processing capabilities by introducing a family of aggregate and analytic SQL functions. These functions enable you to calculate ranking, percentiles, and moving averages, and allow you to answer queries such as the following:

  • What are the top 10 products sold by country?

  • What is the weekly moving average for products in stock?

  • What percentage of total sales occurs during the fourth quarter?

  • How much higher is the average discount in the fourth quarter than the discount for yearly average?

  • What would be the profitability ranking of existing oil refineries if 20 percent of the refineries in a country were closed?

Aggregate functions are a fundamental part of data warehousing because they enable you to derive different types of totals, and then use these totals for additional calculations. To improve aggregate performance in your data warehouse, Oracle Database provides several extensions to the GROUP BY clause. The CUBE, ROLLUP, GROUPING, and GROUPING SETS functions make querying and reporting easier and faster. The ROLLUP function calculates aggregations such as SUM, COUNT, MAX, MIN, and AVG at increasing levels of aggregation, from the most individual detailed level up to a grand summary total. The CUBE function is an extension similar to ROLLUP, enabling a single statement to calculate all possible combinations of aggregations.

Analytic functions compute an aggregate value based on a group of rows. These functions differ from aggregate functions in that they return multiple rows for each group. This group of rows is called a window. This window enables calculations such as moving average or cumulative total. For each row, a window of rows is defined. This window determines the range of rows used to perform the calculations for the current row. Window sizes can be based on either a logical interval such as time, or a physical number of rows. Some functions are used only with windows and are often referred to as window functions.

To enhance performance, aggregate and analytic functions can each perform in parallel: multiple processes can simultaneously execute all of these functions. These capabilities make calculations, analysis, and reporting easier and more efficient, thereby enhancing data warehouse performance, scalability, and simplicity.

You can take advantage of the advanced SQL and PL/SQL capabilities Oracle Database offers to translate business queries into SQL. This section discusses these advanced capabilities, and includes the following topics:

How to Add Totals to Reports Using the ROLLUP Function

The ROLLUP function enables a SELECT statement to calculate multiple levels of subtotals across a specified group of dimensions, as well as a grand total. The ROLLUP function is a simple extension to the GROUP BY clause, so its syntax is extremely easy to use. The ROLLUP function is highly efficient, adding minimal overhead to a query.The action of the ROLLUP function is straightforward: it creates subtotals that roll up from the most detailed level to a grand total, following a grouping list specified in the ROLLUP function. ROLLUP takes as its argument an ordered list of grouping columns. First, it calculates the standard aggregate values specified in the GROUP BY clause. Then, it creates progressively higher-level subtotals, moving from right to left through the list of grouping columns. Finally, it creates a grand total.

When to Use the ROLLUP Function

When your tasks involve subtotals, particularly when the subtotals are along a hierarchical dimension such as time or geography, use the ROLLUP function. Also, a ROLLUP function can simplify and speed up the maintenance of materialized views.

Example: Using the ROLLUP Function

A common request when preparing business reports is to find quarterly sales revenue across different product categories, in order by the amount of revenue. The following query achieves this, and is used for the starting point for building more complicated queries later:

SELECT t.calendar_quarter_desc quarter
, p.prod_category category
, TO_CHAR(SUM(s.amount_sold),'L999G999G990D00') revenue
FROM times t
, products p
, sales s
WHERE t.time_id = s.time_id
AND   p.prod_id = s.prod_id
AND   s.time_id BETWEEN TO_DATE('01-JAN-2001','dd-MON-yyyy') 
AND TO_DATE('31-DEC-2001','dd-MON-yyyy')
GROUP BY t.calendar_quarter_desc, p.prod_category
ORDER BY t.calendar_quarter_desc
, SUM(s.amount_sold);

QUARTER  CATEGORY                                  REVENUE
-------  ------------------------------            --------------
2001-01  Software/Other                               $860,819.81
2001-01  Electronics                                $1,239,287.71
2001-01  Hardware                                   $1,301,343.45
2001-01  Photo                                      $1,370,706.38
2001-01  Peripherals and Accessories                $1,774,940.09
2001-02  Software/Other                               $872,157.38
2001-02  Electronics                                $1,144,187.90
2001-02  Hardware                                   $1,557,059.59
2001-02  Photo                                      $1,563,475.51
2001-02  Peripherals and Accessories                $1,785,588.01
2001-03  Software/Other                               $877,630.85
2001-03  Electronics                                $1,017,536.82
2001-03  Photo                                      $1,607,315.63
2001-03  Hardware                                   $1,651,454.29
2001-03  Peripherals and Accessories                $2,042,061.04
2001-04  Software/Other                               $943,296.36
2001-04  Hardware                                   $1,174,512.68
2001-04  Electronics                                $1,303,838.52
2001-04  Photo                                      $1,792,131.39
2001-04  Peripherals and Accessories                $2,257,118.57

This query is useful, but you may want to see the totals for different categories in the same report. The following example illustrates how you can use the ROLLUP function to add the totals to the original query:

SELECT t.calendar_quarter_desc quarter
, p.prod_category category
, TO_CHAR(SUM(s.amount_sold),'L999G999G990D00') revenue
FROM times t
, products p
, sales s
WHERE t.time_id = s.time_id
AND   p.prod_id = s.prod_id
AND   s.time_id BETWEEN TO_DATE('01-JAN-2001','dd-MON-yyyy') 
AND TO_DATE('31-DEC-2001','dd-MON-yyyy')
GROUP BY ROLLUP(t.calendar_quarter_desc, p.prod_category)
ORDER BY t.calendar_quarter_desc
, SUM(s.amount_sold);
 
QUARTER   CATEGORY                                   REVENUE
-------   ------------------------------             -------------
2001-01   Software/Other                               $860,819.81
2001-01   Electronics                                $1,239,287.71
2001-01   Hardware                                   $1,301,343.45
2001-01   Photo                                      $1,370,706.38
2001-01   Peripherals and Accessories                $1,774,940.09
2001-01                                              $6,547,097.44
2001-02   Software/Other                               $872,157.38
2001-02   Electronics                                $1,144,187.90
2001-02   Hardware                                   $1,557,059.59
2001-02   Photo                                      $1,563,475.51
2001-02   Peripherals and Accessories                $1,785,588.01
2001-02                                              $6,922,468.39
2001-03   Software/Other                               $877,630.85
2001-03   Electronics                                $1,017,536.82
2001-03   Photo                                      $1,607,315.63
2001-03   Hardware                                   $1,651,454.29
2001-03   Peripherals and Accessories                $2,042,061.04
2001-03                                              $7,195,998.63
2001-04   Software/Other                               $943,296.36
2001-04   Hardware                                   $1,174,512.68
2001-04   Electronics                                $1,303,838.52
2001-04   Photo                                      $1,792,131.39
2001-04   Peripherals and Accessories                $2,257,118.57
2001-04                                              $7,470,897.52
                                                    $28,136,461.98

How to Separate Totals at Different Levels Using the CUBE Function

The CUBE function takes a specified set of grouping columns and creates subtotals for all of the possible combinations. In terms of multidimensional analysis, the CUBE function generates all the subtotals that can be calculated for a data cube with the specified dimensions. If you have specified CUBE(time, region, department), the result set will include all the values that would be included in an equivalent ROLLUP function plus additional combinations.

When to Use the CUBE Function

Consider using the CUBE function in any situation requiring cross-tabular reports. The data needed for cross-tabular reports can be generated with a single SELECT statement using the CUBE function. Like ROLLUP, the CUBE function can be helpful in generating materialized views. Note that population of materialized views is even faster if the query containing a CUBE function executes in parallel.

Example: Using the CUBE Function

You may want to get not only quarterly totals, but also totals for the different product categories for the selected period. The CUBE function enables this calculation, as shown in the following example.

To use the CUBE function:

SELECT t.calendar_quarter_desc quarter
, p.prod_category category
, TO_CHAR(SUM(s.amount_sold),'L999G999G990D00') revenue
FROM times t
, products p
, sales s
WHERE t.time_id = s.time_id
AND   p.prod_id = s.prod_id
AND   s.time_id BETWEEN TO_DATE('01-JAN-2001','dd-MON-yyyy') 
AND TO_DATE('31-DEC-2001','dd-MON-yyyy')
GROUP BY CUBE(t.calendar_quarter_desc, p.prod_category)
ORDER BY t.calendar_quarter_desc
, SUM(s.amount_sold);
 
QUARTER   CATEGORY                                   REVENUE
-------   ------------------------------             -------------
2001-01   Software/Other                               $860,819.81
2001-01   Electronics                                $1,239,287.71
2001-01   Hardware                                   $1,301,343.45
2001-01   Photo                                      $1,370,706.38
2001-01   Peripherals and Accessories                $1,774,940.09
2001-01                                              $6,547,097.44
2001-02   Software/Other                               $872,157.38
2001-02   Electronics                                $1,144,187.90
2001-02   Hardware                                   $1,557,059.59
2001-02   Photo                                      $1,563,475.51
2001-02   Peripherals and Accessories                $1,785,588.01
2001-02                                              $6,922,468.39
2001-03   Software/Other                               $877,630.85
2001-03   Electronics                                $1,017,536.82
2001-03   Photo                                      $1,607,315.63
2001-03   Hardware                                   $1,651,454.29
2001-03   Peripherals and Accessories                $2,042,061.04
2001-03                                              $7,195,998.63
2001-04   Software/Other                               $943,296.36
2001-04   Hardware                                   $1,174,512.68
2001-04   Electronics                                $1,303,838.52
2001-04   Photo                                      $1,792,131.39
2001-04   Peripherals and Accessories                $2,257,118.57
2001-04                                              $7,470,897.52
          Software/Other                             $3,553,904.40
          Electronics                                $4,704,850.95
          Hardware                                   $5,684,370.01
          Photo                                      $6,333,628.91
          Peripherals and Accessories                $7,859,707.71
                                                    $28,136,461.98

How to Add Subtotals Using the GROUPING Function

Two challenges arise with the use of the ROLLUP and CUBE functions. How can you programmatically determine which result set rows are subtotals, and how do you find the exact level of aggregation for a given subtotal? You often need to use subtotals in calculations such as percent-of-totals, so you need an easy way to determine which rows are the subtotals. What happens if query results contain both stored NULL values and null values created by a ROLLUP or CUBE function? How can you differentiate between the two?

The GROUPING function handles this problem. Using a single column as its argument, the GROUPING function returns 1 when it encounters a null value created by a ROLLUP or CUBE function. That is, if the null value indicates the row is a subtotal, GROUPING returns a value of 1. Any other type of value, including a stored NULL value, returns a value of 0.

When to Use the GROUPING Function

When you need to handle NULL values or null values created by a ROLLUP or CUBE operation, use the GROUPING function. One reason you may want to work with null values is to put a description in null fields, for example, text describing that a number represents a total.

Example: Using the GROUPING Function

You might want more descriptive columns in your report because it is not always clear when a value represents a total. The GROUPING function enables you to insert labels showing totals in the results of the query as shown in the following example.

To use the GROUPING function:

SELECT DECODE(GROUPING(t.calendar_quarter_desc)
             , 0, t.calendar_quarter_desc
             , 1, 'TOTAL'
             ) quarter
, DECODE(GROUPING(p.prod_category) 
        , 0,  p.prod_category
        , 1, 'TOTAL'
        ) category
, TO_CHAR(SUM(s.amount_sold),'L999G999G990D00') revenue
FROM times t
, products p
, sales s
WHERE t.time_id = s.time_id
AND   p.prod_id = s.prod_id
AND   s.time_id BETWEEN TO_DATE('01-JAN-2001','dd-MON-yyyy') 
AND TO_DATE('31-DEC-2001','dd-MON-yyyy')
GROUP BY CUBE(t.calendar_quarter_desc, p.prod_category)
ORDER BY t.calendar_quarter_desc
, SUM(s.amount_sold);
 
QUARTER   CATEGORY                           REVENUE
-------   ------------------------------     -------------
2001-01   Software/Other                       $860,819.81
2001-01   Electronics                        $1,239,287.71
2001-01   Hardware                           $1,301,343.45
2001-01   Photo                              $1,370,706.38
2001-01   Peripherals and Accessories        $1,774,940.09
2001-01   TOTAL                              $6,547,097.44
2001-02   Software/Other                       $872,157.38
2001-02   Electronics                        $1,144,187.90
2001-02   Hardware                           $1,557,059.59
2001-02   Photo                              $1,563,475.51
2001-02   Peripherals and Accessories        $1,785,588.01
2001-02   TOTAL                              $6,922,468.39
2001-03   Software/Other                       $877,630.85
2001-03   Electronics                        $1,017,536.82
2001-03   Photo                              $1,607,315.63
2001-03   Hardware                           $1,651,454.29
2001-03   Peripherals and Accessories        $2,042,061.04
2001-03   TOTAL                              $7,195,998.63
2001-04   Software/Other                       $943,296.36
2001-04   Hardware                           $1,174,512.68
2001-04   Electronics                        $1,303,838.52
2001-04   Photo                              $1,792,131.39
2001-04   Peripherals and Accessories        $2,257,118.57
2001-04   TOTAL                              $7,470,897.52
TOTAL     Software/Other                     $3,553,904.40
TOTAL     Electronics                        $4,704,850.95
TOTAL     Hardware                           $5,684,370.01
TOTAL     Photo                              $6,333,628.91
TOTAL     Peripherals and Accessories        $7,859,707.71
TOTAL     TOTAL                             $28,136,461.98

How to Combine Aggregates Using the GROUPING SETS Function

You can selectively specify the set of groups that you want to create using a GROUPING SETS function within a GROUP BY clause. This allows precise specification across multiple dimensions without computing the whole data cube. In other words, not all dimension totals are needed.

When to Use the GROUPING SETS Function

When you want particular subtotals in a data cube, but not all that are possible, use the GROUPING SETS function.

Example: Using the GROUPING SETS Function

You may want to see the total sales numbers based on sales channel. Instead of adding a separate query to retrieve the totals per channel class, you can use the GROUPING SETS function as illustrated in the following example.

To use the GROUPING SETS function:

SELECT DECODE(GROUPING(t.calendar_quarter_desc)
             , 0, t.calendar_quarter_desc
             , 1, 'TOTAL'
             ) quarter
, DECODE(GROUPING(c.channel_class)
        , 0, c.channel_class
        , 1 , '--all--'
        ) channel
, DECODE(GROUPING(p.prod_category)
        , 0,  p.prod_category
        , 1, 'TOTAL'
        ) category
, TO_CHAR(SUM(s.amount_sold),'L999G999G990D00') revenue
FROM times t
, products p
, channels c
, sales s
WHERE t.time_id = s.time_id
AND   p.prod_id = s.prod_id
AND   c.channel_id = s.channel_id
AND   s.time_id BETWEEN TO_DATE('01-JAN-2001','dd-MON-yyyy') 
AND TO_DATE('31-DEC-2001','dd-MON-yyyy')
GROUP BY GROUPING SETS(c.channel_class, 
    CUBE(t.calendar_quarter_desc, p.prod_category))
ORDER BY t.calendar_quarter_desc
, SUM(s.amount_sold);
 
QUARTER  CHANNEL       CATEGORY                     REVENUE
-------  ------------- ---------------------------- -------------
2001-01  --all--       Software/Other                 $860,819.81
2001-01  --all--       Electronics                  $1,239,287.71
2001-01  --all--       Hardware                     $1,301,343.45
2001-01  --all--       Photo                        $1,370,706.38
2001-01  --all--       Peripherals and Accessories  $1,774,940.09
2001-01  --all--       TOTAL                        $6,547,097.44
2001-02  --all--       Software/Other                 $872,157.38
2001-02  --all--       Electronics                  $1,144,187.90
2001-02  --all--       Hardware                     $1,557,059.59
2001-02  --all--       Photo                        $1,563,475.51
2001-02  --all--       Peripherals and Accessories  $1,785,588.01
2001-02  --all--       TOTAL                        $6,922,468.39
2001-03  --all--       Software/Other                 $877,630.85
2001-03  --all--       Electronics                  $1,017,536.82
2001-03  --all--       Photo                        $1,607,315.63
2001-03  --all--       Hardware                     $1,651,454.29
2001-03  --all--       Peripherals and Accessories  $2,042,061.04
2001-03  --all--       TOTAL                        $7,195,998.63
2001-04  --all--       Software/Other                 $943,296.36
2001-04  --all--       Hardware                     $1,174,512.68
2001-04  --all--       Electronics                  $1,303,838.52
2001-04  --all--       Photo                        $1,792,131.39
2001-04  --all--       Peripherals and Accessories  $2,257,118.57
2001-04  --all--       TOTAL                        $7,470,897.52
TOTAL    --all--       Software/Other               $3,553,904.40
TOTAL    --all--       Electronics                  $4,704,850.95
TOTAL    --all--       Hardware                     $5,684,370.01
TOTAL    --all--       Photo                        $6,333,628.91
TOTAL    Indirect      TOTAL                        $6,709,496.66
TOTAL    --all--       Peripherals and Accessories  $7,859,707.71
TOTAL    Others        TOTAL                        $8,038,529.96
TOTAL    Direct        TOTAL                       $13,388,435.36
TOTAL    --all--       TOTAL                       $28,136,461.98

How to Calculate Rankings Using the RANK Function

Business information processing requires advanced calculations, including complex ranking, subtotals, moving averages, and lead/lag comparisons. These aggregation and analysis tasks are essential in creating business intelligence queries, and are accomplished by the use of window functions.

When to Use the RANK Function

When you want to perform complex queries and analyze the query results, use the RANK function.

Example: Using the RANK Function

Users would like to see an additional column that shows the rank of any revenue number within the quarter. The following example illustrates using the RANK function to achieve this.

To use the RANK function:

SELECT DECODE(GROUPING(t.calendar_quarter_desc)
             , 0, t.calendar_quarter_desc
             , 1, 'TOTAL'
             ) quarter
, DECODE(GROUPING(t.calendar_quarter_desc) + GROUPING(p.prod_category)
        , 0, RANK() OVER (PARTITION BY t.calendar_quarter_desc 
    ORDER BY SUM(s.amount_sold))
        , 1, null
        ) ranking
, DECODE(GROUPING(c.channel_class)
        , 0, c.channel_class
        , 1 , '--all--'
        ) channel
, DECODE(GROUPING(p.prod_category)
        , 0,  p.prod_category
        , 1, 'TOTAL'
        ) category
, TO_CHAR(SUM(s.amount_sold),'L999G999G990D00') revenue
FROM times t
, products p
, channels c
, sales s
WHERE t.time_id = s.time_id
AND   p.prod_id = s.prod_id
AND   c.channel_id = s.channel_id
AND   s.time_id BETWEEN TO_DATE('01-JAN-2001','dd-MON-yyyy') 
 AND TO_DATE('31-DEC-2001','dd-MON-yyyy')
GROUP BY GROUPING SETS(c.channel_class, 
CUBE(t.calendar_quarter_desc, p.prod_category))
ORDER BY t.calendar_quarter_desc
, SUM(s.amount_sold);
 
QUARTER  RANKING  CHANNEL    CATEGORY                       REVENUE
-------  -------  --------   ----------------------------   --------------
2001-01        1  --all--    Software/Other                    $860,819.81
2001-01        2  --all--    Electronics                     $1,239,287.71
2001-01        3  --all--    Hardware                        $1,301,343.45
2001-01        4  --all--    Photo                           $1,370,706.38
2001-01        5  --all--    Peripherals and Accessories     $1,774,940.09
2001-01           --all--    TOTAL                           $6,547,097.44
2001-02        1  --all--    Software/Other                    $872,157.38
2001-02        2  --all--    Electronics                     $1,144,187.90
2001-02        3  --all--    Hardware                        $1,557,059.59
2001-02        4  --all--    Photo                           $1,563,475.51
2001-02        5  --all--    Peripherals and Accessories     $1,785,588.01
2001-02           --all--    TOTAL                           $6,922,468.39
2001-03        1  --all--    Software/Other                    $877,630.85
2001-03        2  --all--    Electronics                     $1,017,536.82
2001-03        3  --all--    Photo                           $1,607,315.63
2001-03        4  --all--    Hardware                        $1,651,454.29
2001-03        5  --all--    Peripherals and Accessories     $2,042,061.04
2001-03           --all--    TOTAL                           $7,195,998.63
2001-04        1  --all--    Software/Other                    $943,296.36
2001-04        2  --all--    Hardware                        $1,174,512.68
2001-04        3  --all--    Electronics                     $1,303,838.52
2001-04        4  --all--    Photo                           $1,792,131.39
2001-04        5  --all--    Peripherals and Accessories     $2,257,118.57
2001-04           --all--    TOTAL                           $7,470,897.52
TOTAL             --all--    Software/Other                  $3,553,904.40
TOTAL             --all--    Electronics                     $4,704,850.95
TOTAL             --all--    Hardware                        $5,684,370.01
TOTAL             --all--    Photo                           $6,333,628.91
TOTAL             Indirect   TOTAL                           $6,709,496.66
TOTAL             --all--    Peripherals and Accessories     $7,859,707.71
TOTAL             Others     TOTAL                           $8,038,529.96
TOTAL             Direct     TOTAL                          $13,388,435.36
TOTAL             --all--    TOTAL                          $28,136,461.98

In this example, the PARTITION BY clause defines the boundaries for the RANK function.

How to Calculate Relative Contributions to a Total

A common business intelligence request is to calculate the contribution of every product category to the total revenue based on a given time period.

Example: Calculating Relative Contributions to a Total

You want to get the differences for revenue numbers on a quarter-by-quarter basis. As illustrated in the following example, you can use a window function with a PARTITION BY product category to achieve this.

To calculate relative contributions to a total:

SELECT DECODE(GROUPING(t.calendar_quarter_desc)
             , 0, t.calendar_quarter_desc
             , 1, 'TOTAL'
             ) quarter
, DECODE(GROUPING(t.calendar_quarter_desc) + GROUPING(p.prod_category)
        , 0, RANK() OVER (PARTITION BY t.calendar_quarter_desc 
 ORDER BY SUM(s.amount_sold))
        , 1, null
        ) RANKING
, DECODE(GROUPING(c.channel_class)
        , 0, c.channel_class
        , 1 , '--all--'
        ) channel
, DECODE(GROUPING(p.prod_category)
        , 0,  p.prod_category
        , 1, 'TOTAL'
        ) category
, TO_CHAR(SUM(s.amount_sold),'L999G999G990D00') revenue
, TO_CHAR(100 * RATIO_TO_REPORT(SUM(s.amount_sold)) 
OVER (PARTITION BY (TO_CHAR(GROUPING(p.prod_category) || 
 t.calendar_quarter_desc))),'990D0') percent
FROM times t
, products p
, channels c
, sales s
WHERE t.time_id = s.time_id
AND   p.prod_id = s.prod_id
AND   c.channel_id = s.channel_id
AND   s.time_id BETWEEN TO_DATE('01-JAN-2001','dd-MON-yyyy') 
   AND TO_DATE('31-DEC-2001','dd-MON-yyyy')
GROUP BY GROUPING SETS(c.channel_class, 
      CUBE(t.calendar_quarter_desc, p.prod_category))
ORDER BY t.calendar_quarter_desc
, SUM(s.amount_sold);
 
QUARTER  RANKING  CHANNEL  CATEGORY         REVENUE         PERC
-------  -------  -------  -------------    ------------    ----
2001-01        1  --all--  Software/Other     $860,819.81   13.1
2001-01        2  --all--  Electronics      $1,239,287.71   18.9
2001-01        3  --all--  Hardware         $1,301,343.45   19.9
2001-01        4  --all--  Photo            $1,370,706.38   20.9
2001-01        5  --all--  Peripherals      $1,774,940.09   27.1
2001-01           --all--  TOTAL            $6,547,097.44  100.0
2001-02        1  --all--  Software/Other     $872,157.38   12.6
2001-02        2  --all--  Electronics      $1,144,187.90   16.5
2001-02        3  --all--  Hardware         $1,557,059.59   22.5
2001-02        4  --all--  Photo            $1,563,475.51   22.6
2001-02        5  --all--  Peripherals      $1,785,588.01   25.8
2001-02           --all--  TOTAL            $6,922,468.39  100.0
2001-03        1  --all--  Software/Other     $877,630.85   12.2
2001-03        2  --all--  Electronics      $1,017,536.82   14.1
2001-03        3  --all--  Photo            $1,607,315.63   22.3
2001-03        4  --all--  Hardware         $1,651,454.29   22.9
2001-03        5  --all--  Peripherals      $2,042,061.04   28.4
2001-03           --all--  TOTAL            $7,195,998.63  100.0
2001-04        1  --all--  Software/Other     $943,296.36   12.6
2001-04        2  --all--  Hardware         $1,174,512.68   15.7 
2001-04        3  --all--  Electronics      $1,303,838.52   17.5
2001-04        4  --all--  Photo            $1,792,131.39   24.0
2001-04        5  --all--  Peripherals      $2,257,118.57   30.2
2001-04           --all--  TOTAL            $7,470,897.52  100.0
TOTAL             --all--  Software/Other   $3,553,904.40   12.6
TOTAL             --all--  Electronics      $4,704,850.95   16.7
TOTAL             --all--  Hardware         $5,684,370.01   20.2
TOTAL             --all--  Photo            $6,333,628.91   22.5
TOTAL            Indirect  TOTAL            $6,709,496.66   11.9
TOTAL             --all--  Peripherals      $7,859,707.71   27.9
TOTAL              Others  TOTAL            $8,038,529.96   14.3
TOTAL              Direct  TOTAL           $13,388,435.36   23.8
TOTAL             --all--  TOTAL           $28,136,461.98   50.0

"Peripherals" was used instead of "Peripherals and Accessories" to save space.

How to Perform Interrow Calculations with Window Functions

A common business intelligence question is how a particular result relates to another result. To do this in a single query, you can use window functions and perform interrow calculations in a single statement.

Example: Performing Interrow Calculations

You may want to know the contribution of every product category to the total revenue for each quarter. You can use the window function RATIO_TO_REPORT to achieve this result, as illustrated in the following example. Note that you must use concatenation with GROUPING(p.prod_category) to preclude the total from the RATIO_TO_REPORT per quarter.

To perform interrow calculations:

SELECT DECODE(GROUPING(t.calendar_quarter_desc)
             , 0, t.calendar_quarter_desc
             , 1, 'TOTAL'
             ) quarter
, DECODE(GROUPING(t.calendar_quarter_desc) + GROUPING(p.prod_category)
        , 0, RANK() OVER (PARTITION BY t.calendar_quarter_desc 
ORDER BY SUM(s.amount_sold))
        , 1, null
        ) RANKING
, DECODE(GROUPING(c.channel_class)
        , 0, c.channel_class
        , 1 , '--all--'
        ) channel
, DECODE(GROUPING(p.prod_category)
        , 0,  p.prod_category
        , 1, 'TOTAL'
        ) category
, TO_CHAR(SUM(s.amount_sold),'L999G999G990D00') revenue
, TO_CHAR(100 * RATIO_TO_REPORT(SUM(s.amount_sold)) 
OVER (PARTITION BY (TO_CHAR(GROUPING(p.prod_category) || 
t.calendar_quarter_desc))),'990D0') percent
, DECODE(GROUPING(t.calendar_quarter_desc) + GROUPING(p.prod_category)
        , 0, TO_CHAR(SUM(s.amount_sold) - LAG(SUM(s.amount_sold),1) 
         OVER (PARTITION BY p.prod_category 
         ORDER BY t.calendar_quarter_desc),'L999G990D00')
        , 1, null
        ) q_q_diff
FROM times t
, products p
, channels c
, sales s
WHERE t.time_id = s.time_id
AND   p.prod_id = s.prod_id
AND   c.channel_id = s.channel_id
AND   s.time_id BETWEEN TO_DATE('01-JAN-2001','dd-MON-yyyy') 
   AND TO_DATE('31-DEC-2001','dd-MON-yyyy')
GROUP BY GROUPING SETS(c.channel_class, 
      CUBE(t.calendar_quarter_desc, p.prod_category))
ORDER BY t.calendar_quarter_desc
, SUM(s.amount_sold);
 
QUARTER  RANKING  CHANNEL  CATEGORY         REVENUE         PERC     Q_Q_DIFF
-------  -------  -------  -------------    ------------    ----     ----------
2001-01        1  --all--  Software/Other     $860,819.81   13.1
2001-01        2  --all--  Electronics      $1,239,287.71   18.9
2001-01        3  --all--  Hardware         $1,301,343.45   19.9
2001-01        4  --all--  Photo            $1,370,706.38   20.9
2001-01        5  --all--  Peripherals      $1,774,940.09   27.1
2001-01           --all--  TOTAL            $6,547,097.44  100.0
2001-02        1  --all--  Software/Other     $872,157.38   12.6      $11,337.57
2001-02        2  --all--  Electronics      $1,144,187.90   16.5     -$95,099.81
2001-02        3  --all--  Hardware         $1,557,059.59   22.5     $255,716.14
2001-02        4  --all--  Photo            $1,563,475.51   22.6     $192,769.13
2001-02        5  --all--  Peripherals      $1,785,588.01   25.8      $10,647.92
2001-02           --all--  TOTAL            $6,922,468.39  100.0
2001-03        1  --all--  Software/Other     $877,630.85   12.2       $5,473.47
2001-03        2  --all--  Electronics      $1,017,536.82   14.1    -$126,651.08
2001-03        3  --all--  Photo            $1,607,315.63   22.3      $43,840.12
2001-03        4  --all--  Hardware         $1,651,454.29   22.9      $94,394.70
2001-03        5  --all--  Peripherals      $2,042,061.04   28.4     $256,473.03
2001-03           --all--  TOTAL            $7,195,998.63  100.0
2001-04        1  --all--  Software/Other     $943,296.36   12.6      $65,665.51
2001-04        2  --all--  Hardware         $1,174,512.68   15.7    -$476,941.61
2001-04        3  --all--  Electronics      $1,303,838.52   17.5     $286,301.70
2001-04        4  --all--  Photo            $1,792,131.39   24.0     $184,815.76
2001-04        5  --all--  Peripherals      $2,257,118.57   30.2     $215,057.53
2001-04           --all--  TOTAL            $7,470,897.52  100.0
TOTAL             --all--  Software/Other   $3,553,904.40   12.6
TOTAL             --all--  Electronics      $4,704,850.95   16.7
TOTAL             --all--  Hardware         $5,684,370.01   20.2
TOTAL             --all--  Photo            $6,333,628.91   22.5
TOTAL            Indirect  TOTAL            $6,709,496.66   11.9
TOTAL             --all--  Peripherals      $7,859,707.71   27.9
TOTAL              Others  TOTAL            $8,038,529.96   14.3
TOTAL              Direct  TOTAL           $13,388,435.36   23.8
TOTAL             --all--  TOTAL           $28,136,461.98   50.0

"Peripherals" was used instead of "Peripherals and Accessories" to save space.

How to Calculate a Moving Average Using a Window Function

You can create moving aggregations with window functions. A moving aggregation can be based on a number of physical rows or it can be a logical time period. Window functions use the PARTITION keyword, and, for each row in a partition, you can define a sliding window of data. This window determines the range of rows used to perform the calculations for the current row. Window sizes can be based on either a physical number of rows or a logical interval such as time. The window has a starting row and an ending row. Depending on its definition, the window may move at one or both ends. For instance, a window defined for a cumulative SUM function would have its starting row fixed at the first row of its partition, and its ending row would slide from the starting point all the way to the last row of the partition. In contrast, a window defined for a moving average would have both its starting and ending points slide so that they maintain a constant physical or logical range.

Window functions are commonly used to calculate moving and cumulative versions of SUM, AVERAGE, COUNT, MAX, MIN, and many more functions. They can be used only in the SELECT and ORDER BY clauses of the query. Window functions include the FIRST_VALUE function, which returns the first value in the window; and the LAST_VALUE function, which returns the last value in the window. These functions provide access to more than one row of a table without requiring a self-join.

Example: Calculating a Moving Average

The following example shows a query that retrieves a 7-day moving average of product revenue per product, using a logical time interval.

To calculate a moving average:

SELECT time_id
, prod_name
, TO_CHAR(revenue,'L999G990D00') revenue
, TO_CHAR(AVG(revenue) OVER (PARTITION BY prod_name ORDER BY time_id 
  RANGE INTERVAL '7' DAY PRECEDING),'L999G990D00') mv_7day_avg
FROM
( SELECT s.time_id, p.prod_name, SUM(s.amount_sold) revenue
  FROM products p
  , sales s
  WHERE p.prod_id = s.prod_id
  AND s.time_id BETWEEN TO_DATE('25-JUN-2001','dd-MON-yyyy')
                    AND TO_DATE('16-JUL-2001','dd-MON-yyyy')
  AND p.prod_name LIKE '%Memory%'
  AND p.prod_category = 'Photo'
  GROUP BY s.time_id, p.prod_name
)
ORDER BY time_id, prod_name;
 
TIME_ID     PROD_NAME           REVENUE             MV_7DAY_AVG
---------   -----------------   ------------------  --------------
26-JUN-01   256MB Memory Card              $560.15         $560.15
30-JUN-01   256MB Memory Card              $844.00         $702.08
02-JUL-01   128MB Memory Card            $3,283.74       $3,283.74
02-JUL-01   256MB Memory Card            $3,903.32       $1,769.16
03-JUL-01   256MB Memory Card              $699.37       $1,501.71
08-JUL-01   128MB Memory Card            $3,283.74       $3,283.74
08-JUL-01   256MB Memory Card            $3,903.32       $2,835.34
10-JUL-01   256MB Memory Card              $138.82       $1,580.50

Use of Partition Outer Join to Handle Sparse Data

Data is usually stored in sparse form. That is, if no value exists for a given combination of dimension values, no row exists in the fact table (the table in a data warehouse that contains the important facts, frequently sales). However, a reader of a business report may want to view the data in dense form, with rows for all combinations of dimension values displayed even when no fact table data exists for them. For example, if a product did not sell during a particular time period, you may still want to see the product for that time period with zero sales value next to it. Moreover, time series calculations can be performed most easily when data is dense along the time dimension. This is because dense data will fill a consistent number of rows for each period, which in turn makes it simple to use window functions with physical offsets.

Data densification is the process of converting sparse data into dense form. To overcome the problem of sparsity, you can use a partition outer join to fill the gaps in a time series or any dimension. Such a join extends the conventional outer join syntax by applying the outer join to each logical partition defined in a query. Oracle Database logically partitions the rows in your query based on the expression you specify in the PARTITION BY clause. The result of a partition outer join is a UNION operation of the outer joins of each of the partitions in the logically partitioned table with the table on the other side of the join. Note that you can use this type of join to fill the gaps in any dimension, not just the time dimension.

When to Use Partition Outer Join

When you want to fill in missing rows in a result set or perform time series calculations, use a partition outer join.

Example: Using Partition Outer Join

You may want to see how a particular product sold over the duration of a number of weeks. In this example, memory cards from the Photo category are used. Because these products are not sold very frequently, there may be weeks that a product is not sold at all. To make convenient comparisons, you have to make the data dense using the partition outer join as illustrated in the following example.

To use partition outer join:

SELECT tim.week_ending_day
, rev.prod_name product
, nvl(SUM(rev.amount_sold),0) revenue
FROM (SELECT p.prod_name, s.time_id, s.amount_sold
      FROM products p
      , sales s
      WHERE s.prod_id = p.prod_id
      AND p.prod_category = 'Photo'
      AND p.prod_name LIKE '%Memory%'
      AND s.time_id BETWEEN TO_DATE('25-JUN-2001','dd-MON-yyyy')
                        AND TO_DATE('16-JUL-2001','dd-MON-yyyy')
     ) rev
  PARTITION BY (prod_name)
  RIGHT OUTER JOIN (SELECT time_id, week_ending_day FROM times
                    WHERE week_ending_day 
                    BETWEEN TO_DATE('01-JUL-2001','dd-MON-yyyy') 
                    AND TO_DATE('16-JUL-2001','dd-MON-yyyy')
                   ) tim
  ON (rev.time_id = tim.time_id)
GROUP BY tim.week_ending_day
, rev.prod_name
ORDER BY tim.week_ending_day
, rev.prod_name;
 
WEEK_ENDI   PRODUCT                                   REVENUE
---------   ----------------------------------------  ----------
01-JUL-01   128MB Memory Card                                  0
01-JUL-01   256MB Memory Card                            1404.15
08-JUL-01   128MB Memory Card                            6567.48
08-JUL-01   256MB Memory Card                            8506.01
15-JUL-01   128MB Memory Card                                  0
15-JUL-01   256MB Memory Card                             138.82

Use of the WITH Clause to Simplify Business Queries

Queries that make extensive use of window functions as well as different types of joins and access many tables can become quite complex. The WITH clause allows you to eliminate much of this complexity by incrementally building up the query. It lets you reuse the same query block in a SELECT statement when it occurs more than once within a complex query. Oracle Database retrieves the results of a query block and stores them in the user's temporary tablespace.

When to Use the WITH Clause

When a query has multiple references to the same query block and there are joins and aggregations, use the WITH clause.

Example: Using the WITH Clause

Assume you want to compare the sales of memory card products in the Photo category for the first three week endings in July 2001. The following query takes into account that some products may not have sold at all in that period, and it returns the increase or decrease in revenue relative to the week before. Finally, the query retrieves the percentage contribution of the memory card sales for that particular week. Due to the use of the WITH clause, individual sections of the query are not very complex.

To use the WITH clause:

WITH sales_numbers AS
( SELECT s.prod_id, s.amount_sold, t.week_ending_day
  FROM sales s
  , times t
  , products p
  WHERE s.time_id = t.time_id
  AND s.prod_id = p.prod_id
  AND p.prod_category = 'Photo'
  AND p.prod_name LIKE '%Memory%'
  AND t.week_ending_day BETWEEN TO_DATE('01-JUL-2001','dd-MON-yyyy') 
                            AND TO_DATE('16-JUL-2001','dd-MON-yyyy')
)
, product_revenue AS 
( SELECT p.prod_name product, s.week_ending_day, SUM(s.amount_sold) revenue
  FROM products p
    LEFT OUTER JOIN (SELECT prod_id, amount_sold, week_ending_day 
                     FROM sales_numbers) s
    ON (s.prod_id = p.prod_id)
  WHERE p.prod_category = 'Photo'
  AND p.prod_name LIKE '%Memory%'
  GROUP BY p.prod_name, s.week_ending_day
)
, weeks AS
( SELECT distinct week_ending_day week FROM times WHERE week_ending_day
  BETWEEN TO_DATE('01-JUL-2001','dd-MON-yyyy') 
  AND TO_DATE('16-JUL-2001','dd-MON-yyyy')
)
, complete_product_revenue AS
( SELECT w.week, pr.product, nvl(pr.revenue,0) revenue
  FROM product_revenue pr
    PARTITION BY (product)
    RIGHT OUTER JOIN weeks w
    ON (w.week = pr.week_ending_day)
)
SELECT week
, product
, TO_CHAR(revenue,'L999G990D00') revenue
, TO_CHAR(revenue - lag(revenue,1) OVER (PARTITION BY product 
     ORDER BY week),'L999G990D00') w_w_diff
, TO_CHAR(100 * RATIO_TO_REPORT(revenue) OVER (PARTITION BY week),'990D0') percentage
FROM complete_product_revenue
ORDER BY week, product;
 
WEEK       PRODUCT                REVENUE      W_W_DIFF             PERCENT
---------  -----------------      -------      --------             -------
01-JUL-01  128MB Memory Card        $0.00                               0.0
01-JUL-01  256MB Memory Card    $1,404.15                             100.0
01-JUL-01   64MB Memory Card        $0.00                               0.0
08-JUL-01  128MB Memory Card    $6,567.48     $6,567.48                43.6
08-JUL-01  256MB Memory Card    $8,506.01     $7,101.86                56.4
08-JUL-01   64MB Memory Card        $0.00         $0.00                 0.0
15-JUL-01  128MB Memory Card        $0.00    -$6,567.48                 0.0
15-JUL-01  256MB Memory Card      $138.82    -$8,367.19               100.0
15-JUL-01   64MB Memory Card        $0.00         $0.00                 0.0