Allocating Storage and Compressing Data

In This Section:

Storage Allocation

Data Compression

Storage Allocation

Essbase uses a data file to store data blocks. By default, a data file is located in its associated database folder. Data files follow the naming convention essn.pag, where n is greater than or equal to one and less than or equal to 65,535.

Essbase uses an index file to store the index for a database. By default, an index file is located in its associated database folder. Index files follow the naming convention essn.ind, where n is greater than or equal to 1 and less than or equal to 65,535.

Essbase automatically allocates storage for data and index files. You can use disk volumes to control how storage is allocated for these files.

*  To specify disk volumes so that you control how storage is allocated:

  1. Verify how much space Essbase uses to store index and data files. See Checking Index and Data File Sizes for information about how to check sizes.

  2. Choose a technique to control storage:

    • Specify which volumes (drives) Essbase uses to store these files. See Specifying Disk Volumes.

    • Install Essbase on one volume and store files on another.

Checking Index and Data File Sizes

*  To view index file (.ind file) and data file (.pag file) names, counts, sizes, and totals, and to determine whether each file is open in Essbase, use a tool:

Tool

Topic

Location

Administration Services

Checking Index and Data File Sizes

Oracle Essbase Administration Services Online Help

ESSCMD

LISTFILES

Oracle Essbase Technical Reference

Note:

The file size information that is provided by Windows for index and data files that reside on NTFS volumes may not be accurate. The file size information provided by Administration Services and by LISTFILES is accurate.

    Specifying Disk Volumes

    Use disk volumes to specify where you want to store Essbase index files (essn.ind) and data files (essn.pag). If you do not use the disk volumes setting, Essbase stores data only on the volume where the ARBORPATH directory resides. If the ARBORPATH variable is not set, Essbase stores data only on the volume where the server was started.

    Note:

    For information about how to check the size of the index and data files, see Checking Index and Data File Sizes.

    You can specify disk volumes using Administration Services, MaxL, or ESSCMD. When you use disk volumes, Essbase provides the following options for each:

    • Volume name

    • Maximum space to use on the volume (called Partition Size in Administration Services and Volume Size in ESSCMD)

    • File type. You can specify index files, data files, or both. The default is index and data files on the same volume.

    • Maximum file size. The default and recommended value is 2,097,152 KB (2 GB). When Essbase reaches the maximum file size, it creates a file and names it incrementally. For example, when ess00001.ind is filled to maximum size, Essbase creates ess00002.ind.

      Caution!

      If you specify a volume name but not a volume size, Essbase uses all available space on the volume.

    Essbase creates data files and index files in these situations:

    • If the total sizes of all files reach the maximum size that you specified in the disk volumes setting. By default, the total is the sum of all index and data file sizes. If you specify Index as the file type, the total refers to the sum of all index files on a volume. If you specify Data as the file type, the total refers to the sum of all data files on a volume.

      For example, suppose you want to use up to 12 GB for Essbase files on volume E, 16 GB on volume F, and 16 GB on volume G. Essbase creates a file on volume F when the sizes of the index and data files reach 12 GB on volume E and more data needs to be written out to disk.

    • If the size of an individual index or data file on any volume reaches 2 GB. In the above example, suppose volumes E and F have reached their capacities and Essbase is using volume G. Figure 151, Example of How Essbase Stores Files Across Volumes illustrates this example.

      On volume G, Essbase creates file ess00001.ind and fills it to the default limit of 2 GB. On volume G, Essbase creates file ess00001.pag and fills it to 1 GB.

      You have specified a limit of 16 GB on volume G, and you have used 3 GB. You have 13 GB left to use on volume G, but ess00001.ind has reached the maximum file size of 2 GB. The next time Essbase needs storage space when writing index files to disk, Essbase creates a file on volume G and names it ess00002.ind. Essbase then fills ess00002.ind to its 2 GB limit and creates ess00003.ind. Essbase follows the same procedures for data files.

      Figure 151. Example of How Essbase Stores Files Across Volumes

      Example of How Analytic Services Stores Files Across Volumes

    Essbase names files consecutively, starting with ess00001.xxx, where xxx is ind for an index file and pag for a data file, and continuing up to ess65535.xxx. This naming convention applies to each volume, so in the above example, volumes E, F, and G each have files named ess00001.pag and ess00001.ind.

    Keep in mind the following guidelines when specifying disk volumes:

    • Specify the disk volumes in the order in which you want the volumes to be used. You need not specify the volume on which Essbase is installed as one of the volumes; you can install on one volume and store data on other volumes.

    • If a volume reaches capacity, Essbase moves to the next volume.

    • If all specified volumes reach capacity, Essbase stops ongoing database operations, issues an error message, and performs fatal error handling. For more information, see Understanding Fatal Error Handling. If these events occur, shut down the database, allocate more disk space, and restart the database.

    • You can tell Essbase to stop storing files on a volume. Essbase can still access the volume as needed, but it no longer stores additional index and data information on the volume. To stop storing information on a volume, select the volume definition that you want to remove and click Delete.

    • You set disk volumes on a per-database basis. Multiple databases can use space on the same volume, so allocate space carefully. For example, if you specify 7 GB on Volume A for Database 1 and 7 GB on Volume A for Database 2, you have allocated 14 GB for Essbase files on Volume A.

    • For new files, changes to the disk volumes setting take effect when you next start the database. Existing files and volumes are not affected.

    Specifying Disk Volumes with Administration Services

    *  To specify disk volumes with Administration Services, see “Setting Disk Volumes” in Oracle Essbase Administration Services Online Help.

      Specifying Disk Volumes with ESSCMD

      *  To allocate a new volume, see the ESSCMD SETDBSTATEITEM 23 in the Oracle Essbase Technical Reference.

      ESSCMD prompts you for the number of new disk volumes you want to add, unless you supply the number on the command line.

      Then, for each new volume, ESSCMD prompts you for the following values, unless you supply them on the command line.

      • Volume name (for each volume)

      • Volume size (maximum space to use on the volume)—The default value is Unlimited; the minimum setting is 8 MB

        When you use ESSCMD, you can specify volume size in bytes (B), kilobytes (K), megabytes (M), gigabytes (G), or terabytes (T). ESSCMD displays minimum, maximum, and current values and 0 for unlimited.

        • File type—You can specify index files, data files, or both. The default is 3 - Index + Data (index and data files on the same volume).

        • File size (maximum size that each file specified in file type can attain before Essbase creates a file)—The default value is 2 GB; the minimum setting is 8 MB.

        The following example allocates up to 10 GB on Volume E, sets a maximum file size of 2 GB, and specifies that data files should be stored only on E:

        SETDBSTATEITEM 23 "SAMPLE" "BASIC" "1" "E" "10G" "2" "2G"

        *  To change the settings on an allocated volume, enter SETDBSTATEITEM 24 in ESSCMD and either follow the prompts or supply the required values on the command line.

          ESSCMD prompts you for the following values, unless you supply them on the command line:

          • Volume number. (Use the GETDBSTATE command in ESSCMD to see a list of the currently defined disk volumes and to see the number assigned to each volume.)

          • Volume name

          • Volume size

          • File type

          • File size

          The following example allocates up to 20 GB on Volume C and sets a maximum file size of 2 GB:

          SETDBSTATEITEM 24 "SAMPLE" "BASIC" "1" "C" "20G" "3" "2G"

          *  To stop Essbase from storing additional files on a volume, enter SETDBSTATEITEM 25 in ESSCMD and either follow the prompts or supply the required values on the command line. Essbase continues accessing files on the deallocated volume but does not write new files to it.

          ESSCMD prompts you for the following value, unless you supply it on the command line—Delete which volume definition. Use the GETDBSTATE command in ESSCMD to see a list of the currently defined disk volumes and to see the number assigned to each volume.

            The following example deallocates the volume that is specified as fourth:

            SETDBSTATEITEM 25 "SAMPLE" "BASIC" "4"

            Note:

            If you delete an application or database, Essbase does not remove the directory containing the application or database on a disk volume. The computer's operating system still shows the folder and file labels on the disk. However, you can reuse the same name of the application or database that you had removed on the disk volume.

            For more syntax information, see the Oracle Essbase Technical Reference.

            On UNIX, volume_name is a mounted UNIX file system. You must enter a fully qualified pathname to the Essbase installation directory (ESSBASEPATH). Essbase automatically appends the app directory to the path; you do not specify the app directory.

            Consider the following example:

            /vol2/EssbaseServer 10M

            Volume size is the maximum space, in KB, allocated to the volume. The default value is unlimited—Essbase uses all available space on that volume.

            Reviewing an Example of Specifying Volumes to Control Storage

            Assume you want to use up to 20 GB for Essbase files on Volume E, 25 GB on Volume F, and 25 GB on Volume G. You are using the default file size limit of 2 GB.

            When you load data, Essbase stores up to 20 GB on Volume E; if the database is larger than 20 GB, Essbase stores the next 25 GB on Volume F, and so on.

            Figure 152. Example of Using Disk Volumes

            Example of Using Disk Volumes

            Data Compression

            Essbase allows you to choose whether data blocks that are stored on disk are compressed, as well as which compression scheme to use. When data compression is enabled, Essbase compresses data blocks when it writes them out to disk. Essbase fully expands the compressed data blocks, including empty cells, when the blocks are swapped into the data cache.

            Generally, data compression optimizes storage use. You can check compression efficiency by checking the compression ratio statistic. See Checking the Compression Ratio for a review of methods.

            Essbase provides several options for data compression:

            • Bitmap compression, the default. Essbase stores only nonmissing values and uses a bitmapping scheme.

            • Run-length encoding (RLE). Essbase compresses repetitive, consecutive values, including zeros and #MISSING values.

            • zlib compression. Essbase builds a data dictionary based on the actual data being compressed.

            • Index Value Pair compression. Essbase applies this compression if the block density is less than 3%.

            • No compression. Essbase does not compress data blocks when they are written to disk.

            Because Essbase compresses data blocks as they are written to disk, it is possible for bitmap, RLE, and uncompressed data blocks to coexist in the same data file. Keep in mind the following rules:

            • When a compressed data block is brought into the data cache, Essbase expands the block to its full size, regardless of the scheme that was used to compress it.

            • When Essbase stores a block on disk, Essbase treats the block the same whether it was compressed or uncompressed when it was brought into the data cache. In either case, Essbase compresses the block according to the specified compression type (including not compressing it if no compression is specified).

            • If compression is not enabled, Essbase writes out the fully expanded block to disk.

            You may want to disable data compression if blocks have very high density (90% or greater) and have few consecutive, repeating data values. Under these conditions, enabling compression consumes resources unnecessarily.

            Bitmap Data Compression

            With bitmap compression, Essbase uses a bitmap to represent data cells and stores only the bitmap, the block header, and the other control information. A bitmap uses one bit for each cell in the data block, whether the cell value is missing or nonmissing. When a data block is not compressed, Essbase uses 8 bytes to store every nonmissing cell.

            When using bitmap compression, Essbase stores only nonmissing values and does not compress repetitive values or zeros (contrast with RLE compression, described in RLE Data Compression). When Essbase places a data block into the data cache, it fully expands the data block, using the bitmap to recreate the missing values.

            Because the bitmap uses one bit for each cell in the data block, the bitmap scheme provides a fixed overhead for data compression. Figure 153, Bitmap Data Compression represents a portion of a data block as an example. In this example, Essbase uses 64 bytes to store the data in the fully expanded block but uses 1 byte (8 bits) to store the bitmap of the compressed data on disk. (Essbase also uses a 72-byte block header for each block, whether or not the block is compressed.)

            Figure 153. Bitmap Data Compression

            Bitmap Data Compression

            In most cases, bitmap compression conserves disk space more efficiently. However, much depends on the configuration of the data.

            RLE Data Compression

            When using the run-length encoding (RLE) compression scheme, Essbase compresses any consecutive, repetitive values—any value, including zero, that repeats three or more times consecutively. Essbase tracks each repeating value and the number of times it repeats consecutively.

            In the example in Figure 154, RLE Data Compression, Essbase uses 64 bytes to store the data in the fully expanded block but uses 56 bytes to store the compressed data on disk. (Essbase also uses a 72-byte block header for each block, whether or not the block is compressed.)

            Figure 154. RLE Data Compression

            RLE Data Compression

            zlib Compression

            This method is used in packages such as PNG, Zip, and gzip. Calculation and data loading are faster with direct I/O and zlib compression than with buffered I/O and zlib compression. If data storage is your greatest limiting factor, use zlib, but be aware that, under some circumstances, data loads may be up to 10% slower than bitmap compression. The size of the database, however, is generally significantly smaller when you use zlib compression.

            In contrast to bitmap compression, which uses an algorithm to track which values are missing and does not interact with any other type of data, zlib compression builds a data dictionary based on the actual data being compressed (including any missing values). Therefore, zlib compression should provide greater compression ratios over bitmap compression, given extremely dense data. However, because the effectiveness of the zlib algorithm is dependent (at the bit level) on the actual data being compressed, general guidelines about when zlib compression provides greater compression than bitmap compression based solely on density are not available. Unlike other compression methods, the storage space saved has little or no relationship to the number of missing cells or the number of contiguous cells of equal value. Generally, the more dense or heterogeneous the data is, the better zlib will compress it in comparison to bitmap or RLE compression. However, under some circumstances, it is possible that zlib will not yield better results than using bitmap or RLE compression. It is best to test with a representative data sample.

            To estimate the storage savings you may obtain with zlib, create a small database using your usual compression technique (bitmap or RLE) with a small sampling of real data and shut down Essbase Server. Note the size of the created data files. Then clear the data in the sample database, change the compression setting to zlib, reload the same sample data, and shut down Essbase Server again. Now note the difference in the storage used. You can also use the small sample database to estimate any changes in calculation or data loading speed.

            Index Value Pair Compression

            Index Value Pair addresses compression on databases with larger block sizes, where the blocks are highly sparse. This compression algorithm is not selectable but is automatically used whenever appropriate by the database. The user must still choose between the compression types None, bitmap, RLE, and zlib through Administration Services.

            For example, if the user selects RLE, Essbase reviews each block and evaluates the following compression types for highest compression: RLE, bitmap, or Index Value Pair. If the user chooses zlib, for example, zlib is the only compression type applied.

            The following table illustrates the available compression types the user can choose and the compression types that Essbase evaluates and then applies.

            Chosen Compression Type

            Evaluated Compression Type

            None

            None

            Bitmap

            Bitmap, Index Value Pair

            RLE

            RLE, Bitmap, Index Value Pair

            zlib

            zlib

            Deciding Which Compression Type to Use

            You can choose from four compression settings: bitmap (the default), RLE, zlib, or None.

            In most cases, you need not worry about choosing a setting. Bitmap compression almost always provides the best combination of fast performance and small data files. However, much depends on the configuration of the data.

            Data compression is CPU-intensive. Consider the tradeoffs of computation costs versus I/O costs and disk space costs when choosing a compression setting.

            In general, a database compresses better using the RLE setting than the bitmap setting if a large number of repeated nonmissing data cells for a given block have the same value. Using RLE compression is computationally more expensive than using bitmap compression. If your database shrinks significantly using RLE compression, however, you may see a performance improvement due to decreased I/O costs.

            Databases usually shrink when using zlib compression, but not always. Using zlib compression significantly increases CPU processing. For most databases, this extra processing outweighs the benefits of the decreased block size. But if your database shrinks significantly using zlib compression, you may see a performance improvement due to decreased I/O costs.

            The None compression setting does not reduce the disk usage of a database compared to bitmap compression. In fact, no compression may make no difference to improve the performance of the database, because bitmap compression is extremely fast.

            Remember that each database is unique, and the previous statements are general characteristics of compression types. Although the default bitmap compression works well for most databases, the best way to determine the best compression setting for your database is to try each one.

            Changing Data Compression Settings

            Changes to the data compression setting take effect immediately as Essbase writes data blocks to disk. For blocks already on disk, Essbase does not change compression schemes or enable or disable compression. When you change the data compression settings of blocks already on disk, Essbase uses the new compression scheme the next time Essbase accesses, updates, and stores the blocks.

            *  To view or change the current settings, use a tool:

            Tool

            Topic

            Location

            Administration Services

            Selecting a Data Compression Method

            Oracle Essbase Administration Services Online Help

            MaxL

            alter database

            Oracle Essbase Technical Reference

            ESSCMD

            To enable or disable data compression: SETDBSTATE

            or:

            SETDBSTATEITEM 14

            To set the data compression type: SETDBSTATEITEM 15

            Oracle Essbase Technical Reference

              Example of Using SETDBSTATEITEM

              *  To enable or disable data compression, enter SETDBSTATEITEM 14 in ESSCMD and either follow the prompts or supply the required values on the command line.

                ESSCMD prompts you for the following values, unless you supply them on the command line:

                • Data Compression on Disk? Enter Y (Yes, the default) or N (No).

                • Data Compression Type. Enter 1 (run-length encoding) or 2 (bitmap, the default).

                *  To specify the data compression type, enter SETDBSTATEITEM 15 in ESSCMD and either follow the prompts or supply the required values on the command line. ESSCMD prompts you for a value of “1” (run length encoding) or “2” (bitmap, the default).

                  The following example enables Bitmap compression:

                  SETDBSTATEITEM 14 "SAMPLE" "BASIC" "Y" "2"

                  For more syntax information, see the Oracle Essbase Technical Reference.

                  Checking the Compression Ratio

                  The compression ratio represents the ratio of the compressed block size (including overhead) to the uncompressed block size, regardless of the compression type in effect. Overhead is the space required by mechanisms that manage compression/expansion.

                  *  To check the compression ratio, use a tool:

                  Tool

                  Topic

                  Location

                  Administration Services

                  Checking the Compression Ratio

                  Oracle Essbase Administration Services Online Help

                  ESSCMD

                  GETDBSTATS

                  Oracle Essbase Technical Reference

                  Note:

                  The larger the number, the more compression. The compression ratio can vary widely from block to block.

                    Data Block Size

                    Data block size is determined by the amount of data in a particular combination of dense dimensions. For example, when you change the dense or sparse configuration of one or more dimensions in the database, the data block size changes. Data block size is 8n bytes, where n is the number of cells that exist for that combination of dense dimensions.

                    Note:

                    The optimum size range is 8 KB–100 KB.

                    For information about determining the size of a data block, see Size of Expanded Data Block.

                    *  To view the block size for a database, use a tool:

                    Tool

                    Topic

                    Location

                    Administration Services

                    Checking Data Block Statistics

                    Oracle Essbase Administration Services Online Help

                    ESSCMD

                    GETDBSTATS

                    Oracle Essbase Technical Reference