B Verifying Data Integrity with DBMS_SQLHASH

This appendix discusses verifying data integrity using the DBMS_SQLHASH package. It includes the following sections:

Overview of the DBMS_SQLHASH Package

The DBMS_SQLHASH package can check data integrity by making use of hash algorithms. It provides an interface to generate the hash value of the result set returned by a SQL query. Hash values are like data fingerprints and are used to ensure data integrity. DBMS_SQLHASH provides support for several industry standard hashing algorithms, including MD4, MD5, and SHA-1 cryptographic hashes.

Oracle Database installs the DBMS_SQLHASH package in the SYS schema. You can then grant package access to existing users and roles as required.

DBMS_SQLHASH includes the GETHASH function that is used to retrieve the hash value of a query result set. The GETHASH function runs one of the supported cryptographic hash algorithms against the result set of the SQL statement to arrive at a hash value.

You can compare hash values to check whether data has been altered. For example, before storing data, Laura runs the DBMS_SQLHASH.GETHASH function against the SQL statement to create a hash value of the SQL result set. When she retrieves the stored data at a later date, she reruns the hash function against the SQL statement using the same algorithm. If the second hash value is identical to the first one, then data has not been altered. Any modification to the result set data would cause the hash value to be different.

The DBMS_SQLHASH.GETHASH Function

This function applies one of the supported cryptographic hash algorithms to the result set of the SQL statement.

Syntax

DBMS_SQLHASH.GETHASH(
            sqltext IN varchar2,
                            digest_type IN BINARY_INTEGER,
    chunk_size IN number DEFAULT 134217728)
   RETURN raw;

Parameters

Table B-1 lists the GETHASH parameters and their descriptions.

Table B-1 GETHASH Function Parameters

Parameter Name Description

sqltext

The SQL statement whose result is hashed

digest_type

Hash algorithm used: HASH_MD4, HASH_MD5 or HASH_SH1

chunk_size

Size of the result chunk when getting the hash

When the result set size is large, the GETHASH function will break it into chunks having a size equal to chunk_size. It will generate the hash for each chunk and then use hash chaining to calculate the final hash. The default chunk_size is 128 MB.