Chapter 3. Databases

Table of Contents

Opening Databases
Deferred Write Databases
Closing Databases
Database Properties
Administrative Methods
Database Example

In Berkeley DB Java Edition, a database is a collection of records. Records, in turn, consist of key/data pairings.

Conceptually, you can think of a Database as containing a two-column table where column 1 contains a key and column 2 contains data. Both the key and the data are managed using DatabaseEntry class instances (see Database Records for details on this class). So, fundamentally, using a JE Database involves putting, getting, and deleting database records, which in turn involves efficiently managing information encapsulated by DatabaseEntry objects. The next several chapters of this book are dedicated to those activities.
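As a preview of those activities, the following minimal sketch stores one record, reads it back, and deletes it. It assumes a Database handle that is already open (opening is covered later in this chapter) and that keys and data are UTF-8 strings. The class name RecordRoundTrip and its helper methods are hypothetical names chosen for illustration.

```java
import com.sleepycat.je.Database;
import com.sleepycat.je.DatabaseEntry;
import com.sleepycat.je.DatabaseException;
import com.sleepycat.je.LockMode;
import com.sleepycat.je.OperationStatus;

import java.nio.charset.StandardCharsets;

public class RecordRoundTrip {

    // Wrap a string's UTF-8 bytes in a DatabaseEntry (hypothetical helper).
    static DatabaseEntry entry(String text) {
        return new DatabaseEntry(text.getBytes(StandardCharsets.UTF_8));
    }

    // Put one key/data pair, get it back, then delete it.
    // Returns the data that was read, or null if the get did not succeed.
    static String putGetDelete(Database db, String keyText, String dataText)
            throws DatabaseException {
        DatabaseEntry key = entry(keyText);
        db.put(null, key, entry(dataText));

        DatabaseEntry found = new DatabaseEntry();
        String result = null;
        OperationStatus status = db.get(null, key, found, LockMode.DEFAULT);
        if (status == OperationStatus.SUCCESS) {
            result = new String(found.getData(), found.getOffset(),
                                found.getSize(), StandardCharsets.UTF_8);
        }

        db.delete(null, key);
        return result;
    }
}
```

The null first argument to put(), get(), and delete() means that no explicit transaction is used, which matches the non-transactional examples in this chapter.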

Note that on disk, databases are stored in sequentially numbered log files in the directory where the opening environment is located. JE log files are described in Databases and Log Files.
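To see these files for yourself, you can list the environment directory. The sketch below uses only the standard Java library. It assumes that JE log files carry a .jdb suffix and zero-padded names such as 00000000.jdb, so that sorting the names also orders them numerically. The class name ListLogFiles is hypothetical, and /export/dbEnv is simply the environment home used in this chapter's examples.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ListLogFiles {

    // Return the names of the files in envHome that end in ".jdb", sorted.
    // Assumption: log file names are zero-padded, so sorted order is numeric order.
    static List<String> logFileNames(File envHome) {
        List<String> names = new ArrayList<String>();
        String[] entries = envHome.list(); // null if envHome is not a readable directory
        if (entries != null) {
            for (String entry : entries) {
                if (entry.endsWith(".jdb")) {
                    names.add(entry);
                }
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        File envHome = new File(args.length > 0 ? args[0] : "/export/dbEnv");
        for (String name : logFileNames(envHome)) {
            System.out.println(name);
        }
    }
}
```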

Opening Databases

You open a database by using the Environment.openDatabase() method (environments are described in Database Environments). This method creates and returns a Database object handle. You must provide Environment.openDatabase() with a database name.

You can optionally provide Environment.openDatabase() with a DatabaseConfig object. DatabaseConfig allows you to set properties for the database, such as whether it can be created if it does not currently exist, whether you are opening it read-only, and whether the database is to support transactions.

Note that by default, JE does not create databases if they do not already exist. To override this behavior, set the creation property to true by calling DatabaseConfig.setAllowCreate(true).

Finally, if you configured your environment and database to support transactions, you can optionally provide a transaction object to Environment.openDatabase(). Transactions are described in the Berkeley DB Java Edition Getting Started with Transaction Processing guide.

The following code fragment illustrates a database open:

package je.gettingStarted;

import com.sleepycat.je.Database;
import com.sleepycat.je.DatabaseConfig;
import com.sleepycat.je.DatabaseException;
import com.sleepycat.je.Environment;
import com.sleepycat.je.EnvironmentConfig;

import java.io.File;
...

Environment myDbEnvironment = null;
Database myDatabase = null;

...

try {
    // Open the environment. Create it if it does not already exist.
    EnvironmentConfig envConfig = new EnvironmentConfig();
    envConfig.setAllowCreate(true);
    myDbEnvironment = new Environment(new File("/export/dbEnv"), envConfig);

    // Open the database. Create it if it does not already exist.
    DatabaseConfig dbConfig = new DatabaseConfig();
    dbConfig.setAllowCreate(true);
    myDatabase = myDbEnvironment.openDatabase(null, 
                                              "sampleDatabase", 
                                              dbConfig); 
} catch (DatabaseException dbe) {
    // Exception handling goes here
}
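The fragment above passes null for the transaction and always allows creation. The following sketch illustrates two of the other behaviors described in this section: opening with the default configuration, which fails if the database does not exist, and opening a transactional database under an explicit transaction. The class and method names (OpenVariants, openIfPresent, and so on) are hypothetical. Treat this as one possible arrangement rather than the required one; transaction handling in particular is covered properly in the transaction processing guide.

```java
import com.sleepycat.je.Database;
import com.sleepycat.je.DatabaseConfig;
import com.sleepycat.je.DatabaseException;
import com.sleepycat.je.DatabaseNotFoundException;
import com.sleepycat.je.Environment;
import com.sleepycat.je.EnvironmentConfig;
import com.sleepycat.je.Transaction;

import java.io.File;

public class OpenVariants {

    // Open (or create) an environment that supports transactions.
    // The home directory itself must already exist.
    static Environment openTransactionalEnvironment(File home)
            throws DatabaseException {
        EnvironmentConfig envConfig = new EnvironmentConfig();
        envConfig.setAllowCreate(true);
        envConfig.setTransactional(true);
        return new Environment(home, envConfig);
    }

    // Open with the default configuration (null DatabaseConfig).
    // Creation is not allowed by default, so a missing database
    // raises DatabaseNotFoundException; this helper maps that to null.
    static Database openIfPresent(Environment env, String name)
            throws DatabaseException {
        try {
            return env.openDatabase(null, name, null);
        } catch (DatabaseNotFoundException notFound) {
            return null;
        }
    }

    // Open (or create) a transactional database under its own transaction.
    static Database openTransactional(Environment env, String name)
            throws DatabaseException {
        DatabaseConfig dbConfig = new DatabaseConfig();
        dbConfig.setAllowCreate(true);
        dbConfig.setTransactional(true);

        Transaction txn = env.beginTransaction(null, null);
        Database db;
        try {
            db = env.openDatabase(txn, name, dbConfig);
        } catch (DatabaseException dbe) {
            txn.abort(); // undo the failed open
            throw dbe;
        }
        txn.commit();
        return db;
    }
}
```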

Deferred Write Databases

By default, JE's databases are all persistent. That is, the data they contain is stored on disk so that it can be accessed across program runs. However, it is possible to configure a JE database so that it is not persistent by default. JE calls databases configured in this way deferred write databases.

Deferred write databases are essentially in-memory only databases. Therefore, they are particularly useful for applications that want databases which are truly temporary.

Note that deferred write databases do not always avoid disk I/O. It is particularly important to realize that deferred write databases can page to disk if the cache is not large enough to hold the database's entire contents. Therefore, deferred write database performance is best if your in-memory cache is large enough to hold the database's entire data-set.

Beyond that, you can deliberately cause data modifications made to a deferred write database to be made persistent. If Database.sync() is called before the application is shut down, the contents of the deferred write database are saved to disk.

In short, upon reopening an environment and a deferred write database, the database is guaranteed to be in at least the same state it was in at the time of the last database sync. (It is possible that due to a full in-memory cache, the database will page to disk and so the database might actually be in a state sometime after the last sync.) This means that if the deferred write database is never sync'd, it is entirely possible for it to open as an empty database.

Because modifications made to a deferred write database can be made persistent at a time that is easily controlled by the programmer, these types of databases are also useful for applications that perform a great deal of database modifications, record additions, deletions, and so forth. By delaying the data write, you delay the disk I/O. Depending on your workload, this can improve your data throughput by quite a lot.

Be aware that you lose modifications to a deferred write database only if you (1) do not call sync, (2) close the deferred write database, and (3) also close the environment. If you only close the database but leave the environment open, then all operations performed on that database since the time of the last environment open may be retained.

All other rules of behavior pertain to deferred write databases as they do to normal databases. Deferred write databases must be named and created just as you would a normal database. If you want to delete the deferred write database, you must remove it just as you would a normal database. This is true even if the deferred write database is empty, because its name persists in the environment's namespace until the database is removed.
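For illustration, removal might look like the following sketch, which closes the handle and then removes the database by name with Environment.removeDatabase(). The class and method names are hypothetical, and the sketch assumes a non-transactional environment with no other open handles on the database.

```java
import com.sleepycat.je.Database;
import com.sleepycat.je.DatabaseException;
import com.sleepycat.je.Environment;

public class RemoveDatabase {

    // Close the handle, then remove the database from the environment's
    // namespace. The same call applies to normal and deferred write databases.
    static void closeAndRemove(Environment env, Database db)
            throws DatabaseException {
        String name = db.getDatabaseName(); // read the name before closing
        db.close();
        env.removeDatabase(null, name);     // null: no explicit transaction
    }
}
```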

Note that whether a database is deferred write is a configuration option. It is therefore possible to switch a database between "normal" mode and deferred write mode. You might want to do this if, for example, you want to load a lot of data to the database. In this case, loading data to the database while it is in deferred write state is faster than in "normal" state, because you can avoid a lot of the normal disk I/O overhead during the load process. Once the load is complete, sync the database, close it, and then reopen it as a normal database. You can then continue operations as if the database had been created as a "normal" database.

To configure a database as deferred write, call DatabaseConfig.setDeferredWrite(true) and then open the database with that DatabaseConfig object.

For example, the following code fragment opens and closes a deferred write database:

package je.gettingStarted;

import com.sleepycat.je.Database;
import com.sleepycat.je.DatabaseConfig;
import com.sleepycat.je.DatabaseException;
import com.sleepycat.je.Environment;
import com.sleepycat.je.EnvironmentConfig;

import java.io.File;
...

Environment myDbEnvironment = null;
Database myDatabase = null;

...

try {
    // Open the environment. Create it if it does not already exist.
    EnvironmentConfig envConfig = new EnvironmentConfig();
    envConfig.setAllowCreate(true);
    myDbEnvironment = new Environment(new File("/export/dbEnv"), envConfig);

    // Open the database. Create it if it does not already exist.
    DatabaseConfig dbConfig = new DatabaseConfig();
    dbConfig.setAllowCreate(true);
    // Make it deferred write
    dbConfig.setDeferredWrite(true);
    myDatabase = myDbEnvironment.openDatabase(null, 
                                              "sampleDatabase", 
                                              dbConfig); 

    ...
    // do work
    ...
    // Do this if you want the work to persist across
    // program runs
    // myDatabase.sync();

    // then close the database and environment here
    // (see the next section)

} catch (DatabaseException dbe) {
    // Exception handling goes here
}
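The bulk-load pattern described earlier in this section (load in deferred write mode, sync, close, then reopen as a normal database) can be sketched as follows. The class name BulkLoad, the method name, and the generated "key"/"data" strings are placeholders for illustration. The sketch assumes a non-transactional environment and that no other handle to the database is open while its mode is switched.

```java
import com.sleepycat.je.Database;
import com.sleepycat.je.DatabaseConfig;
import com.sleepycat.je.DatabaseEntry;
import com.sleepycat.je.DatabaseException;
import com.sleepycat.je.Environment;

import java.nio.charset.StandardCharsets;

public class BulkLoad {

    // Load recordCount placeholder records in deferred write mode, make
    // them persistent, and return a handle reopened in normal mode.
    static Database loadThenReopen(Environment env, String name, int recordCount)
            throws DatabaseException {
        DatabaseConfig loadConfig = new DatabaseConfig();
        loadConfig.setAllowCreate(true);
        loadConfig.setDeferredWrite(true);

        Database db = env.openDatabase(null, name, loadConfig);
        try {
            for (int i = 0; i < recordCount; i++) {
                DatabaseEntry key = new DatabaseEntry(
                    ("key" + i).getBytes(StandardCharsets.UTF_8));
                DatabaseEntry data = new DatabaseEntry(
                    ("data" + i).getBytes(StandardCharsets.UTF_8));
                db.put(null, key, data);
            }
            // Without this call the loaded records are not guaranteed
            // to survive closing the database and the environment.
            db.sync();
        } finally {
            db.close();
        }

        // Reopen with the default (not deferred write) settings.
        return env.openDatabase(null, name, new DatabaseConfig());
    }
}
```

The caller remains responsible for closing the returned handle and the environment.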

Closing Databases

Once you are done using the database, you must close it. You use the Database.close() method to do this.

Closing a database causes it to become unusable until it is opened again. If any cursors are open for the database, JE warns you about the open cursors, and then closes them for you. Active cursors during a database close can cause unexpected results, especially if any of those cursors are writing to the database in another thread. You should always make sure that all your database accesses have completed before closing your database.

Remember that for the same reason, you should always close all your databases before closing the environment to which they belong.

Cursors are described in Using Cursors later in this manual.

The following code fragment illustrates closing a database and then its environment:

import com.sleepycat.je.DatabaseException;
import com.sleepycat.je.Database;
import com.sleepycat.je.Environment;

...

try {
    if (myDatabase != null) {
        myDatabase.close();
    }

    if (myDbEnvironment != null) {
        myDbEnvironment.close();
    }
} catch (DatabaseException dbe) {
    // Exception handling goes here
}