JE Lock Management

Managing JE Lock Timeouts
Managing Deadlocks and other Lock Conflicts

To manage locks in JE, you must do two things:

  1. Manage lock timeouts.

  2. Detect and respond to lock conflicts. Conceptually, these are deadlocks. From a coding point of view, however, there is no difference between what you do if a lock times out and what you do if you encounter a deadlock. In fact, in JE, you cannot tell the difference based on the exceptions that are thrown.

Managing JE Lock Timeouts

As with transaction timeouts (see Configuring the Transaction Subsystem), JE allows you to specify the longest period of time that it is allowed to hold a lock. This value plays an important part in deadlock detection, because the only way JE can identify a deadlock is when a lock is held past its timeout value.

However, unlike transaction timeouts, lock timeouts run on a true timer. Transaction timeouts are only identified when JE has a reason to examine its lock table; that is, when it is attempting to acquire a lock. If no such activity is occurring in your application, a transaction can exist for a long time past its expiration timeout. In contrast, lock timeouts are managed by a timer maintained by the JVM. Once this timer has expired, your application is notified of the event (see Managing Deadlocks and other Lock Conflicts for more information).
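The difference between a lazily examined timeout and a timer-driven one can be modeled in plain Java. The following is a conceptual sketch only: TimeoutStyles, LazyTimeout, and timerFires are hypothetical names, the scheduled executor is an assumed stand-in for "a timer maintained by the JVM", and none of this code uses or reproduces JE's internals.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Conceptual model only; none of these types belong to JE.
class TimeoutStyles {

    // Lazily checked timeout: expiry is noticed only when check() is
    // called, which models a timeout examined during lock acquisition.
    static final class LazyTimeout {
        private final long deadlineMicros;
        private boolean noticed = false;

        LazyTimeout(long startMicros, long timeoutMicros) {
            this.deadlineMicros = startMicros + timeoutMicros;
        }

        // Called with the current time whenever a lock is requested.
        boolean check(long nowMicros) {
            if (nowMicros >= deadlineMicros) {
                noticed = true;
            }
            return noticed;
        }

        boolean hasBeenNoticed() {
            return noticed;
        }
    }

    // Timer-driven timeout: a scheduled task fires without any other
    // activity. Returns true if it fired within maxWaitMillis.
    static boolean timerFires(long delayMillis, long maxWaitMillis) {
        ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor();
        CountDownLatch fired = new CountDownLatch(1);
        Runnable onTimeout = fired::countDown;
        try {
            timer.schedule(onTimeout, delayMillis, TimeUnit.MILLISECONDS);
            return fired.await(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            timer.shutdownNow();
        }
    }

    public static void main(String[] args) {
        LazyTimeout txnTimeout = new LazyTimeout(0, 500_000);
        // The deadline may pass, but nothing has examined the timeout yet.
        System.out.println(txnTimeout.hasBeenNoticed());
        // The next lock request examines it and notices the expiry.
        System.out.println(txnTimeout.check(2_000_000));
        // The timer-driven variant fires on its own.
        System.out.println(timerFires(50, 5_000));
    }
}
```

In this model the lazy timeout stays unnoticed until something calls check(), however late that is, while the timer-driven task needs no further activity to fire.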

You can set the lock timeout on a transaction-by-transaction basis, or for the entire environment. To set it for a single transaction, use Transaction.setLockTimeout(). To set it for your entire environment, use EnvironmentConfig.setLockTimeout() or use the je.lock.timeout parameter in the je.properties file.

The value that you specify for the lock timeout is in microseconds. The default is 500000 (half a second).

Note that changing this value can have an effect on your application's performance. If you set it too low, locks may expire and be considered deadlocked even though the thread is in fact making forward progress. This will cause your application to abort and retry transactions unnecessarily, which can ultimately harm application throughput. If you set it too high, threads may deadlock for too long before your application receives notification and is able to take corrective action. Again, this can harm application throughput.

Note that for applications in which you will have extremely long-lived locks, you may want to set this value to 0. Doing so disables lock timeouts entirely. Be aware that disabling lock timeouts can be dangerous because then your application will never be notified of deadlocks. So, alternatively, you might want to set this value to a very large timeout (such as ten minutes) if your application is using extremely long-lived locks.
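Because the timeout is expressed in microseconds, it is easy to be off by a factor of a thousand. The following minimal sketch uses only java.util.concurrent.TimeUnit to derive the values mentioned in this section; the class name LockTimeoutUnits and its members are illustrative and are not part of JE.

```java
import java.util.concurrent.TimeUnit;

// Illustrative helper; not part of the JE API.
class LockTimeoutUnits {
    // Default lock timeout stated in this section, in microseconds.
    static final long DEFAULT_LOCK_TIMEOUT_MICROS = 500_000L;

    // Convert a duration in any unit to microseconds.
    static long toMicros(long duration, TimeUnit unit) {
        return unit.toMicros(duration);
    }

    public static void main(String[] args) {
        // 500 milliseconds, which equals the default value.
        System.out.println(toMicros(500, TimeUnit.MILLISECONDS)); // 500000
        // A very large timeout of ten minutes.
        System.out.println(toMicros(10, TimeUnit.MINUTES));       // 600000000
    }
}
```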

Managing Deadlocks and other Lock Conflicts

A deadlock is the result of a lock conflict that cannot be resolved by the underlying JE code before the lock times out. Generically, we consider this situation a lock conflict because there is no way to tell if the lock timed out because of a true deadlock, or if it timed out because a long-running operation simply held the lock for too long a period of time.

When a lock conflict occurs in JE, the thread of control that encountered the conflict is notified of the event by a LockConflictException. Note that this exception is actually a common base class for several exception classes that might give you more of a hint as to what the actual problem is. However, your response to any of these exceptions is probably going to be the same, so the best thing to do is simply catch and manage LockConflictException.

When a LockConflictException is thrown, the thread must:

  1. Cease all read and write operations.

  2. Close all open cursors.

  3. Abort the transaction.

  4. Optionally retry the operation. If your application retries operations that are aborted due to a lock conflict, the new attempt must be made using a new transaction.

Note

If a thread has encountered a lock conflict, it must not make any additional database calls using the transaction handle that experienced the lock conflict.

For example:

// retry_count is a counter used to identify how many times
// we've retried this operation. To avoid the potential for 
// endless looping, we won't retry more than MAX_DEADLOCK_RETRIES 
// times.

// txn is a transaction handle.
// key and data are DatabaseEntry handles. Their usage is not shown here.
while (retry_count < MAX_DEADLOCK_RETRIES) {
    try {
        txn = myEnv.beginTransaction(null, null);
        myDatabase.put(txn, key, data);
        txn.commit();
        return 0;
    } catch (LockConflictException le) {
        try {
            // Abort the transaction and increment the
            // retry counter
            if (txn != null) {
                txn.abort();
            }
            retry_count++;
            if (retry_count >= MAX_DEADLOCK_RETRIES) {
                System.err.println("Exceeded retry limit. Giving up.");
                return -1;
            }
        } catch (DatabaseException ae) {
            System.err.println("txn abort failed: " + ae.toString());
            return -1;    
        }
    } catch (DatabaseException e) {
        // If we catch a generic DatabaseException instead of
        // a LockConflictException, we simply abort and give
        // up -- we don't retry the operation.
        try {
            // Abort the transaction.
            if (txn != null) {
                txn.abort();
            }
        } catch (DatabaseException ae) {
            System.err.println("txn abort failed: " + ae.toString());
        }
        return -1;    
    }
}
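
The fragment above needs an open environment and database to run. The same control flow can be exercised without JE by substituting stand-in types, as in the following sketch. Everything in it is hypothetical: ConflictException plays the role of LockConflictException, Txn plays the role of a transaction handle, and TxnSource plays the role of beginTransaction(). It is meant to illustrate the abort-and-retry steps, not to serve as a JE implementation.

```java
// Stand-in types only: nothing here is part of the JE API.
class LockConflictRetrySketch {

    // Hypothetical stand-in for LockConflictException.
    static class ConflictException extends RuntimeException {
        ConflictException(String message) { super(message); }
    }

    // Hypothetical stand-in for a transaction handle.
    interface Txn {
        void commit();
        void abort();
    }

    // Hypothetical factory, playing the role of beginTransaction().
    interface TxnSource {
        Txn begin();
    }

    // The read and write operations performed under one transaction.
    interface Work {
        void run(Txn txn);
    }

    // Returns true if the work committed, false if it failed for another
    // reason or the retry limit was reached.
    static boolean runWithRetry(TxnSource source, Work work, int maxRetries) {
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            Txn txn = source.begin(); // every attempt gets a new transaction
            try {
                work.run(txn);
                txn.commit();
                return true;
            } catch (ConflictException e) {
                txn.abort(); // lock conflict: abort, then loop to retry
            } catch (RuntimeException e) {
                txn.abort(); // any other failure: abort and give up
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] calls = new int[1];
        TxnSource source = () -> new Txn() {
            public void commit() { System.out.println("commit"); }
            public void abort() { System.out.println("abort"); }
        };
        // Simulate two lock conflicts followed by a successful attempt.
        Work work = txn -> {
            if (calls[0]++ < 2) {
                throw new ConflictException("simulated lock conflict");
            }
        };
        System.out.println(runWithRetry(source, work, 5));
    }
}
```

Creating the transaction inside the loop body keeps each attempt on a fresh handle, so a handle that has experienced a conflict is never reused.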