Berkeley DB Reference Guide: Transaction Protected Applications

Introduction

So far, this Reference Guide has discussed how applications can use the Berkeley DB Access Methods to store data items in, and retrieve them from, database files. This chapter discusses how to write applications that include transactional support. First, here are a few terms that will be helpful:

System or application failure

We use this phrase to describe something bad happening near your data. It can be an application dumping core, being interrupted by a signal, the disk filling up, or the entire system crashing. Whatever the cause, the application can no longer make forward progress.

Transaction

A transaction is a group of changes, to one or more databases, that should be treated as a single unit of work. That is, either all of the changes must be applied to the database(s) or none of them should. The application specifies when each transaction starts, what database changes are included in it, and when it ends.

Transaction commit

Every transaction ends by committing or aborting. If a transaction commits, then Berkeley DB guarantees that the database changes inside it will never be lost, even after system or application failure. If a transaction aborts, or is unfinished when the system or application fails, then the changes involved will never appear in the database.

Deadlock

Deadlock, in its simplest form, happens when one thread of control owns resource A, but needs resource B, while another thread of control owns resource B, but needs resource A. Neither thread of control can make progress, and so one has to give up and release all of its resources, at which point the other can again make forward progress.
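
The two-thread situation described above can be modeled as a cycle in a "waits for" relationship. The following is a minimal sketch in Python, not Berkeley DB code: the function name find_deadlock, the thread names T1 and T2, and the dictionaries are all hypothetical and exist only for illustration.

```python
def find_deadlock(owners, waiting):
    """Return a list of threads that form a wait-for cycle, or None.

    owners:  resource -> thread that currently owns it
    waiting: thread   -> resource the thread is blocked on
    (Both dictionaries are a toy stand-in for a real lock table.)
    """
    for start in waiting:
        path = []
        thread = start
        # Follow the chain "thread waits for a resource owned by the next thread".
        while thread in waiting and thread not in path:
            path.append(thread)
            thread = owners.get(waiting[thread])
        if thread == start:
            return path
    return None


# T1 owns A and needs B, while T2 owns B and needs A.
owners = {"A": "T1", "B": "T2"}
waiting = {"T1": "B", "T2": "A"}
cycle = find_deadlock(owners, waiting)
print(cycle)  # ['T1', 'T2']

# One thread gives up: it releases everything it owns and stops waiting.
victim = cycle[-1]
owners = {res: thread for res, thread in owners.items() if thread != victim}
del waiting[victim]
print(find_deadlock(owners, waiting))  # None
```

Once the victim has released its resources, the cycle is gone and the remaining thread can acquire what it needs.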

Recovery

Whenever system or application failure occurs, the application must run recovery. Recovery is what makes the database consistent: it reviews the log files and the databases to ensure that the changes from each committed transaction appear in the database, and that none of the changes from any unfinished (or aborted) transactions do.

To run recovery, all applications in the environment must exit Berkeley DB. (Normally the applications will stop running entirely; in the case of system failure, of course, they always do.)

Once no application is still using the database environment, recovery is performed by calling the Berkeley DB interface with special flags (either DB_RECOVER or DB_RECOVER_FATAL).

When recovery is complete, the database environment is again available for normal use.

Write-ahead logging

This term describes the underlying implementation that Berkeley DB uses to ensure recovery. Before any change is made to a database, information about the change is written to the database log. During recovery, the log is read, and the databases are checked to ensure that the changes described in the log for committed transactions appear in the database. Changes that appear in the database but belong to aborted or unfinished transactions in the log are undone.
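
The log-before-change rule and the two passes of recovery can be illustrated with a toy model. This is only a sketch in Python under simplifying assumptions, not how Berkeley DB is implemented: the class ToyWAL and its methods are hypothetical, the "log" and "database" are in-memory objects, and it assumes that locking prevents two unfinished transactions from changing the same key.

```python
_ABSENT = object()  # marks "key did not exist before this change"


class ToyWAL:
    """Toy write-ahead log: every change is logged before it is applied."""

    def __init__(self):
        self.log = []  # stands in for the durable log files
        self.db = {}   # stands in for the database file

    def put(self, txn, key, value):
        # Log first, recording the old value so the change can be undone.
        old = self.db.get(key, _ABSENT)
        self.log.append(("put", txn, key, old, value))
        self.db[key] = value

    def commit(self, txn):
        self.log.append(("commit", txn))

    def recover(self):
        committed = {rec[1] for rec in self.log if rec[0] == "commit"}
        # Redo pass: make sure committed changes appear in the database.
        for rec in self.log:
            if rec[0] == "put" and rec[1] in committed:
                self.db[rec[2]] = rec[4]
        # Undo pass: remove changes from unfinished transactions, newest first.
        for rec in reversed(self.log):
            if rec[0] == "put" and rec[1] not in committed:
                if rec[3] is _ABSENT:
                    self.db.pop(rec[2], None)
                else:
                    self.db[rec[2]] = rec[3]


wal = ToyWAL()
wal.put("t1", "apple", "red")
wal.commit("t1")
wal.put("t2", "pear", "green")  # t2 never commits

# Simulate a crash: the committed change never reached the database,
# but the uncommitted one did.
wal.db = {"pear": "green"}
wal.recover()
print(wal.db)  # {'apple': 'red'}
```

After recovery, the committed change from t1 is present and the change from the unfinished transaction t2 is gone, which is the outcome the definition above describes.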

There are a number of reasons for using transactional support in your programs. The most common are:

Recoverability: Applications often need to ensure that, no matter how the system or application fails, previously saved data is available the next time the application runs.
Deadlock avoidance: When multiple threads of control are changing the database at the same time, there is usually the possibility of deadlock. Deadlock is resolved by having one of the transactions involved release the resources it owns so that the other one can proceed. (The transaction that releases its resources is usually just tried again later.) Transactions are necessary here so that any changes that have already been made to the database can be undone, and the database doesn't end up corrupted.
Atomicity: Applications often need to make multiple changes, but want to ensure that either all of the changes happen, or none of them. Transactions guarantee that a group of changes is atomic, i.e., that if the application or system fails, either all of the changes will appear when the application next runs, or none of them.
Repeatable reads: Applications sometimes need to ensure that, while doing a group of operations on a database, the value returned by a database retrieval doesn't change, i.e., if you retrieve the same key more than once, the data item will be the same each time. Transactions guarantee this behavior.
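
The "undo and try again later" pattern behind deadlock avoidance and atomicity can be sketched as a retry loop. The following Python is a toy model under stated assumptions, not the Berkeley DB API: DeadlockError, run_transaction, and the account names are hypothetical, and an abort is modeled by discarding a private working copy rather than by undoing changes from a log.

```python
class DeadlockError(Exception):
    """Raised when this transaction is chosen to give up its resources."""


def run_transaction(db, work, max_retries=3):
    """Apply work() to db atomically, retrying if it is a deadlock victim.

    Returns the number of attempts that were needed.
    """
    for attempt in range(max_retries):
        pending = dict(db)  # private working copy of the data
        try:
            work(pending)
        except DeadlockError:
            continue  # abort: partial changes in 'pending' are discarded
        # Commit: publish every change at once.
        db.clear()
        db.update(pending)
        return attempt + 1
    raise DeadlockError("gave up after %d attempts" % max_retries)


calls = []


def transfer(accounts):
    accounts["checking"] -= 10  # first half of the change
    calls.append(1)
    if len(calls) == 1:
        # Simulate being picked as the deadlock victim on the first try.
        raise DeadlockError("chosen as deadlock victim")
    accounts["savings"] += 10   # second half of the change


db = {"checking": 100, "savings": 0}
tries = run_transaction(db, transfer)
print(tries, db)  # 2 {'checking': 90, 'savings': 10}
```

The first attempt is interrupted after only half of the transfer, yet none of it reaches the database; the second attempt applies both halves together.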