Berkeley DB Reference Guide: Transaction Protected Applications

Archival procedures

The third component of the infrastructure, archival procedures, concerns the recoverability of the database and the disk consumption of the database log files.

There are two separate aspects to these issues. First, you may want to periodically create snapshots (i.e., backups) of your databases to make it possible to recover them from catastrophic failure. Second, you'll want to periodically remove log files in order to conserve disk space. The two procedures are distinct from each other, and you cannot remove the current log files simply because you have created a database snapshot.

Archival for recovery:

To create a snapshot of your database that can be used to recover from catastrophic failure, the following steps should be taken:

1. Run db_archive -s to identify all of the database data files that must be saved, and copy them to a backup device (e.g., tape). If the database files are stored in a separate directory from the other Berkeley DB files, it may be simpler to archive the directory itself instead of the individual files.
   More importantly, if any of the database files have not been accessed during the lifetime of the current log files, db_archive will not list them in its output! For this reason, it may be important to use a separate database file directory, archiving it instead of the files listed by db_archive.
2. If your database is currently active, i.e., you are reading and writing to the database files while the snapshot is being taken, run db_archive -l to identify the database log files that you must save, and copy them to a backup device (e.g., tape). If the database log files are stored in a separate directory from the other database files, it may be simpler to archive the directory itself instead of the individual files.

Note that the order of these operations is important, and that the database files must be archived before the log files.
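The ordering requirement above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names list_files and snapshot are hypothetical, the backup "device" is modeled as an existing directory, and the file names printed by db_archive are assumed to be relative to the environment home passed with -h.

```python
import shutil
import subprocess
from pathlib import Path


def list_files(home, flag, run=subprocess.run):
    """Return the names printed by `db_archive <flag> -h <home>`, one per line."""
    result = run(
        ["db_archive", flag, "-h", str(home)],
        capture_output=True, text=True, check=True,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def snapshot(home, backup_dir, run=subprocess.run, copy=shutil.copy2):
    """Copy database files, then log files, into backup_dir (which must exist)."""
    home, backup_dir = Path(home), Path(backup_dir)
    copied = []
    # Order matters: database files (-s) are copied before the log files (-l)
    # are even listed, so logs written during the first copy are included.
    for flag in ("-s", "-l"):
        for name in list_files(home, flag, run):
            # Assumption: names are relative to the environment home.
            copy(home / name, backup_dir / Path(name).name)
            copied.append((flag, name))
    return copied
```

A call such as snapshot("/var/db/env", "/mnt/backup/today") would then copy both sets of files in the required order; both paths are placeholders. Because db_archive -s omits database files that were not accessed during the lifetime of the current logs, copying a dedicated database directory wholesale may still be the safer choice.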

The Berkeley DB library supports on-line backups, and it is not necessary to stop reading or writing your databases during the time when you create this snapshot. It is important to note, however, that the snapshot of an active database will be consistent as of some unspecified time between the start of the archival and when archival is completed.

To create a snapshot as of a specific time, you must stop reading and writing your databases for the entire time of the archival, force a checkpoint (see db_checkpoint), and then archive the files listed by the db_archive utility's -s and -l options.
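As a rough sketch of the point-in-time variant, the following Python function forces a checkpoint and then collects both file lists. It assumes a release in which db_checkpoint accepts the -1 flag (checkpoint once and exit); the function name files_to_archive is hypothetical, and stopping all readers and writers for the duration is left to the caller.

```python
import subprocess


def files_to_archive(home, run=subprocess.run):
    """Force a checkpoint, then return (database_files, log_files) to copy.

    The caller must stop all readers and writers first and keep them
    stopped until the files have been copied.
    """
    home = str(home)
    # Assumption: this release's db_checkpoint supports -1 (checkpoint once).
    run(["db_checkpoint", "-1", "-h", home], check=True)

    def listing(flag):
        out = run(
            ["db_archive", flag, "-h", home],
            capture_output=True, text=True, check=True,
        ).stdout
        return [line.strip() for line in out.splitlines() if line.strip()]

    # Database files first, then log files, as in the on-line procedure.
    return listing("-s"), listing("-l")
```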

Once these steps are completed, your database can be recovered from catastrophic failure to its state as of the time the archival was done. To update your snapshot so that recovery from catastrophic failure is possible up to a new point in time, repeat step #2, copying all existing log files to a backup device.

Each time that a complete snapshot is made, i.e., all database and log files are copied to backup media, you may discard all previous snapshots and saved log files.

The time to restore from catastrophic failure is a function of the number of log records that have been written since the snapshot was originally created. Perhaps more importantly, the more separate pieces of backup media you use, the more likely that you will have a problem reading from one of them. For these reasons, it is often best to make snapshots on a regular basis.

For archival safety, remember to keep multiple copies of your database backups, to verify that your archival media is error-free, and to store copies of your backups off-site!

To restore your database after catastrophic failure, the following steps should be taken:

1. Restore the copies of the database files from the backup media.
2. Restore the copies of the log files from the backup media, in the order in which they were written. (It's possible that the same log file appears on multiple backups, and you only want the most recent version of that log file!)
3. Run db_recover -c to recover the database.

It is possible to recreate the database in a location different from the original by specifying appropriate pathnames to the -h option of the db_recover utility.
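The log-file step of the restore is the easiest to get wrong when the same log file appears on several backups. The Python sketch below picks the most recent copy of each log file and then invokes catastrophic recovery. The function names are hypothetical, backups are modeled as lists of paths ordered oldest first, and sorting by name is assumed to match write order (which holds if log files follow the log.NNNNNNNNNN naming scheme).

```python
import subprocess
from pathlib import Path


def newest_log_copies(backups):
    """Return one path per log file, taken from the most recent backup.

    `backups` is a list of lists of paths, ordered oldest backup first.
    """
    newest = {}
    for files in backups:
        for path in files:
            newest[Path(path).name] = Path(path)  # later backups win
    # Assumption: sorting by file name gives the order the logs were written.
    return [newest[name] for name in sorted(newest)]


def recover(home, run=subprocess.run):
    """Run catastrophic recovery against the restored environment."""
    return run(["db_recover", "-c", "-h", str(home)], check=True)
```

After copying the database files and the selected log files into the target directory, recover("/restore/env") would run db_recover -c there; the path is a placeholder, and pointing -h at a new directory is how the database can be recreated in a different location.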

Archival to conserve log file space:

To remove log files, the following steps should be taken:

1. If you are concerned with catastrophic failure, first copy the log files to backup media (e.g., tape), as described above, because log files are necessary for recovery from catastrophic failure.
2. Run db_archive without options to identify all of the log files that are no longer in use (e.g., no longer involved in an active transaction).
3. Remove those log files from the system.
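The last two steps above might be automated along the following lines. This Python sketch is illustrative only: remove_unused_logs is a hypothetical name, the -h option is passed solely to locate the environment, and the names printed by db_archive are assumed to be relative to that directory. It does not perform the backup in the first step, which must already have been done if catastrophic recovery matters to you.

```python
import os
import subprocess
from pathlib import Path


def remove_unused_logs(home, run=subprocess.run, remove=os.remove):
    """Delete the log files that db_archive reports as no longer in use."""
    home = Path(home)
    out = run(
        ["db_archive", "-h", str(home)],
        capture_output=True, text=True, check=True,
    ).stdout
    removed = []
    for line in out.splitlines():
        name = line.strip()
        if not name:
            continue
        path = home / name  # assumption: names are relative to the home
        remove(path)
        removed.append(path)
    return removed
```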