The third component of the infrastructure, archival procedures, concerns the recoverability of the database and the disk consumption of the database log files.
There are two separate aspects to these issues. First, you may want to periodically create snapshots (i.e., backups) of your databases to make it possible to recover them from catastrophic failure. Second, you'll want to periodically remove log files in order to conserve disk space. The two procedures are distinct from each other, and you cannot remove the current log files simply because you have created a database snapshot.
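To create a snapshot of your database that can be used to recover from catastrophic failure, take the following steps:
1. Run db_archive -s to identify the database files, and copy them to a backup device.
2. Run db_archive -l to identify all existing log files, and copy them to a backup device.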
Be aware that if any of the database files have not been accessed during the lifetime of the current log files, db_archive will not list them in its output! For this reason, it may be safer to keep your database files in a separate directory and archive that entire directory, rather than only the files listed by db_archive.
Note that the order of these operations is important, and that the database files must be archived before the log files.
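The ordering described above can be sketched in Python. This is a minimal illustration under stated assumptions, not part of the Berkeley DB distribution: it assumes that db_archive is on the PATH, that its -h option names the database home directory, and that it prints one pathname per line relative to that directory. The names list_files and snapshot, and the injectable run parameter, are hypothetical helpers introduced only for this example.

```python
import shutil
import subprocess
from pathlib import Path


def list_files(option, env_home, run=subprocess.run):
    """Return the names printed by `db_archive <option> -h <env_home>`."""
    result = run(
        ["db_archive", option, "-h", str(env_home)],
        check=True, capture_output=True, text=True,
    )
    return [ln.strip() for ln in result.stdout.splitlines() if ln.strip()]


def snapshot(env_home, backup_dir, run=subprocess.run):
    """Copy database files first, then log files, into backup_dir."""
    env_home, backup_dir = Path(env_home), Path(backup_dir)
    copied = []
    # Order matters: database files (-s) are archived before log files (-l).
    for option in ("-s", "-l"):
        for name in list_files(option, env_home, run):
            # Assumption: names are relative to env_home.
            dest = backup_dir / name
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(env_home / name, dest)
            copied.append(name)
    return copied
```

Passing run as a parameter keeps the sketch testable without a live database environment; in ordinary use the default, subprocess.run, invokes the real utility.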
The Berkeley DB library supports on-line backups, and it is not necessary to stop reading or writing your databases during the time when you create this snapshot. It is important to note, however, that the snapshot of an active database will be consistent as of some unspecified time between the start of the archival and when archival is completed.
To create a snapshot as of a specific time, you must stop reading and writing your databases for the entire time of the archival, force a checkpoint (see db_checkpoint), and then archive the files listed by the db_archive utility's -s and -l options.
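As a rough sketch of the point-in-time procedure, the following Python collects the files to archive after forcing a checkpoint. It assumes that all readers and writers have already been stopped by other means, that the utilities are on the PATH, and that your version of db_checkpoint accepts a -1 option to checkpoint once and exit; check the db_checkpoint documentation for your release. The function names and the run parameter are hypothetical.

```python
import subprocess


def quiesced_snapshot_commands(env_home):
    """Commands to run, in order, once the database is idle."""
    home = str(env_home)
    return [
        ["db_checkpoint", "-1", "-h", home],  # assumed flag: checkpoint once
        ["db_archive", "-s", "-h", home],     # list database files
        ["db_archive", "-l", "-h", home],     # list log files
    ]


def files_to_archive(env_home, run=subprocess.run):
    """Force a checkpoint, then return database files followed by log files."""
    names = []
    for cmd in quiesced_snapshot_commands(env_home):
        result = run(cmd, check=True, capture_output=True, text=True)
        if cmd[0] == "db_archive":
            names.extend(
                ln.strip() for ln in result.stdout.splitlines() if ln.strip()
            )
    return names
```

The returned names still have to be copied to backup media before reading and writing resume.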
Once these steps are completed, your database can be recovered from catastrophic failure to its state as of the time the archival was done. To update your snapshot so that recovery from catastrophic failure is possible up to a new point in time, repeat step #2, copying all existing log files to a backup device.
Each time that a complete snapshot is made, i.e., all database and log files are copied to backup media, you may discard all previous snapshots and saved log files.
The time to restore from catastrophic failure is a function of the number of log records that have been written since the snapshot was originally created. Perhaps more importantly, the more separate pieces of backup media you use, the more likely it is that you will have a problem reading from one of them. For these reasons, it is often best to make snapshots on a regular basis.
For archival safety, remember to keep multiple copies of your database backups, to verify that your archival media are error-free, and to store copies of your backups off-site!
To restore your database after catastrophic failure, the following steps should be taken:
It is possible to recreate the database in a location different from the original by specifying the appropriate pathname to the -h option of the db_recover utility.
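One way to picture a restore into a new location is the Python sketch below. It is an illustration under assumptions rather than a prescribed procedure: it assumes the backup directory holds the archived database files together with the saved log files, that db_recover is on the PATH, and that its -c option requests catastrophic recovery. The function name restore and the run parameter are hypothetical, and dirs_exist_ok requires Python 3.8 or later.

```python
import shutil
import subprocess
from pathlib import Path


def restore(backup_dir, target_home, run=subprocess.run):
    """Copy archived files into target_home, then run catastrophic recovery."""
    target = Path(target_home)
    # Put the archived database files and log files back in place.
    shutil.copytree(backup_dir, target, dirs_exist_ok=True)
    # -h points db_recover at the (possibly new) database home directory.
    cmd = ["db_recover", "-c", "-h", str(target)]
    run(cmd, check=True)
    return cmd
```

Because target_home is passed straight to -h, the same call works whether the database is being restored in place or in a different directory.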
To remove log files, the following steps should be taken:
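As a hedged illustration of log removal, the Python sketch below assumes that db_archive, run with no options other than -h, lists the log files that are no longer in use, one relative pathname per line. It also assumes that you want each such file copied to backup media before it is deleted whenever recovery from catastrophic failure must remain possible. The function name prune_logs and its parameters are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def prune_logs(env_home, backup_dir=None, run=subprocess.run):
    """Optionally back up, then delete, log files db_archive reports as unused."""
    env_home = Path(env_home)
    result = run(
        ["db_archive", "-h", str(env_home)],
        check=True, capture_output=True, text=True,
    )
    removed = []
    for line in result.stdout.splitlines():
        name = line.strip()
        if not name:
            continue
        log = env_home / name
        if backup_dir is not None:
            # Save a copy first so catastrophic recovery stays possible.
            Path(backup_dir).mkdir(parents=True, exist_ok=True)
            shutil.copy2(log, Path(backup_dir) / log.name)
        log.unlink()
        removed.append(name)
    return removed
```

Leaving backup_dir as None deletes the unused logs without saving them, which is only appropriate when catastrophic recovery is not a requirement.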