Application-specific logging and recovery

Berkeley DB includes tools to assist in the development of application-specific logging and recovery. Specifically, given a description of the information to be logged, these tools will automatically create logging functions (functions that take the values as parameters and construct a single record that is written to the log), read functions (functions that read a log record and unmarshall the values into a structure that maps onto the values you chose to log), a print function (for debugging), templates for the recovery functions, and automatic dispatching to your recovery functions.

Defining Application-Specific Operations

Log records are described in files named XXX.src, where "XXX" is a unique prefix. The prefixes currently used in the Berkeley DB package are btree, crdel, db, hash, log, qam, and txn. These files contain interface definition language descriptions for each type of log record that is supported.

All lines beginning with a hash character in .src files are treated as comments.

The first non-comment line in the file should begin with the keyword PREFIX, followed by a string that will be prepended to every function. Frequently, the PREFIX is either identical or similar to the name of the .src file.

The rest of the file consists of one or more log record descriptions. Each log record description begins with the line:

    BEGIN RECORD_NAME RECORD_NUMBER

and ends with the line:

    END

The RECORD_NAME variable should be replaced with a unique record name for this log record. Record names must be unique within .src files.

The RECORD_NUMBER variable should be replaced with a record number. Record numbers must be unique for an entire application; that is, both application-specific and Berkeley DB log records must have unique values. Further, because record numbers are stored in log files, which often must be portable across application releases, no record number should ever be reused.
The record number space below 10,000 is reserved for Berkeley DB itself; applications should choose record number values equal to or greater than 10,000.

Between the BEGIN and END keywords there should be one line for each data item that will be logged in this log record. The format of these lines is as follows:

    ARG | DBT | POINTER variable_name variable_type printf_format

The keyword ARG indicates that the argument is a simple parameter of the type specified. The keyword DBT indicates that the argument is a DBT containing a length and pointer to a byte string. The keyword POINTER indicates that the argument is a pointer to the data type specified, and the entire type should be logged. The variable name is the field name within the structure that will be used to refer to this item. The variable type is the C type of the variable, and the printf format should be "s" for string, "d" for signed integral type, or "u" for unsigned integral type.

Automatically Generated Functions

The .src file is processed by gen_rec.awk, an awk script in the dist directory. It is run from the dist directory with the following command line:

    awk -f gen_rec.awk \
        -v subsystem=PREFIX \
        -v source_file=C_FILE \
        -v header_file=H_FILE \
        -v template_file=TMP_FILE < XXX.src

where PREFIX is the name specified in the PREFIX line in the .src file, C_FILE is the name of the file into which to place the automatically generated C code, H_FILE is the name of the file into which to place the automatically generated data structures and declarations, and TMP_FILE is the name of the file into which to place a template for the recovery routines.
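As an illustration of the description format, a minimal .src file for a hypothetical application might look like the following. The prefix myapp, the record name rename, the record number, and the field names are all assumptions made up for this sketch; only the keywords and the line layout come from the description above, and the fragment has not been checked against any particular release of gen_rec.awk.

```text
# Hypothetical application log records; every name here is illustrative.
PREFIX myapp

# The record number is at or above 10,000, the range left to applications.
BEGIN rename 10001
ARG  opcode   u_int32_t  u
DBT  oldname  DBT        s
DBT  newname  DBT        s
END
```

If this file were saved as myapp.src, the command line shown earlier would be run with subsystem=myapp and myapp.src as its input.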
For example, in building the Berkeley DB logging and recovery routines for hash, the following command line is used:

    awk -f gen_rec.awk \
        -v subsystem=hash \
        -v source_file=../hash/hash_auto.c \
        -v header_file=../include_auto/hash_auto.h \
        -v template_file=template/rec_hash < hash.src

For each log record description found in the .src file, the following structure declarations and #defines will be created in the file header_file:

    #define DB_PREFIX_RECORD_TYPE /* Integer ID number */

The template_file will contain a template for a recovery function. The recovery function is called on each record read from the log during system recovery or transaction abort and is expected to redo or undo the operations described by that record. The details of the recovery function will be specific to the record being logged and need to be written manually, but the template provides a good starting point. The template file should be copied to a source file in the application (but not the automatically generated source_file, as that will get overwritten each time gen_rec.awk is run) and fully developed there. The recovery function takes the following parameters:
In addition to the header_file and template_file, a source_file is created, containing a log, read, recovery, print, and getpgnos function for each record type. The log function marshalls the parameters into a buffer and calls DB_ENV->log_put on that buffer, returning 0 on success and non-zero on failure. The log function takes the following parameters:
The read function takes a buffer and unmarshalls its contents into a structure of the appropriate type. It returns 0 on success and non-zero on error. After the fields of the structure have been used, the pointer returned from the read function should be freed. The read function takes the following parameters:
The print function displays the contents of the record. The print function takes the same parameters as the recovery function described previously. Although some of the parameters are unused by the print function, taking the same parameters allows a single dispatch loop to dispatch to a variety of functions. The print function takes the following parameters:
The getpgnos function processes a log record and returns the set of pages accessed by the record. This function will not need to do anything for most application-specific log records. The getpgnos function takes the same parameters as the recovery function described previously. Although some of the parameters are unused by the getpgnos function, taking the same parameters allows a single dispatch loop to dispatch to a variety of functions. The getpgnos function takes the following parameters:
Three additional functions are also created for each .src file. These are initialization functions for the print routines, the getpgnos routines, and the recovery routines. All three initialization functions take a single parameter:
The print initialization function registers the print routines for each log record type declared with the dispatch system, so the appropriate function is called from the dispatch loop. The getpgnos initialization function registers the getpgnos routines for each log record type declared with the dispatch system, so the appropriate function is called from the dispatch loop. The recovery initialization function registers the recovery routines for each log record type declared with the dispatch system, so the appropriate function is called from the dispatch loop.

Using Automatically Generated Routines

Applications use the automatically generated functions, as follows:
The recovery functions are called in the three following cases:
For each log record type you declare, you must write the appropriate function to undo and redo the modifications. The shell of these functions will be generated for you automatically, but you must fill in the details. Your code must be able to detect whether the described modifications have been applied to the data. The function will be called with the "op" parameter set to DB_TXN_ABORT when a transaction that wrote the log record aborts, with DB_TXN_FORWARD_ROLL and DB_TXN_BACKWARD_ROLL during recovery, and with DB_TXN_APPLY on a replicated client. The actions for DB_TXN_ABORT and DB_TXN_BACKWARD_ROLL should generally be the same, and the actions for DB_TXN_FORWARD_ROLL and DB_TXN_APPLY should generally be the same.

For example, each access method database page contains the log sequence number of the most recent log record that describes a modification to the page. When the access method changes a page, it writes a log record describing the change and including the log sequence number (LSN) that was on the page before the change. This LSN is referred to as the previous LSN.

The recovery functions read the page described by a log record, and compare the LSN on the page to the LSN they were passed. If the page LSN is less than the passed LSN and the operation is an undo, no action is necessary (because the modifications have not been written to the page). If the page LSN is the same as the previous LSN and the operation is a redo, the actions described are reapplied to the page. If the page LSN is equal to the passed LSN and the operation is an undo, the actions are removed from the page; if the page LSN is greater than the passed LSN and the operation is a redo, no further action is necessary. If the action is a redo and the LSN on the page is less than the previous LSN in the log record, it is an error because it could happen only if some previous log record was not processed.
Please refer to the internal recovery functions in the Berkeley DB library (found in files named XXX_rec.c) for examples of the way recovery functions should work.

Non-conformant Logging

If your application cannot conform to the default logging and recovery structure, you will have to create your own logging and recovery functions explicitly.

If you do not use the default recovery system, you need to construct your own recovery process based on the recovery program provided in db_recover/db_recover.c. Note that your recovery functions need to correctly process the log records produced by calls to DB_ENV->txn_begin and DB_TXN->commit.