Generally, the speed of a database system is measured by the transaction throughput, expressed as the number of transactions per second.
The two gating factors for Berkeley DB performance in a transactional system are usually the underlying database files and the log file. Both require disk I/O, which is extremely expensive relative to other resources such as CPU.
In the worst case scenario:
This means that for each transaction, Berkeley DB is potentially performing several filesystem operations:
There are a number of ways to increase transactional throughput, all of which attempt to decrease the number of filesystem operations per transaction:
If you are bottlenecked on logging, the following test will help you confirm that the number of transactions per second your application achieves is reasonable for the hardware on which you're running. Your test program should repeatedly perform the following operations:
The number of times that you can perform these three operations per second is a rough measure of the number of transactions per second of which the hardware is capable. This test simulates the operations applied to the log file. (As a simplifying assumption in this experiment, we assume that the database files are either on a separate disk or that they fit, with few exceptions, into the database cache.) We do not have to directly simulate updating the log file directory information, as it will normally be updated and flushed to disk as a result of flushing the log file write to disk.
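The timing loop for such a test might look like the following minimal sketch. It assumes the three operations are: seek to the beginning of a file, write a fixed-size buffer, and flush that write to disk with fsync. The function name log_write_rate and its parameters are hypothetical and are not part of Berkeley DB or of the distributed test program; nbytes and ops play roughly the role of the -b and -o values in the sample run.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * log_write_rate: hypothetical helper, not part of Berkeley DB.
 * Repeatedly seeks to the start of a scratch file, writes nbytes,
 * and flushes the write to disk. The scratch file is removed afterward.
 * Returns operations per second, or -1.0 on error.
 */
double
log_write_rate(const char *path, size_t nbytes, long ops)
{
    struct timeval start, end;
    double elapsed;
    char *buf;
    long i;
    int fd;

    if (nbytes == 0 || ops <= 0)
        return (-1.0);
    if ((buf = malloc(nbytes)) == NULL)
        return (-1.0);
    memset(buf, 0, nbytes);

    if ((fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600)) == -1) {
        free(buf);
        return (-1.0);
    }

    (void)gettimeofday(&start, NULL);
    for (i = 0; i < ops; ++i) {
        /* One simulated log write: seek, write, flush. */
        if (lseek(fd, (off_t)0, SEEK_SET) == -1 ||
            write(fd, buf, nbytes) != (ssize_t)nbytes ||
            fsync(fd) == -1) {
            (void)close(fd);
            (void)unlink(path);
            free(buf);
            return (-1.0);
        }
    }
    (void)gettimeofday(&end, NULL);

    (void)close(fd);
    (void)unlink(path);
    free(buf);

    elapsed = (double)(end.tv_sec - start.tv_sec) +
        (double)(end.tv_usec - start.tv_usec) / 1e6;
    if (elapsed <= 0.0)
        elapsed = 1e-6;        /* Guard against a coarse clock. */
    return ((double)ops / elapsed);
}
```

Calling log_write_rate("testfile.tmp", 256, 1000) from a small main function and printing the result gives a figure comparable to an ops-per-second number. The scratch file should live on the same disk as the log, otherwise the measurement says little about log throughput.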
Running this test program on reasonably standard commodity hardware (Pentium II CPU, SCSI disk) returned the following results:
    % testfile -b256 -o1000
    running: 1000 ops
    Elapsed time: 16.641934 seconds
    1000 ops: 60.09 ops per second
Note that the number of bytes being written to the log as part of each transaction can dramatically affect the transaction throughput. The above test run used 256 bytes, which is a reasonable size for a log write. (Your log writes may be different. To determine your average log write size, use the db_stat utility to display your log statistics.)
As a quick sanity check, for this particular disk, the average seek time is 9.4 msec, and the average latency is 4.17 msec. That results in a minimum requirement for a data transfer to the disk of 13.57 msec, or a maximum of 74 transfers per second. This is close enough to the above 60 operations per second (which wasn't done on a quiescent disk!) that the number is believable.
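The same sanity check can be repeated for other disks with a few lines of C. This is only a sketch of the arithmetic above: the function name max_transfers_per_sec is hypothetical, and it assumes each transfer costs one average seek plus one average rotational latency, ignoring the data transfer time itself.

```c
/*
 * max_transfers_per_sec: hypothetical helper for the sanity check.
 * Each transfer is assumed to cost one average seek plus one average
 * rotational latency, both in milliseconds.
 * Returns the upper bound on transfers per second, or -1.0 on bad input.
 */
double
max_transfers_per_sec(double seek_ms, double latency_ms)
{
    double per_transfer_ms;

    per_transfer_ms = seek_ms + latency_ms;
    if (per_transfer_ms <= 0.0)
        return (-1.0);
    return (1000.0 / per_transfer_ms);
}
```

For the disk described above, max_transfers_per_sec(9.4, 4.17) evaluates 1000 / 13.57, which rounds to the 74 transfers per second quoted in the text.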
The above example test program (for POSIX 1003.1 standard systems) is available here.
Future releases of Berkeley DB are expected to include group commit and the ability to stripe log files across multiple disks, both of which will offer additional options for increasing transaction performance.