information. We outline some worst-case scenarios and justify our
implementation. Suppose log file 1 (listing “requests” with their
“sizes”) looks like:
while log file 2 looks like:
We report on the top 2 biggest requests, so the report from log 1
looks like:
while the report from log 2 would look like:
Now we change the superservice.cfg file to list the top 4 biggest
items. A naive merge of the two reports would lead to:
Of course, this should've been:
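The effect can be reproduced with a small sketch. The request names and sizes below are made up for illustration and are not the data of the example above; the naive merge simply pools the entries of both top-2 reports and is not Lire's actual merging code.

```python
import heapq

# Hypothetical request sizes, invented for this sketch.
log1 = {"a": 100, "b": 90, "c": 80, "d": 10}
log2 = {"e": 70, "f": 60, "g": 50, "h": 5}

def top_n(sizes, n):
    """Return the n biggest (request, size) pairs, biggest first."""
    return heapq.nlargest(n, sizes.items(), key=lambda item: item[1])

# Each report was generated with a limit of 2, so it only knows 2 entries.
report1 = top_n(log1, 2)
report2 = top_n(log2, 2)

# Naive merge: pool the stored entries and take the top 4.
naive_top4 = top_n(dict(report1 + report2), 4)

# What a top-4 report over both complete logs would contain.
true_top4 = top_n({**log1, **log2}, 4)

print(naive_top4)  # request "c" is missing: it never made it into report1
print(true_top4)
```

Request "c" belongs in the real top 4, but it was cut off when report 1 was limited to two entries, so no merge that looks only at the stored entries can recover it.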
This effect does not occur as long as the top limit keeps the same
value. However, when the report does not list distinct values from
the log but sums them, worse things can happen. Consider a report
on the total size by client. The logs look
like:
and
Reports from these logs would look like:
After naively merging, one would get:
In fact, the complete report should look like:
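A minimal sketch of the summing case follows. The client names and sizes are hypothetical, and the naive merge shown (adding up whatever entries each top-2 report happens to contain) is only an illustration of the problem, not Lire's algorithm.

```python
from collections import Counter

# Hypothetical total size per client in each log, invented for this sketch.
log1 = Counter({"x": 50, "y": 40, "z": 30})
log2 = Counter({"w": 45, "z": 35, "x": 5})

# Each report keeps only the top 2 clients of its own log.
report1 = Counter(dict(log1.most_common(2)))  # keeps x and y
report2 = Counter(dict(log2.most_common(2)))  # keeps w and z

# Naive merge: sum the stored entries, then take the top 2.
naive = (report1 + report2).most_common(2)

# Complete report: sum the full logs, then take the top 2.
complete = (log1 + log2).most_common(2)

print(naive)
print(complete)
```

Here the naive merge is wrong twice over: client "z", the biggest client overall, is absent, and the total reported for "x" is too low because its contribution in log 2 fell below that report's cutoff.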
Luckily, the Lire merging algorithm is not
this naive: the XML reports
store a few more records than are strictly needed. This
heuristic leads to sane merged reports in most cases.
However, since it is only a heuristic, it is not a
watertight guarantee.
See the description of the guess_extra_entries routine in the
Lire::AsciiDlf::Group manpage for more implementation details.
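The idea of storing extra records can be sketched as follows. This is an illustration under stated assumptions, not the guess_extra_entries routine: the fixed padding EXTRA, the helper names stored_report and merge, and the data are all hypothetical.

```python
from collections import Counter

EXTRA = 2  # hypothetical padding; Lire determines its own number of extra entries

# Hypothetical total size per client in each log, invented for this sketch.
log1 = Counter({"x": 50, "y": 40, "z": 30})
log2 = Counter({"w": 45, "z": 35, "x": 5})

def stored_report(log, limit, extra=EXTRA):
    """Keep limit + extra entries, although only `limit` would be displayed."""
    return Counter(dict(log.most_common(limit + extra)))

def merge(reports, limit):
    """Sum all stored entries and return the top `limit` of the result."""
    total = Counter()
    for report in reports:
        total += report
    return total.most_common(limit)

padded = merge([stored_report(log1, 2), stored_report(log2, 2)], 2)
unpadded = merge(
    [stored_report(log1, 2, extra=0), stored_report(log2, 2, extra=0)], 2
)

print(padded)    # matches the complete report for this data
print(unpadded)  # same as the naive merge
```

With the padding, the merged result matches the complete report for this data. A client that falls below even the padded cutoff in every individual log would still be lost, which is why the approach remains a heuristic.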