Bug #47
No sync to disk after timestamp index update
Description
In gdplogd timestamp index maintenance, there ought to be a DB->sync() after any update (see source:gdplogd/logd_disklog.c#L294). At present, after a record append, the last-modified timestamp on the files in the log server's filesystem shows that the gdptidx file is not updated for a while (it probably stays cached in memory). Read-by-timestamp works well, which means that the index is updated (at least in memory).
History
#1 Updated by Eric Allman almost 7 years ago
- Status changed from New to Resolved
DB->sync does an fsync system call, which is slow and expensive. Doing three on every write (since it would have to happen for the data file and both indices) is not a good idea.
On the other hand, I have changed the code so that all three files are synced when the log is closed, and more importantly I close all logs when the daemon shuts down. This should reduce the window somewhat.
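For illustration only, the sync-on-close behavior described above could look roughly like the sketch below. It is not the gdplogd code: the struct and function names are hypothetical, and plain file descriptors with fsync stand in for the Berkeley DB handles that would really be flushed with DB->sync.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical stand-in for one open log: a data file plus two index files.
 * Real index files would be Berkeley DB handles, not raw descriptors. */
struct log_files {
    int data_fd;    /* record data file */
    int idx_fd[2];  /* the two index files, e.g. the timestamp index */
};

/* Flush and close every open file of the log.
 * Keeps going after a failure so that one bad file does not leave the
 * others unsynced. Returns 0 on success, -1 if any fsync or close failed. */
int log_files_sync_and_close(struct log_files *lf)
{
    assert(lf != NULL);

    int *fds[3] = { &lf->data_fd, &lf->idx_fd[0], &lf->idx_fd[1] };
    int rc = 0;

    for (int i = 0; i < 3; i++) {
        if (*fds[i] < 0)
            continue;           /* not open, nothing to do */
        if (fsync(*fds[i]) != 0)
            rc = -1;
        if (close(*fds[i]) != 0)
            rc = -1;
        *fds[i] = -1;           /* mark as closed */
    }
    return rc;
}
```

A daemon shutdown path would call something like this once per open log, so the three fsync calls are paid once per close rather than once per append.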
Nitesh and I discussed having some sort of "fsck for logs" functionality. That could be done either as a separate program that is run before gdplogd starts up or as a part of gdplogd as it starts up. The advantage of the former is that it reduces the code in the daemon, which we want to keep as small as possible. The advantage of the latter is that it might enable recovery while the server is running (probably if it detects a corrupt log). At some point we definitely need such a utility.
No matter how it is implemented, any "fsck for logs" has the potential for being very time consuming, since it might have to read the entire data file in order to rebuild the indices. One way around that might be to sync the on-disk files every N writes or M milliseconds, thus bounding the likely corruption. However, doing a thorough scan would still require reading all of the files in their entirety.
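The "every N writes or M milliseconds" idea can be sketched as a small bookkeeping structure that only decides when a sync is due and leaves the actual flush to the caller. All names here are hypothetical and nothing below is taken from gdplogd; timestamps are assumed to come from a monotonic millisecond clock supplied by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical policy state for bounding the amount of unsynced data. */
struct sync_policy {
    uint32_t max_writes;        /* N: sync after this many unsynced writes */
    uint64_t max_age_ms;        /* M: sync once the oldest unsynced write is this old */
    uint32_t pending_writes;    /* writes recorded since the last sync */
    uint64_t first_pending_ms;  /* time of the oldest unsynced write */
};

void sync_policy_init(struct sync_policy *p, uint32_t max_writes, uint64_t max_age_ms)
{
    assert(p != NULL);
    p->max_writes = max_writes;
    p->max_age_ms = max_age_ms;
    p->pending_writes = 0;
    p->first_pending_ms = 0;
}

/* True if there is unsynced data and either limit has been reached.
 * A periodic timer can call this so that an idle log is still flushed. */
bool sync_policy_due(const struct sync_policy *p, uint64_t now_ms)
{
    assert(p != NULL);
    if (p->pending_writes == 0)
        return false;
    return p->pending_writes >= p->max_writes ||
           now_ms - p->first_pending_ms >= p->max_age_ms;
}

/* Record one append at now_ms; returns true if the caller should sync now. */
bool sync_policy_note_write(struct sync_policy *p, uint64_t now_ms)
{
    assert(p != NULL);
    if (p->pending_writes == 0)
        p->first_pending_ms = now_ms;
    p->pending_writes++;
    return sync_policy_due(p, now_ms);
}

/* Call after the files have actually been synced. */
void sync_policy_note_sync(struct sync_policy *p)
{
    assert(p != NULL);
    p->pending_writes = 0;
}
```

With a policy like this, a crash loses at most N appends or roughly M milliseconds of appends from the indices, which limits how much a recovery tool has to repair, though not how much it has to read to verify everything.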
#2 Updated by Nitesh Mor almost 7 years ago
- Status changed from Resolved to Closed
Doing a sync at the time of closing and a potential recovery operation before/during gdplogd startup seems to be a good enough compromise, given the cost of fsync with each append.
Closing this issue.