KeePass

Internal Design of ABP

Warning: This page is for geeks only. If your social skills are in the normal range, and you are reading this in softcopy, you are advised to click the back button immediately; if you are reading this in hardcopy, drop it and back away slowly.

Aspects: Two aspects of the ABP internal design are described below: the file comparison approach and the use of multithreading. Both are intended to improve performance; neither affects functionality.

Database File Comparisons

Why Compare Files? The password database file is typically not changed very often, compared with the number of times KeePass is invoked. Therefore, at open time, the initial database file is usually already the same as the backup files. In this typical case, no copying is needed. Gratuitous copying is undesirable for at least four reasons:

  1. On some media, copying will update the timestamp of the backup copy to the current time, possibly misleading clients regarding the database's true age.
  2. For many media, writing is noticeably slower than just reading.
  3. In some anomalous cases, the backup file could (temporarily) be read-only but still up to date. A read-only strategy would still succeed, while a write strategy would fail.
  4. If you write, you need to write the whole file; but if you read, you only need to read a small part of the file. (But this is getting ahead of the story. See Comparing Headers Only, below.)

Thus, we are led to reading the backup file, and comparing it with the initial database, which is already in memory. Only in the unlikely event that they compare unequal does the initial database need to be copied.

Comparing Headers Only: It turns out that the files can be compared much more efficiently than by reading the entire backup file and comparing it byte-for-byte with the initial database file. The reason is that KeePass database files begin with a 124-byte header containing a cryptographically secure hash of the rest of the file. If two headers are equal, their hashes are equal, and so (barring a hash collision) the rest of the two files is equal as well. Thus, it is sufficient to read only the first 124 bytes of the backup file and compare the headers alone. This means that the work to compare large password databases is no greater than that for comparing small ones.
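The header comparison described above can be sketched roughly as follows. This is a minimal illustration, not the ABP source: the names kHeaderSize, Header and backupIsCurrent are hypothetical, and the sketch assumes the header of the initial database is already available in memory as 124 raw bytes.

```cpp
#include <array>
#include <cstddef>
#include <fstream>
#include <ios>
#include <string>

// Header length stated in the text above for KeePass database files.
constexpr std::size_t kHeaderSize = 124;
using Header = std::array<char, kHeaderSize>;

// Hypothetical helper: returns true only if the first kHeaderSize bytes of
// the file at backupPath equal currentHeader. A missing, unreadable or
// too-short backup is reported as "not current", so a caller would copy.
bool backupIsCurrent(const std::string& backupPath, const Header& currentHeader)
{
    std::ifstream in(backupPath, std::ios::binary);
    if (!in) {
        return false;  // Backup cannot be opened for reading.
    }

    Header backupHeader{};
    in.read(backupHeader.data(),
            static_cast<std::streamsize>(backupHeader.size()));
    if (in.gcount() != static_cast<std::streamsize>(backupHeader.size())) {
        return false;  // Fewer than kHeaderSize bytes were available.
    }

    // Only the headers are compared; the rest of the file is never read.
    return backupHeader == currentHeader;
}
```

A caller would copy the initial database to the backup path only when backupIsCurrent returns false, so the usual up-to-date case costs one small read and no writes.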

Multithreading

Multithreading Prevents Slowing the GUI: ABP dedicates a worker thread to each backup path. The reason for this design choice is to keep backup I/O latency, which is known to be large in some cases of interest, from delaying the main thread. For example, if Windows® XP power management turns off an unused serial ATA hard disk, it has been found to take a full 10 seconds to power it up and do I/O. Avoiding latencies such as this during KeePass start-up, as well as when saving the database, was an important design goal. Since the main thread does not do backup file I/O, it can remain responsive to end-user actions, even in the presence of high-latency backups.

Boost Thread Library Simplifies Implementation: Multithreaded programming in C++ is well known to be a black art. To reduce the likelihood of bugs, it was decided to use the high-level, open-source Boost thread library, which is part of the Boost project. Herb Sutter and Andrei Alexandrescu, in C++ Coding Standards (page 147), call the Boost project “...one of the most highly regarded and expertly designed C++ library projects in the world.” An added advantage of the Boost thread library over the use of Windows primitives is that it does not limit the portability of ABP.

Windows® Outwits Sophisticated Design: An important irony was found after implementing the multithreaded approach. Although the implementation was demonstrated to work as designed, the objective of reducing latency for power-managed serial ATA hard disks was not achieved. The reason is that when power management turns on a serial ATA hard disk in response to an I/O request from a thread, the Windows® XP operating system seems to suspend not only the requesting thread, but other threads as well, including the KeePass main thread. It is suspected that even threads in other processes are suspended! Only after the serial ATA hard disk is ready for I/O do these threads become dispatchable. It is not known whether this unexpected behavior is logically required, or is simply a “feature” of the Windows® XP operating system.

Multithreaded Design Still Useful: In spite of the disappointment regarding serial ATA hard disks, the multithreaded approach is still believed to serve its intended purpose for other backup configurations, such as backing up to networked computers. With the advent of multiprocessor PCs and other forms of parallelism, backups to multiple physical devices can take place concurrently with the GUI thread and with each other.