Update storage design document and add some TODOs to the code
parent 858954e82d
commit e77d5cad2c
4 changed files with 89 additions and 53 deletions
@@ -28,30 +28,31 @@ There is also the **chunks cache** to cache information about available chunks.

 ## Making a backup

-A backup run is ideally (automatically) triggered when
+A backup run is usually triggered automatically when

 * the device is charging and connected to an un-metered network in case network storage is used
-* a storage medium is plugged in and the user confirmed the run in case removable storage is used
+* a storage medium is plugged in (and the user confirmed the run) in case removable storage is used

-Files to be backed up are listed based on the user's preference
+Files to be backed up are scanned based on the user's preference
 using Android's `MediaProvider` and `ExternalStorageProvider`.
-Tests on real world devices have shown ~200ms list times for `MediaProvider`
+Tests on real world devices have shown ~200ms scan times for `MediaProvider`
 and `~10sec` for *all* of `ExternalStorageProvider`
 (which is unlikely to happen, because the entire storage volume cannot be selected on Android 11).

-All files will be processed with every backup run.
+All files included in backups will be scanned with every backup run.
 If a file is found in the cache, it is checked
-if its content-modification-indicating attributes have not been modified
+if its content-modification-indicating attributes
+(size, lastModified and generation for media files)
+have not been modified
 and all its chunks are still present in the backup storage.
-We might be able to speed up the latter check by initially retrieving a list of all chunks.
+For the latter check, we initially retrieve a list of all chunks available on backup storage.

 For present unchanged files, an entry will be added to the backup snapshot
-and the TTL in the files cache updated.
-If a file is not found in cache an entry will be added for it.
+and the lastSeen timestamp in the files cache updated.
+If a file is not found in the cache, an entry will be added for it.
 New and modified files will be put through a chunker
 which splits up larger files into smaller chunks.
-Initially, the chunker might just return a single chunk,
-the file itself, to simplify the operation.
+Very small files are combined into larger zip chunks for transfer efficiency.

 A chunk is hashed (with a key / MACed),
 then (compressed and) encrypted (with authentication) and written to backup storage,
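As an illustration of the per-file logic described above, here is a minimal Kotlin sketch. The types and callbacks (`FileInfo`, `CacheEntry`, `chunkAndUpload`, `addToSnapshot`) are hypothetical stand-ins, not Seedvault's actual API:

```kotlin
// Sketch only: decide per file whether cached chunks can be reused or the file must be re-chunked and uploaded.
data class FileInfo(val uri: String, val size: Long, val lastModified: Long)
data class CacheEntry(val size: Long, val lastModified: Long, val chunkIds: List<String>, val lastSeen: Long)

fun processFile(
    file: FileInfo,
    filesCache: MutableMap<String, CacheEntry>,      // files cache keyed by URI
    chunksOnStorage: Set<String>,                    // chunk IDs listed once per backup run
    chunkAndUpload: (FileInfo) -> List<String>,      // chunks, MACs, encrypts and uploads; returns chunk IDs
    addToSnapshot: (FileInfo, List<String>) -> Unit, // records file metadata in the backup snapshot
) {
    val now = System.currentTimeMillis()
    val cached = filesCache[file.uri]
    if (cached != null &&
        cached.size == file.size &&
        cached.lastModified == file.lastModified &&
        cached.chunkIds.all { it in chunksOnStorage }
    ) {
        // unchanged and all chunks still present: only metadata goes into the snapshot
        addToSnapshot(file, cached.chunkIds)
        filesCache[file.uri] = cached.copy(lastSeen = now)
    } else {
        // new or modified file: chunk, upload and write a fresh cache entry
        val chunkIds = chunkAndUpload(file)
        addToSnapshot(file, chunkIds)
        filesCache[file.uri] = CacheEntry(file.size, file.lastModified, chunkIds, now)
    }
}
```

Checking chunk presence against the pre-fetched set avoids one storage round-trip per file.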
@@ -68,9 +69,9 @@ and written (encrypted) to storage.

 If the backup fails, a new run is attempted at the next opportunity creating a new backup snapshot.
 Chunks uploaded during the failed run should still be available in backup storage
-and the cache with reference count `0` providing an auto-resume.
+and in the cache with reference count `0`, providing a seamless auto-resume.

-After a successful backup, chunks that still have reference count `0`
+After a *successful* backup run, chunks that still have reference count `0`
 can be deleted from storage and cache without risking to delete chunks that will be needed later.

 ## Removing old backups
@@ -80,11 +81,14 @@ These could be a number in the yearly/monthly/weekly/daily categories.
 However, initially, we might simply auto-prune backups older than a month,
 if there have been at least 3 backups within that month (or some similar scheme).

-After doing a successful backup run, is a good time to prune old backups.
+After a successful backup run is a good time to prune old backups.
 To determine which backups to delete, the backup snapshots need to be downloaded and inspected.
-Maybe their file name can be the `timeStart` timestamp to help with that task.
+Their file name can be derived from their `timeStart` timestamp to help with that task.
 If a backup is selected for deletion, the reference counter of all included chunks is decremented.
-The backup snapshot file and chunks with reference count of `0` are deleted from storage.
+Note that a backup snapshot can reference a single chunk several times.
+The reference counter however refers to the number of snapshots referencing it,
+not the number of files.
+The backup snapshot file and chunks with reference count of `0` are then deleted from storage.

 ## Restoring from backup

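A rough Kotlin sketch of pruning a single snapshot along these lines; the in-memory reference-count map and the delete callbacks are illustrative placeholders, while the real `Pruner` (see the code hunks further below) works against the chunks cache database:

```kotlin
// Sketch only: prune one snapshot selected for deletion.
fun pruneSnapshot(
    timeStart: Long,                        // snapshot file name is derived from this timestamp
    snapshotChunkIds: Set<String>,          // each chunk counted once, however often the snapshot uses it
    refCounts: MutableMap<String, Int>,     // chunk ID -> number of snapshots referencing it
    deleteSnapshotFile: (Long) -> Unit,
    deleteChunkFromStorage: (String) -> Unit,
) {
    for (chunkId in snapshotChunkIds) {
        refCounts[chunkId] = (refCounts[chunkId] ?: 1) - 1
    }
    deleteSnapshotFile(timeStart)
    // chunks no longer referenced by any snapshot can now be removed from storage and cache
    refCounts.filterValues { it <= 0 }.keys.toList().forEach { chunkId ->
        deleteChunkFromStorage(chunkId)
        refCounts.remove(chunkId)
    }
}
```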
@@ -94,28 +98,29 @@ We go through the list of files in the snapshot,
 download, authenticate, decrypt (and decompress) each chunk of the file
 and re-assemble the file this way.
 Once we have the original chunk,
-we might want to re-calculate the chunk ID to check if it is as expected
-to prevent an attacker from swapping chunks.
-This could also be achieved by including the chunk ID
-in the associated data of the authenticated encryption (AEAD).
+we could re-calculate the chunk ID to prevent an attacker from swapping chunks.
+However, we instead include the chunk ID
+in the associated data of the authenticated encryption (AEAD) which should have the same effect.
 The re-assembled file will be placed into the same directory under the same name
 with its attributes (e.g. lastModified) restored as much as possible on Android.

 Restoring to storage that is already in use is not supported.
 However, if a file already exists with the that name and path,
-we check if the file is identical to the one we want to restore
+we could check if the file is identical to the one we want to restore
 (by relying on file metadata or re-computing chunk IDs)
 and move to the next if it is indeed identical.
-If it is not identical, we could rely on Android's Storage Access Framework
-to automatically give it a `(1)` suffix when writing it to disk.
+If it is not identical, we rely on Android's Storage Access Framework
+to automatically give it a `(1)` suffix when writing it to disk or add one manually.
 Normally, restores are expected to happen to a clean file system anyway.

-However, if a restore fails, the above behavior should give us a seamless auto-resume experience.
+However, if a restore fails, the above behavior (not implemented in first iteration)
+should give us a seamless auto-resume experience.
 The user can re-try the restore and it will quickly skip already restored files
 and continue to download the ones that are still missing.

 After all files have been written to a directory,
 we might want to attempt to restore its metadata (and flags?) as well.
+However, restoring directory metadata is not implemented in first iteration.

 # Cryptography

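To illustrate the re-assembly step, a simplified Kotlin sketch; `decryptChunk` stands in for the download-authenticate-decrypt pipeline and is not the actual restore API:

```kotlin
import java.io.OutputStream

// Sketch only: re-assemble one file from its ordered list of chunk IDs.
fun restoreFile(
    chunkIds: List<String>,
    decryptChunk: (chunkId: String) -> ByteArray, // downloads, authenticates (chunk ID is in the AAD) and decrypts
    output: OutputStream,
): Long {
    var written = 0L
    for (chunkId in chunkIds) {
        // authentication fails (throws) if a chunk was renamed or swapped on storage
        val plaintext = decryptChunk(chunkId)
        output.write(plaintext)
        written += plaintext.size
    }
    return written
}
```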
@@ -134,12 +139,12 @@ Seedvault already uses [BIP39](https://github.com/bitcoin/bips/blob/master/bip-0
 to give users a mnemonic recovery code and for generating deterministic keys.
 The derived key has 512 bits
 and Seedvault uses the first 256 bits as an AES key to encrypt app data (out of scope here).
-This key's usage is limited by Android for encryption and decryption.
+Unfortunately, this key's usage is currently limited by Android to encryption and decryption.
 Therefore, the second 256 bits will be imported into Android's keystore for use with HMAC-SHA256,
 so that this key can act as a master key we can deterministically derive additional keys from
 by using HKDF ([RFC5869](https://tools.ietf.org/html/rfc5869)).
 These second 256 bits must not be used for any other purpose in the future.
-We use them for a master key to avoid users having to handle another secret.
+We use them for a master key to avoid users having to handle yet another secret.

 For deriving keys, we are only using the HKDF's second 'expand' step,
 because the Android Keystore does not give us access
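A sketch of the HKDF 'expand' step (RFC 5869) with HMAC-SHA256, limited to a single output block for brevity; the info label in the usage comment is only an example, not the label Seedvault uses:

```kotlin
import javax.crypto.Mac
import javax.crypto.SecretKey

// Sketch only: single-block HKDF-Expand, enough for deriving 256-bit sub-keys from the master key.
fun hkdfExpand(masterKey: SecretKey, info: ByteArray, lengthBytes: Int = 32): ByteArray {
    require(lengthBytes <= 32) { "this sketch only supports a single HMAC-SHA256 block" }
    val mac = Mac.getInstance("HmacSHA256")
    mac.init(masterKey) // also works with a keystore-backed HMAC key on Android
    mac.update(info)
    mac.update(0x01.toByte()) // T(1) = HMAC(PRK, info || 0x01)
    return mac.doFinal().copyOf(lengthBytes)
}

// Example (label is illustrative): a dedicated key for chunk ID calculation
// val chunkKeyBytes = hkdfExpand(masterKey, "Chunk ID calculation".toByteArray())
```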
@@ -162,12 +167,13 @@ due to difficulties of building those as part of AOSP.
 We use a keyed hash instead of a normal hash for calculating the chunk ID
 to not leak the file content via the public hash.
 Using HMAC-SHA256 directly with the master key in Android's key store
-resulted in terrible throughput of around 4 MB/sec.
+resulted in terrible throughput of around 4 MB/sec,
+presumably because file data needs to enter the secure element to get hashed there.
 Java implementations of Blake2s and Blake3 performed better,
 but by far the best performance gave HMAC-SHA256
 with a key we can hold the byte representation for in memory.

-Therefore, we suggest to derive a dedicated key for chunk ID calculation from the master key
+Therefore, we derive a dedicated key for chunk ID calculation from the master key
 and keep it in memory for as long as we need it.
 If an attacker is able to read our memory,
 they have access to the entire device anyway
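A minimal sketch of the keyed chunk ID calculation with an in-memory HMAC-SHA256 key derived as described above; key handling is simplified here:

```kotlin
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

// Sketch only: the chunk ID is the hex-encoded HMAC-SHA256 of the chunk data.
fun chunkId(chunkKeyBytes: ByteArray, chunkData: ByteArray): String {
    val mac = Mac.getInstance("HmacSHA256")
    mac.init(SecretKeySpec(chunkKeyBytes, "HmacSHA256"))
    // the lower-case hex string doubles as the chunk's file name on storage
    return mac.doFinal(chunkData).joinToString("") { "%02x".format(it) }
}
```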
@@ -201,15 +207,15 @@ It adds its own 40 byte header consisting of header length (1 byte), salt and no
 Then it adds one or more segments, each up to 1 MB in size.
 All segments are encrypted with a fresh key that is derived by using HKDF
 on our stream key with another internal random salt (32 bytes) and associated data as info
-([documentation](https://github.com/google/tink/blob/master/docs/WIRE-FORMAT.md#streaming-encryption)).
+([documentation](https://github.com/google/tink/blob/v1.5.0/docs/WIRE-FORMAT.md#streaming-encryption)).

 When writing files/chunks to backup storage,
 the authenticated associated data (AAD) will contain the backup version as the first byte
 (to prevent downgrade attacks)
 followed by a second type byte depending on the type of file written:

-* chunks: `0x00` as type byte and then the chunk ID
-* backup snapshots: `0x01` as type byte and then the backup snapshot timestamp
+* chunks: `0x00` as type byte and then the byte representation of the chunk ID
+* backup snapshots: `0x01` as type byte and then the backup snapshot timestamp as int64 bytes

 The chunk ID and the backup snapshot timestamp get added
 to prevent an attacker from renaming and swapping files/chunks.
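A sketch of how the associated data described above could be assembled; the constants and helpers are illustrative, not Seedvault's actual implementation:

```kotlin
import java.nio.ByteBuffer

// Sketch only: build the AAD for chunks and backup snapshots.
const val VERSION: Byte = 0x00      // backup version, first byte to prevent downgrade attacks
const val TYPE_CHUNK: Byte = 0x00
const val TYPE_SNAPSHOT: Byte = 0x01

fun chunkAd(chunkIdBytes: ByteArray): ByteArray =
    ByteBuffer.allocate(2 + chunkIdBytes.size)
        .put(VERSION)
        .put(TYPE_CHUNK)
        .put(chunkIdBytes)          // byte representation of the chunk ID
        .array()

fun snapshotAd(timeStart: Long): ByteArray =
    ByteBuffer.allocate(2 + Long.SIZE_BYTES)
        .put(VERSION)
        .put(TYPE_SNAPSHOT)
        .putLong(timeStart)         // snapshot timestamp as int64 bytes
        .array()
```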
@@ -221,8 +227,8 @@ to prevent an attacker from renaming and swapping files/chunks.
 ### Files cache

 This cache is needed to quickly look up if a file has changed and if we have all of its chunks.
-It will probably be implemented as a sqlite-based Room database
-which has shown promising performance in early tests.
+It is implemented as a sqlite-based Room database
+which had shown promising performance in early tests.

 Contents:

@@ -232,7 +238,7 @@ Contents:
 * generation modified (MediaStore only) (`Long`)
 * list of chunk IDs representing the file's contents
 * zip index in case this file is inside a single zip chunk (`Integer`)
-* last seen in milliseconds (`Long`)
+* last seen in epoch milliseconds (`Long`)

 If the file's size, last modified timestamp (and generation) is still the same,
 it is considered to not have changed.
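As a sketch, the files cache could be modeled as a Room entity roughly like the following; the column names and Kotlin types are assumptions based on the list above, not the actual schema, and the chunk ID list needs a Room `TypeConverter` in practice:

```kotlin
import androidx.room.Entity
import androidx.room.PrimaryKey

// Sketch only: hypothetical Room entity for the files cache.
@Entity(tableName = "files")
data class CachedFile(
    @PrimaryKey val uri: String,
    val size: Long,
    val lastModified: Long,
    val generationModified: Long?, // MediaStore only
    val chunkIds: List<String>,    // stored via a TypeConverter
    val zipIndex: Int?,            // set when the file lives inside a zip chunk
    val lastSeen: Long,            // epoch milliseconds
)
```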
|
@ -242,41 +248,53 @@ If the file has not changed and all chunks are present,
|
|||
the file is not read/chunked/hashed again.
|
||||
Only file metadata is added to the backup snapshot.
|
||||
|
||||
As the cache grows over time, we need a way to evict files eventually.
|
||||
This could happen by checking a last seen timestamp or by using a TTL counter
|
||||
or maybe even a boolean flag that gets checked after a successful run over all files.
|
||||
A flag might not be ideal if the user adds/removes folder as backup targets.
|
||||
Current preference is for using a last seen timestamp.
|
||||
If a file's URI should ever change, it will be considered as a new file,
|
||||
so read/chunked/hashed again, but if it hasn't otherwise changed,
|
||||
its chunks will not be written to storage again
|
||||
(except for small files that get added to a new zip chunk).
|
||||
|
||||
As the cache grows over time, we need a way to evict files eventually
|
||||
(not implemented in first iteration).
|
||||
This can happen by checking the last seen timestamp
|
||||
and delete all files we haven't seen for some time (maybe a month).
|
||||
|
||||
The files cache is local only and will not be included in the backup.
|
||||
After restoring from backup the cache needs to get repopulated on the next backup run
|
||||
after re-generating the chunks cache.
|
||||
After restoring from backup the cache needs to get repopulated on the next backup run.
|
||||
This will happen automatically, because before each backup run we check cache consistency
|
||||
and repopulate the cache if we find it inconsistent with what we have in backup storage.
|
||||
The URIs of the restored files will most likely differ from the backed up ones.
|
||||
When the `MediaStore` version changes,
|
||||
the chunk IDs of all files will need to get recalculated as well.
|
||||
the chunk IDs of all files will need to get recalculated as well
|
||||
(not implemented in first iteration),
|
||||
because we can't be sure about their new state.
|
||||
|
||||
### Chunks cache
|
||||
|
||||
This is used to determine whether we already have a chunk,
|
||||
to count references to it and also for statistics.
|
||||
|
||||
It could be implemented as a table in the same database as the files cache.
|
||||
It is implemented as a table in the same database as the files cache.
|
||||
|
||||
* chunk ID (hex representation of the chunk's MAC)
|
||||
* reference count
|
||||
* size
|
||||
* backup version byte (currently 0)
|
||||
|
||||
If the reference count of a chunk reaches `0`,
|
||||
we can delete it from storage (after a successful backup)
|
||||
as it isn't used by a backup anymore.
|
||||
we can delete it from storage (after a successful backup run)
|
||||
as it isn't used by a backup snapshot anymore.
|
||||
|
||||
References are only stored in this local chunks cache.
|
||||
If the cache is lost (or not available after restoring),
|
||||
it can be repopulated by inspecting all backup snapshots
|
||||
and setting the reference count to the number of backup snapshots a chunk is referenced from.
|
||||
|
||||
When making a backup run and hitting the files cache,
|
||||
we need to check that all chunks are still available on storage.
|
||||
When making a backup run and hit the files cache,
|
||||
we check that all chunks are still available on storage.
|
||||
|
||||
The backup version number of a chunk is stored, so we can know without downloading the chunk
|
||||
with what backup version it was written.
|
||||
This might be useful when increasing the backup version and changing the chunk format in the future.
|
||||
|
||||
## Remote Files
|
||||
|
||||
|
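A sketch of the chunks cache as a Room entity plus the ref-counting queries it needs; the method names mirror `decrementRefCount` and `getUnreferencedChunks` from the Pruner hunks further below, but the exact schema and types are assumptions:

```kotlin
import androidx.room.Dao
import androidx.room.Entity
import androidx.room.PrimaryKey
import androidx.room.Query

// Sketch only: hypothetical Room entity and DAO for the chunks cache.
@Entity(tableName = "chunks")
data class CachedChunk(
    @PrimaryKey val id: String, // hex representation of the chunk's MAC
    val refCount: Long,         // number of snapshots referencing this chunk
    val size: Long,
    val version: Byte = 0,      // backup version the chunk was written with
)

@Dao
interface ChunksCacheDao {
    @Query("UPDATE chunks SET refCount = refCount - 1 WHERE id = :id")
    fun decrementRefCount(id: String)

    @Query("SELECT * FROM chunks WHERE refCount <= 0")
    fun getUnreferencedChunks(): List<CachedChunk>
}
```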
@@ -325,11 +343,11 @@ The filename is the timeStart timestamp.
 ### Chunks

 The encrypted payload of chunks is just the chunk data itself.
-All chunks are stored in one of 256 sub-folders
+We suggest that file-system based storage plugins store chunks in one of 256 sub-folders
 representing the first byte of the chunk ID encoded as a hex string.
 The file name is the chunk ID encoded as a (lower-case) hex string.
 This is similar to how git stores its repository objects
-and to avoid having to store all chunks in a single directory.
+and to avoid having to store all chunks in a single directory which might not scale.

 ### Zip chunks

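A small sketch of mapping a chunk ID to its location under this layout:

```kotlin
// Sketch only: 256 sub-folders named after the first byte of the chunk ID.
fun chunkPath(chunkId: String): String {
    val folder = chunkId.substring(0, 2).lowercase() // first byte as two hex characters
    return "$folder/${chunkId.lowercase()}"
}

// e.g. a chunk ID starting with "3fa4" is stored as "3f/3fa4..."
```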
@@ -353,11 +371,13 @@ When creating the next backup, if none of the small files have changed,
 we just increase the ref count on the existing chunk.
 If some of them have changed, they will be added to a new zip chunk
 together with other new/changed small files.
+Hanging on to the old file inside the still referenced zip chunk longer than necessary
+should be ok as these files are small.

 When fetching a chunk for restore, we know in advance whether it is a zip chunk,
 because the file we need it for contains the zip index,
 so we will not confuse it with a medium-sized zip file.
-Then we unzip the zip chunk and extract the file by its zip index (name).
+Then we unzip the zip chunk and extract the file by its zip index.

 # Out-of-Scope

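A sketch of extracting a single small file from a decrypted zip chunk by its zip index; counting entries from 1 is an assumption for illustration, not necessarily how the implementation numbers them:

```kotlin
import java.io.ByteArrayInputStream
import java.io.IOException
import java.util.zip.ZipInputStream

// Sketch only: walk the zip chunk's entries and return the one at the wanted index.
fun extractFromZipChunk(zipChunk: ByteArray, zipIndex: Int): ByteArray {
    ZipInputStream(ByteArrayInputStream(zipChunk)).use { zip ->
        var index = 1 // assumed 1-based for this sketch
        while (zip.nextEntry != null) {
            if (index == zipIndex) return zip.readBytes() // reads only the current entry
            index++
        }
    }
    throw IOException("zip index $zipIndex not found in zip chunk")
}
```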
@@ -365,12 +385,12 @@ The following features would be nice to have,
 but are considered out-of-scope of the current design for time and budget reasons.

 * compression (we initially assume that most files are already sufficiently compressed)
-* packing several smaller files into larger combined chunks to improve transfer efficiency
 * using a rolling hash to produce chunks in order to increase likelihood of obtaining same chunks
   even if file contents change slightly or shift
 * external secret-less corruption checks that would use checksums over encrypted data
 * supporting different backup clients backing up to the same storage
+* concealing file sizes (though zip chunks help a bit here)
 * implementing different storage plugins

 # Known issues

@@ -386,6 +406,18 @@ to help with this issue.
 However, files on external storage still don't have anything similar
 and usually also don't trigger `ContentObserver` notifications.

+## Android's Storage Access Framework can be unreliable
+
+Since Seedvault already uses Android's Storage Access Framework (SAF) to store app backups,
+we re-use this storage that the user has already chosen.
+So we can avoid making the user choose two storage locations
+and avoid having to implement another storage backend in the first iteration.
+However, the SAF can be backed by different storage providers which are not equally reliable.
+Also, the API is very limited, doesn't allow for atomic operations
+and doesn't give feedback if file writes completed successfully as they happen asynchronously.
+The best solution will be to not (only) rely on this storage abstraction API,
+but at least offer different storage plugins that can operate more reliably.
+
 # Acknowledgements

 The following individuals have reviewed this document and provided helpful feedback.
@@ -80,7 +80,7 @@ internal class Backup(
         zipChunker = ZipChunker(mac, chunkWriter),
     )

-    @Throws(IOException::class)
+    @Throws(IOException::class, GeneralSecurityException::class)
     @OptIn(ExperimentalTime::class)
     suspend fun runBackup(backupObserver: BackupObserver?) {
         backupObserver?.onStartScanning()
@@ -49,7 +49,7 @@ internal class Pruner(
         backupObserver?.onPruneComplete(duration.toLongMilliseconds())
     }

-    @Throws(IOException::class, SecurityException::class)
+    @Throws(IOException::class, GeneralSecurityException::class)
     private suspend fun pruneSnapshot(timestamp: Long, backupObserver: BackupObserver?) {
         val snapshot = snapshotRetriever.getSnapshot(streamKey, timestamp)
         val chunks = HashSet<String>()
@@ -60,6 +60,8 @@ internal class Pruner(
             chunksCache.decrementRefCount(it)
         }
         var size = 0L
+        // TODO add integration test for a failed backup that later resumes with unreferenced chunks
+        //  and here only deletes those that are still unreferenced afterwards
        val cachedChunksToDelete = chunksCache.getUnreferencedChunks()
        val chunkIdsToDelete = cachedChunksToDelete.map {
            if (it.refCount < 0) Log.w(TAG, "${it.id} has ref count ${it.refCount}")
@@ -38,6 +38,8 @@ internal abstract class AbstractChunkRestore(
         tag: String,
         streamWriter: suspend (outputStream: OutputStream) -> Long,
     ) {
+        // TODO check if the file exists already (same name, size, chunk IDs)
+        //  and skip it in this case
         fileRestore.restoreFile(file, observer, tag, streamWriter)
     }

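To make the intent of the new TODO concrete, a hedged sketch of what such a skip-if-identical check could look like; `RestorableFile` and `FileChecker` are hypothetical and not part of the current code:

```kotlin
// Sketch only: skip restoring a file that already exists with the same name, size and chunk IDs.
data class RestorableFile(val name: String, val size: Long, val chunkIds: List<String>)

interface FileChecker {
    /** Returns true if an identical file (same name, size and chunk IDs) already exists. */
    fun existsUnchanged(file: RestorableFile): Boolean
}

suspend fun restoreOrSkip(
    file: RestorableFile,
    checker: FileChecker,
    restore: suspend (RestorableFile) -> Unit,
) {
    if (checker.existsUnchanged(file)) return // already restored, gives us auto-resume on re-try
    restore(file)
}
```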