Update storage design document and add some TODOs to the code

Torsten Grote 2021-03-04 18:03:46 -03:00 committed by Chirayu Desai
parent 858954e82d
commit e77d5cad2c
4 changed files with 89 additions and 53 deletions

View file

@@ -28,30 +28,31 @@ There is also the **chunks cache** to cache information about available chunks.
## Making a backup
A backup run is usually triggered automatically when
* the device is charging and connected to an un-metered network in case network storage is used
* a storage medium is plugged in (and the user confirmed the run) in case removable storage is used
Files to be backed up are scanned based on the user's preference
using Android's `MediaProvider` and `ExternalStorageProvider`.
Tests on real-world devices have shown ~200ms scan times for `MediaProvider`
and `~10sec` for *all* of `ExternalStorageProvider`
(which is unlikely to happen, because the entire storage volume cannot be selected on Android 11).
All files included in backups will be scanned with every backup run.
If a file is found in the cache, it is checked
if its content-modification-indicating attributes
(size, lastModified and generation for media files)
have not been modified
and all its chunks are still present in the backup storage.
For the latter check, we initially retrieve a list of all chunks available on backup storage.
For present unchanged files, an entry will be added to the backup snapshot
and the lastSeen timestamp in the files cache updated.
If a file is not found in the cache, an entry will be added for it.
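To make the cache-hit path above concrete, here is a minimal Kotlin sketch of the decision; the types, names and parameters are hypothetical, not Seedvault's actual classes:

```kotlin
// Rough sketch of the "can this file be skipped?" decision, under assumed names.
data class CachedFile(
    val size: Long,
    val lastModified: Long?,
    val generation: Long?, // media files only
    val chunkIds: List<String>,
)

fun canSkipChunking(
    cached: CachedFile?,
    size: Long,
    lastModified: Long?,
    generation: Long?,
    chunksOnStorage: Set<String>, // retrieved once at the start of the backup run
): Boolean {
    if (cached == null) return false // unknown file, needs to be chunked and hashed
    val unchanged = cached.size == size &&
        cached.lastModified == lastModified &&
        cached.generation == generation
    // only unchanged files whose chunks are all still on storage can be skipped
    return unchanged && chunksOnStorage.containsAll(cached.chunkIds)
}
```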
New and modified files will be put through a chunker
which splits up larger files into smaller chunks.
Very small files are combined into larger zip chunks for transfer efficiency.
A chunk is hashed (with a key / MACed),
then (compressed and) encrypted (with authentication) and written to backup storage,
@@ -68,9 +69,9 @@ and written (encrypted) to storage.
If the backup fails, a new run is attempted at the next opportunity, creating a new backup snapshot.
Chunks uploaded during the failed run should still be available in backup storage
and in the cache with reference count `0`, providing a seamless auto-resume.
After a *successful* backup run, chunks that still have reference count `0`
can be deleted from storage and cache without the risk of deleting chunks that will be needed later.
## Removing old backups
@@ -80,11 +81,14 @@ These could be a number in the yearly/monthly/weekly/daily categories.
However, initially, we might simply auto-prune backups older than a month,
if there have been at least 3 backups within that month (or some similar scheme).
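One possible reading of that simple rule, sketched in Kotlin; the helper name, the 30-day cutoff and the exact interpretation are assumptions, not the implemented policy:

```kotlin
import java.util.concurrent.TimeUnit

// Sketch: prune snapshots older than a month, but only if at least
// 3 backups exist within the last month, so we never delete everything.
fun snapshotsToPrune(snapshotTimestamps: List<Long>, now: Long): List<Long> {
    val oneMonthAgo = now - TimeUnit.DAYS.toMillis(30)
    val recentBackups = snapshotTimestamps.count { it >= oneMonthAgo }
    if (recentBackups < 3) return emptyList() // keep old backups until enough recent ones exist
    return snapshotTimestamps.filter { it < oneMonthAgo }
}
```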
After a successful backup run is a good time to prune old backups.
To determine which backups to delete, the backup snapshots need to be downloaded and inspected.
Their file name can be derived from their `timeStart` timestamp to help with that task.
If a backup is selected for deletion, the reference counter of all included chunks is decremented.
Note that a backup snapshot can reference a single chunk several times.
The reference counter, however, refers to the number of snapshots referencing it,
not the number of files.
The backup snapshot file and chunks with a reference count of `0` are then deleted from storage.
## Restoring from backup
@@ -94,28 +98,29 @@ We go through the list of files in the snapshot,
download, authenticate, decrypt (and decompress) each chunk of the file
and re-assemble the file this way.
Once we have the original chunk,
we could re-calculate the chunk ID to prevent an attacker from swapping chunks.
However, we instead include the chunk ID
in the associated data of the authenticated encryption (AEAD), which should have the same effect.
The re-assembled file will be placed into the same directory under the same name
with its attributes (e.g. lastModified) restored as much as possible on Android.
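A minimal sketch of the re-assembly step described above; the lambda stands in for download, authentication and decryption, and all names are illustrative:

```kotlin
import java.io.OutputStream

// Sketch only: re-assemble one file by fetching its chunks in order
// and writing the plaintext to the destination stream.
fun reassembleFile(
    chunkIds: List<String>,
    getPlaintextChunk: (chunkId: String) -> ByteArray, // download + authenticate + decrypt
    destination: OutputStream,
): Long {
    var bytesWritten = 0L
    for (chunkId in chunkIds) {
        val chunk = getPlaintextChunk(chunkId)
        destination.write(chunk)
        bytesWritten += chunk.size
    }
    return bytesWritten
}
```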
Restoring to storage that is already in use is not supported.
However, if a file already exists with that name and path,
we could check if the file is identical to the one we want to restore
(by relying on file metadata or re-computing chunk IDs)
and move on to the next one if it is indeed identical.
If it is not identical, we rely on Android's Storage Access Framework
to automatically give it a `(1)` suffix when writing it to disk, or add one manually.
Normally, restores are expected to happen to a clean file system anyway.
However, if a restore fails, the above behavior (not implemented in the first iteration)
should give us a seamless auto-resume experience.
The user can re-try the restore and it will quickly skip already restored files
and continue to download the ones that are still missing.
After all files have been written to a directory,
we might want to attempt to restore its metadata (and flags?) as well.
However, restoring directory metadata is not implemented in the first iteration.
# Cryptography
@@ -134,12 +139,12 @@ Seedvault already uses [BIP39](https://github.com/bitcoin/bips/blob/master/bip-0
to give users a mnemonic recovery code and for generating deterministic keys.
The derived key has 512 bits
and Seedvault uses the first 256 bits as an AES key to encrypt app data (out of scope here).
Unfortunately, this key's usage is currently limited by Android to encryption and decryption.
Therefore, the second 256 bits will be imported into Android's keystore for use with HMAC-SHA256,
so that this key can act as a master key we can deterministically derive additional keys from
by using HKDF ([RFC5869](https://tools.ietf.org/html/rfc5869)).
These second 256 bits must not be used for any other purpose in the future.
We use them as a master key to avoid users having to handle yet another secret.
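For illustration, a single-block HKDF 'expand' with HMAC-SHA256 could look like this; this is a minimal sketch for outputs of at most 32 bytes, and the function name and info strings are assumptions, not Seedvault's actual code:

```kotlin
import javax.crypto.Mac
import javax.crypto.SecretKey

// HKDF 'expand' (RFC 5869) for a single output block, using HMAC-SHA256 with
// the master key held in the Android keystore: T(1) = HMAC(PRK, info || 0x01).
fun deriveKey(masterKey: SecretKey, info: ByteArray, size: Int = 32): ByteArray {
    require(size <= 32) { "single-block expand only" }
    val mac = Mac.getInstance("HmacSHA256")
    mac.init(masterKey)        // the key material never has to leave the keystore
    mac.update(info)           // context string identifying the derived key's purpose
    mac.update(0x01.toByte())  // block counter of the first (and only) block
    return mac.doFinal().copyOf(size)
}
```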
For deriving keys, we only use HKDF's second 'expand' step,
because the Android Keystore does not give us access
@@ -162,12 +167,13 @@ due to difficulties of building those as part of AOSP.
We use a keyed hash instead of a normal hash for calculating the chunk ID
to not leak the file content via the public hash.
Using HMAC-SHA256 directly with the master key in Android's key store
resulted in terrible throughput of around 4 MB/sec,
presumably because file data needs to enter the secure element to get hashed there.
Java implementations of Blake2s and Blake3 performed better,
but by far the best performance was achieved with HMAC-SHA256
using a key whose byte representation we can hold in memory.
Therefore, we derive a dedicated key for chunk ID calculation from the master key
and keep it in memory for as long as we need it.
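In code, that calculation amounts to roughly the following; the helper name is illustrative, and the key bytes are the derived chunk ID key held in memory:

```kotlin
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

// Sketch only: the chunk ID is the hex encoding of HMAC-SHA256 over the plaintext
// chunk, keyed with the derived key held in memory (no keystore round-trip per chunk).
fun chunkId(chunkIdKey: ByteArray, plaintextChunk: ByteArray): String {
    val mac = Mac.getInstance("HmacSHA256")
    mac.init(SecretKeySpec(chunkIdKey, "HmacSHA256"))
    return mac.doFinal(plaintextChunk).joinToString("") { "%02x".format(it) }
}
```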
If an attacker is able to read our memory,
they have access to the entire device anyway
@@ -201,15 +207,15 @@ It adds its own 40 byte header consisting of header length (1 byte), salt and no
Then it adds one or more segments, each up to 1 MB in size.
All segments are encrypted with a fresh key that is derived by using HKDF
on our stream key with another internal random salt (32 bytes) and associated data as info
([documentation](https://github.com/google/tink/blob/v1.5.0/docs/WIRE-FORMAT.md#streaming-encryption)).
When writing files/chunks to backup storage,
the authenticated associated data (AAD) will contain the backup version as the first byte
(to prevent downgrade attacks)
followed by a second type byte depending on the type of file written:
* chunks: `0x00` as type byte and then the byte representation of the chunk ID
* backup snapshots: `0x01` as type byte and then the backup snapshot timestamp as int64 bytes
The chunk ID and the backup snapshot timestamp get added
to prevent an attacker from renaming and swapping files/chunks.
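As an illustration, the AAD layout described above could be assembled like this; the version constant and helper names are assumptions, not the exact Seedvault code:

```kotlin
import java.nio.ByteBuffer

const val VERSION: Byte = 0x00 // assumed current backup version

// AAD for a chunk: version byte, type byte 0x00, then the chunk ID bytes.
fun chunkAssociatedData(chunkIdBytes: ByteArray): ByteArray =
    byteArrayOf(VERSION, 0x00) + chunkIdBytes

// AAD for a backup snapshot: version byte, type byte 0x01, then the timestamp as int64.
fun snapshotAssociatedData(timeStart: Long): ByteArray =
    ByteBuffer.allocate(2 + Long.SIZE_BYTES)
        .put(VERSION)
        .put(0x01.toByte())
        .putLong(timeStart)
        .array()
```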
@@ -221,8 +227,8 @@ to prevent an attacker from renaming and swapping files/chunks.
### Files cache
This cache is needed to quickly look up if a file has changed and if we have all of its chunks.
It is implemented as a sqlite-based Room database
which had shown promising performance in early tests.
Contents:
@@ -232,7 +238,7 @@ Contents:
* generation modified (MediaStore only) (`Long`)
* list of chunk IDs representing the file's contents
* zip index in case this file is inside a single zip chunk (`Integer`)
* last seen in epoch milliseconds (`Long`)
If the file's size, last modified timestamp (and generation) are still the same,
it is considered not to have changed.
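A possible shape of such a Room entity, sketched from the fields listed above; field names, types and the columns not visible in this diff hunk are assumptions, not the real schema:

```kotlin
import androidx.room.Entity
import androidx.room.PrimaryKey

// Illustrative sketch of a files cache entry.
@Entity(tableName = "files")
data class CachedFileEntity(
    @PrimaryKey val uri: String,
    val size: Long,
    val lastModified: Long?,
    val generationModified: Long?, // MediaStore only
    val chunkIds: String,          // chunk IDs joined into one string; a real schema might use a TypeConverter for a list
    val zipIndex: Int?,            // set if the file lives inside a single zip chunk
    val lastSeen: Long,            // epoch milliseconds
)
```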
@@ -242,41 +248,53 @@ If the file has not changed and all chunks are present,
the file is not read/chunked/hashed again.
Only file metadata is added to the backup snapshot.
If a file's URI should ever change, it will be considered a new file,
and thus read/chunked/hashed again; but if it hasn't otherwise changed,
its chunks will not be written to storage again
(except for small files that get added to a new zip chunk).
As the cache grows over time, we need a way to evict files eventually
(not implemented in the first iteration).
This can happen by checking the last seen timestamp
and deleting all files we haven't seen for some time (maybe a month).
The files cache is local only and will not be included in the backup.
After restoring from backup, the cache needs to get repopulated on the next backup run.
This will happen automatically, because before each backup run we check cache consistency
and repopulate the cache if we find it inconsistent with what we have in backup storage.
The URIs of the restored files will most likely differ from the backed-up ones.
When the `MediaStore` version changes,
the chunk IDs of all files will need to get recalculated as well
(not implemented in the first iteration),
because we can't be sure about their new state.
### Chunks cache
This is used to determine whether we already have a chunk,
to count references to it and also for statistics.
It is implemented as a table in the same database as the files cache.
* chunk ID (hex representation of the chunk's MAC)
* reference count
* size
* backup version byte (currently 0)
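Mirroring those fields, the chunks table could look roughly like this; again only a sketch with assumed names and types:

```kotlin
import androidx.room.Entity
import androidx.room.PrimaryKey

// Illustrative sketch of a chunks cache entry.
@Entity(tableName = "chunks")
data class CachedChunkEntity(
    @PrimaryKey val id: String, // hex representation of the chunk's MAC
    val refCount: Long,         // number of snapshots referencing this chunk
    val size: Long,
    val version: Byte = 0,      // backup version the chunk was written with
)
```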
If the reference count of a chunk reaches `0`,
we can delete it from storage (after a successful backup run)
as it isn't used by a backup snapshot anymore.
References are only stored in this local chunks cache.
If the cache is lost (or not available after restoring),
it can be repopulated by inspecting all backup snapshots
and setting the reference count to the number of backup snapshots a chunk is referenced from.
When making a backup run and hitting the files cache,
we check that all chunks are still available on storage.
The backup version number of a chunk is stored, so we know, without downloading the chunk,
which backup version it was written with.
This might be useful when increasing the backup version and changing the chunk format in the future.
## Remote Files
@@ -325,11 +343,11 @@ The filename is the timeStart timestamp.
### Chunks
The encrypted payload of chunks is just the chunk data itself.
We suggest that file-system based storage plugins store chunks in one of 256 sub-folders
representing the first byte of the chunk ID encoded as a hex string.
The file name is the chunk ID encoded as a (lower-case) hex string.
This is similar to how git stores its repository objects
and avoids having to store all chunks in a single directory, which might not scale.
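For concreteness, the naming scheme boils down to something like this (the helper name is illustrative):

```kotlin
// Sketch of the chunk layout described above: a folder named after the first byte
// of the chunk ID (two hex characters), containing the chunk file named by the full ID.
fun chunkRelativePath(chunkId: String): String {
    val folder = chunkId.substring(0, 2).lowercase()
    return "$folder/${chunkId.lowercase()}"
}
```

For example, a chunk whose ID starts with `40d1` would end up at `40/40d1...`.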
### Zip chunks
@@ -353,11 +371,13 @@ When creating the next backup, if none of the small files have changed,
we just increase the ref count on the existing chunk.
If some of them have changed, they will be added to a new zip chunk
together with other new/changed small files.
Hanging on to the old file inside the still-referenced zip chunk longer than necessary
should be OK, as these files are small.
When fetching a chunk for restore, we know in advance whether it is a zip chunk,
because the file we need it for contains the zip index,
so we will not confuse it with a medium-sized zip file.
Then we unzip the zip chunk and extract the file by its zip index.
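A sketch of that extraction step, assuming entries inside a zip chunk are named after their zip index (that naming convention is an assumption, not a confirmed detail):

```kotlin
import java.io.InputStream
import java.util.zip.ZipInputStream

// Sketch only: pull one small file out of an already decrypted zip chunk by its zip index.
fun extractFromZipChunk(decryptedZipChunk: InputStream, zipIndex: Int): ByteArray {
    ZipInputStream(decryptedZipChunk).use { zip ->
        var entry = zip.nextEntry
        while (entry != null) {
            if (entry.name == zipIndex.toString()) return zip.readBytes()
            entry = zip.nextEntry
        }
    }
    throw IllegalStateException("zip index $zipIndex not found in zip chunk")
}
```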
# Out-of-Scope
@@ -365,12 +385,12 @@ The following features would be nice to have,
but are considered out of scope of the current design for time and budget reasons.
* compression (we initially assume that most files are already sufficiently compressed)
* using a rolling hash to produce chunks in order to increase likelihood of obtaining same chunks
  even if file contents change slightly or shift
* external secret-less corruption checks that would use checksums over encrypted data
* supporting different backup clients backing up to the same storage
* concealing file sizes (though zip chunks help a bit here)
* implementing different storage plugins
# Known issues
@@ -386,6 +406,18 @@ to help with this issue.
However, files on external storage still don't have anything similar
and usually also don't trigger `ContentObserver` notifications.
## Android's Storage Access Framework can be unreliable
Since Seedvault already uses Android's Storage Access Framework (SAF) to store app backups,
we re-use this storage that the user has already chosen.
This avoids making the user choose two storage locations
and having to implement another storage backend in the first iteration.
However, the SAF can be backed by different storage providers, which are not equally reliable.
Also, the API is very limited, doesn't allow for atomic operations
and doesn't give feedback on whether file writes completed successfully, as they happen asynchronously.
The best solution will be to not (only) rely on this storage abstraction API,
but to at least offer different storage plugins that can operate more reliably.
# Acknowledgements
The following individuals have reviewed this document and provided helpful feedback.

View file

@@ -80,7 +80,7 @@ internal class Backup(
        zipChunker = ZipChunker(mac, chunkWriter),
    )

    @Throws(IOException::class, GeneralSecurityException::class)
    @OptIn(ExperimentalTime::class)
    suspend fun runBackup(backupObserver: BackupObserver?) {
        backupObserver?.onStartScanning()

View file

@@ -49,7 +49,7 @@ internal class Pruner(
        backupObserver?.onPruneComplete(duration.toLongMilliseconds())
    }

    @Throws(IOException::class, GeneralSecurityException::class)
    private suspend fun pruneSnapshot(timestamp: Long, backupObserver: BackupObserver?) {
        val snapshot = snapshotRetriever.getSnapshot(streamKey, timestamp)
        val chunks = HashSet<String>()
@@ -60,6 +60,8 @@
            chunksCache.decrementRefCount(it)
        }
        var size = 0L
        // TODO add integration test for a failed backup that later resumes with unreferenced chunks
        //  and here only deletes those that are still unreferenced afterwards
        val cachedChunksToDelete = chunksCache.getUnreferencedChunks()
        val chunkIdsToDelete = cachedChunksToDelete.map {
            if (it.refCount < 0) Log.w(TAG, "${it.id} has ref count ${it.refCount}")

View file

@@ -38,6 +38,8 @@ internal abstract class AbstractChunkRestore(
        tag: String,
        streamWriter: suspend (outputStream: OutputStream) -> Long,
    ) {
        // TODO check if the file exists already (same name, size, chunk IDs)
        //  and skip it in this case
        fileRestore.restoreFile(file, observer, tag, streamWriter)
} }