Remove fuse

Ben Johnson
2020-12-17 15:15:01 -07:00
parent bbcdb30cb3
commit b00095ccf5
13 changed files with 187 additions and 1116 deletions


@@ -1,38 +1,76 @@
Litestream Design
=================
DESIGN
======
Litestream provides a file system layer to intercept writes to a SQLite database
to construct a persistent write-ahead log that can be replicated.
Litestream is a sidecar process that replicates the write-ahead log (WAL) for
a SQLite database. To ensure that it can replicate every page, litestream takes
control over the checkpointing process by issuing a long-running read
transaction against the database to prevent checkpointing. It then releases
this transaction once it obtains a write lock and issues the checkpoint itself.
The daemon polls the database on an interval to briefly obtain a write
transaction lock and copy over new WAL pages. Once the WAL has reached a
threshold size, litestream will issue a checkpoint and a single page write
to a table called `_litestream` to start the new WAL.
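The threshold check described above can be sketched as follows. This is a minimal illustration, not litestream's implementation: the helper names `WALSize` and `ShouldCheckpoint` and the 4 MiB threshold are assumptions, and the checkpoint and the `_litestream` write themselves are left out because they require a SQLite driver.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// WALSize returns the size in bytes of the "-wal" file that sits next to
// dbPath. A missing WAL file is reported as size 0 rather than an error.
// (Hypothetical helper, not part of litestream.)
func WALSize(dbPath string) (int64, error) {
	fi, err := os.Stat(dbPath + "-wal")
	if errors.Is(err, fs.ErrNotExist) {
		return 0, nil
	} else if err != nil {
		return 0, err
	}
	return fi.Size(), nil
}

// ShouldCheckpoint reports whether the WAL has reached the threshold size.
func ShouldCheckpoint(walSize, threshold int64) bool {
	return walSize >= threshold
}

func main() {
	const threshold = 4 << 20 // assumed 4 MiB threshold, for illustration only

	size, err := WALSize("db")
	if err != nil {
		fmt.Println("stat error:", err)
		return
	}
	fmt.Println("checkpoint needed:", ShouldCheckpoint(size, threshold))
}
```

A polling loop would call these on each interval and only take the write lock for a checkpoint when `ShouldCheckpoint` returns true.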
## Workflow
When litestream first loads a database, it checks if there is an existing
sidecar directory which is named `.<DB>-litestream`. If not, it initializes
the directory and starts a new generation.
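As an illustration of the `.<DB>-litestream` naming rule, the sidecar directory can be derived from the database path like this. The function name `MetaPath` is hypothetical and the sketch assumes Unix-style paths.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// MetaPath returns the sidecar directory for a database file by prefixing
// the file name with "." and appending "-litestream".
// (Hypothetical helper name; only the naming rule comes from the text.)
func MetaPath(dbPath string) string {
	dir, file := filepath.Split(dbPath)
	return filepath.Join(dir, "."+file+"-litestream")
}

func main() {
	fmt.Println(MetaPath("/var/lib/app/db"))
}
```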
A generation is a snapshot of the database followed by a continuous stream of
WAL files. A new generation is started on initialization and whenever litestream
cannot verify that it has a continuous record of WAL files. This could happen
if litestream is stopped and another process checkpoints the WAL. In this case,
a new generation ID is randomly created and a snapshot is replicated to the
appropriate destinations.
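A random generation ID could be produced along these lines. The 8-character hexadecimal form is an assumption based on the `xxxxxxxx` placeholder used in the file layouts, and `NewGenerationID` is a hypothetical name.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// NewGenerationID returns a random identifier for a new generation.
// Assumption: 4 random bytes, hex-encoded to 8 characters.
func NewGenerationID() (string, error) {
	b := make([]byte, 4)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

func main() {
	id, err := NewGenerationID()
	if err != nil {
		fmt.Println("cannot generate ID:", err)
		return
	}
	fmt.Println("new generation:", id)
}
```

Because the ID is random rather than sequential, two servers that start generations independently are unlikely to pick the same directory name.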
Generations also prevent two servers from replicating to the same destination
and corrupting each other's data. In this case, each server would replicate
to a different generation directory. On recovery, there will be duplicate
databases, and the end user can choose which generation to recover, but each
database will be uncorrupted.
## File Layout
Litestream maintains a shadow WAL which is a historical record of all previous
WAL files. These files can be deleted after a time or size threshold but should
be replicated before being deleted.
### Local
Given a database file named `db`, SQLite will create a WAL file called `db-wal`.
Litestream will then create a hidden directory called `.db-litestream` that
contains the historical record of all WAL files for the current generation.
```
dir/
db # SQLite database
db-wal # SQLite WAL
db.litestream # per-db configuration
.db-litestream/
log # recent event log
stat # per-db Prometheus statistics
generation # current generation number
wal/ # each WAL file contains pages in flush interval
active.wal # active WAL file exists until flush; renamed
000000000000001.wal.gz # flushed, compressed WAL files
000000000000002.wal.gz
db # SQLite database
db-wal # SQLite WAL
.db-litestream/
generation # current generation number
generations/
xxxxxxxx/
wal/ # WAL files
000000000000001.wal
000000000000002.wal
000000000000003.wal # active WAL
```
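The per-generation WAL paths in the local layout can be built as in the sketch below. It assumes the 15-digit file name is a zero-padded decimal index; the function name `WALPath` is hypothetical.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// WALPath returns the path of a shadow WAL file inside the sidecar
// directory: <metaDir>/generations/<generation>/wal/<index>.wal
// Assumption: the index is decimal, zero-padded to 15 digits.
func WALPath(metaDir, generation string, index int) string {
	name := fmt.Sprintf("%015d.wal", index)
	return filepath.Join(metaDir, "generations", generation, "wal", name)
}

func main() {
	fmt.Println(WALPath(".db-litestream", "xxxxxxxx", 3))
}
```

Zero-padding keeps the files in index order when a directory listing is sorted lexicographically.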
### Remote (S3)
```
bkt/
db/ # database path
00000001/ # snapshot directory
snapshot # full db snapshot
000000000000001.wal.gz # compressed WAL file
000000000000002.wal.gz
db/ # database path
generations/
xxxxxxxx/
snapshots/ # snapshots w/ timestamp+offset
20000101T000000Z-000000000000023.snapshot
wal/ # compressed WAL files
000000000000001-0.wal.gz
000000000000001-<offset>.wal.gz
000000000000002-0.wal.gz
00000002/
snapshot/
000000000000000.snapshot
@@ -48,53 +86,3 @@ bkt/
```
## Process
### File System Startup
File system startup:
1. Load litestream.config file.
2. Load all per-db ".litestream" files.
### DB startup:
```
IF "db" NOT EXISTS {
ensureWALRemovedIfDBNotExist()
restore()
setDBStatus("ok")
return
}
IF "-wal" EXISTS {
syncToShadowWAL()
IF err {
setDBStatus("error")
} ELSE {
setDBStatus("ok")
}
} ELSE {
ensureShadowWALMatchesDB() // check last page written to DB
IF err {
setDBStatus("error")
} ELSE {
setDBStatus("ok")
}
}
```
### DB Recovery
TODO
### WAL Write
1. Write to regular WAL
2. On fsync to regular WAL, copy WAL to shadow WAL.
2a. On copy error, mark errored & begin recovery