I recently noticed that the cost for ListBucket calls was increasing for an
application that was using Litestream. After investigating it seemed that the
bucket had retained the entire history of data, while Litestream was
continually logging that it was deleting the same data:
```
2022-10-30T12:00:27Z (s3): wal segmented deleted before 0792d3393bf79ced/00000233: n=1428
<snip>
2022-10-30T13:00:24Z (s3): wal segmented deleted before 0792d3393bf79ced/00000233: n=1428
```
This is occuring because the DeleteObjects call is a batch item, that returns
the individual object deletion errors in the response[1]. The S3 replica client
discards the response, and only handles errors in the original API call. I had
a misconfigured IAM policy that meant all deletes were failing, but this never
actually bubbled up as a real error.
To fix this, I added a check for the response body to handle any errors the
operation might have encountered. Because this may include a large number of
errors (in this case 1428 of them), the output is summarized to avoid an overly
large error message. When items are not found, they will not return an error[2]
- they will still be marked as deleted, so this change should be in-line with
the original intentions of this code.
1: https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html#API_DeleteObjects_Example_2
2: https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html
Previously, Litestream would avoid closing the SQLite3 connection
in order to ensure that the WAL file was not cleaned up by the
database if it was the last connection. This commit changes the
behavior by introducing a file control call to perform the same
action. This allows us to close the database file normally in all
cases.
Previously, a bug was introduced that added a `LIST` operation
on every replica sync which significantly increased the cost of
running Litestream against S3. This changes the behavior to only
issue the `LIST` operation when the generation has changed.
Checksum mismatch can regularly occur now that write locks have
been removed during WAL sync. This does not pose any corruption
risk but does sound scary to end users. Moving this to trace
logging instead.
This commit adds the ability to run a subcommand through Litestream.
Shutting down the subcommand will cause Litestream to gracefully
shutdown. Litestream will forward interrupt signals and wait for
the subprocess to shutdown.
This commit adds a flag to `litestream restore` to skip the restore
if the database file already exists. It is useful when using the
official Litestream Docker image and you don't have the ability to
add a script to check for the existence of the file.
Previously, S3 would default to sync every 10s when using a config
file but only every 1s when using the replica URL on the command
line. Obviously, this is confusing.
A sync interval of 1s could incur a cost of $1.30/month, however,
in practice applications are not receiving a constant stream of
writes so the cost is typically only a few pennies. This lower
sync interval will provide a smaller window for data loss in the
event of a catastrophic failure at a neglible cost.