Upgrade Azure Storage SDK to "track 2" #459

Open: wants to merge 100 commits into main

ItalyPaleAle

This application currently depends on an older version of the Azure Storage SDK, internally called "track 1", which is going to be deprecated in the not-too-distant future.

This PR updates litestream to use the newer Azure Storage SDK ("track 2").

Once this is merged, it should be relatively easy to add support for authentication with Azure AD, including service principals and managed identities (MSI) (#200).

Important: the new SDK requires Go 1.18 or later, so I've updated the version of Go used in this package to Go 1.19 (Go 1.20 was released yesterday, but it's not widely available yet).

CI tests are passing locally.

benbjohnson and others added 30 commits June 14, 2021 15:24
This commit changes the replica path format to group segments within
a single index in the same directory. This is to eventually add the
ability to seek to a record on file-based systems without having
to iterate over the records. The DB shadow WAL will also be changed
to this same format to support live replicas.
Per the godoc on Replica.Restore and RestoreOptions.OutputPath,
Replica.db.path should be used when RestoreOptions.OutputPath is empty.

Fixes benbjohnson#233
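
The fallback described in this commit message can be sketched as follows. This is a minimal illustration, not the code from the commit: the `RestoreOptions` struct here is a local stand-in with a single field, and `resolveOutputPath` is a hypothetical helper name.

```go
package main

import "fmt"

// RestoreOptions is a local stand-in for illustration only; it models just
// the OutputPath field mentioned in the commit message.
type RestoreOptions struct {
	OutputPath string
}

// resolveOutputPath (hypothetical helper) returns opt.OutputPath when it is
// set, and falls back to the database's own path when it is empty.
func resolveOutputPath(dbPath string, opt RestoreOptions) string {
	if opt.OutputPath != "" {
		return opt.OutputPath
	}
	return dbPath
}

func main() {
	// No explicit output path: restore over the database path.
	fmt.Println(resolveOutputPath("/var/lib/app.db", RestoreOptions{}))
	// Explicit output path: use it as given.
	fmt.Println(resolveOutputPath("/var/lib/app.db", RestoreOptions{OutputPath: "/tmp/restored.db"}))
}
```
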
This commit fixes an issue where a reference was taken to the
loop variable rather than the slice element when computing
the minimum snapshot within a generation, which could cause
the wrong snapshot to be chosen.
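
For readers unfamiliar with this class of bug, here is a minimal sketch of the safe pattern. The `Snapshot` type and `minSnapshot` function are hypothetical stand-ins, not the code touched by the commit.

```go
package main

import "fmt"

// Snapshot is a hypothetical stand-in type for this sketch.
type Snapshot struct {
	Index int
}

// minSnapshot returns a pointer to the slice element with the lowest Index,
// or nil if the slice is empty.
func minSnapshot(snapshots []Snapshot) *Snapshot {
	var lowest *Snapshot
	for i := range snapshots {
		// Take the address of the slice element itself. Writing
		// `for _, s := range snapshots { lowest = &s }` would instead point
		// at the loop variable, which is a copy of the element (and, before
		// Go 1.22, a single variable reused across all iterations).
		s := &snapshots[i]
		if lowest == nil || s.Index < lowest.Index {
			lowest = s
		}
	}
	return lowest
}

func main() {
	snaps := []Snapshot{{Index: 5}, {Index: 2}, {Index: 9}}
	fmt.Println(minSnapshot(snaps).Index)
}
```

Indexing into the slice keeps the returned pointer tied to the real element, so later reads and writes through it affect the slice rather than a temporary copy.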
By default, the snapshots command seems to output in alphabetical order of hash, which isn't meaningful, as far as I can tell.

This change modifies the order of the command output so that ./litestream snapshots returns snapshots from newest to oldest.
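
A newest-to-oldest ordering like the one described can be sketched with the standard `sort` package. The `SnapshotInfo` type, its fields, and the helper names below are assumptions made for illustration; the actual change may sort differently.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// SnapshotInfo is a hypothetical stand-in holding only what the sort needs.
type SnapshotInfo struct {
	Generation string
	CreatedAt  time.Time
}

// newSnapshotInfo builds a SnapshotInfo from a Unix timestamp in seconds.
func newSnapshotInfo(generation string, unixSec int64) SnapshotInfo {
	return SnapshotInfo{Generation: generation, CreatedAt: time.Unix(unixSec, 0)}
}

// sortNewestFirst orders infos in place by creation time, newest first.
// SliceStable keeps the input order for entries with equal timestamps.
func sortNewestFirst(infos []SnapshotInfo) {
	sort.SliceStable(infos, func(i, j int) bool {
		return infos[i].CreatedAt.After(infos[j].CreatedAt)
	})
}

func main() {
	infos := []SnapshotInfo{
		newSnapshotInfo("a", 100),
		newSnapshotInfo("b", 300),
		newSnapshotInfo("c", 200),
	}
	sortNewestFirst(infos)
	for _, info := range infos {
		fmt.Println(info.Generation, info.CreatedAt.Unix())
	}
}
```
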
This commit refactors out the complexity of downloading ordered WAL
files in parallel to a type called `WALDownloader`. This makes it
easier to test the restore separately from the download.
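
To illustrate the general idea of fetching in parallel while preserving order, here is a simplified sketch. It is not the `WALDownloader` type from the commit: `downloadOrdered` and `fetchFunc` are hypothetical names, and this version waits for all downloads before returning rather than streaming results.

```go
package main

import (
	"fmt"
	"sync"
)

// fetchFunc is a hypothetical callback that downloads one WAL file by index.
type fetchFunc func(index int) ([]byte, error)

// downloadOrdered fetches every index in [lo, hi] using at most parallelism
// concurrent goroutines and returns the results in index order.
func downloadOrdered(lo, hi, parallelism int, fetch fetchFunc) ([][]byte, error) {
	if parallelism < 1 {
		parallelism = 1
	}
	n := hi - lo + 1
	if n <= 0 {
		return nil, nil
	}

	// Each goroutine writes only to its own slot, so no mutex is needed and
	// the output order is fixed regardless of completion order.
	results := make([][]byte, n)
	errs := make([]error, n)

	sem := make(chan struct{}, parallelism) // bounds concurrent fetches
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		sem <- struct{}{}
		go func(i int) {
			defer wg.Done()
			defer func() { <-sem }()
			results[i], errs[i] = fetch(lo + i)
		}(i)
	}
	wg.Wait()

	// Report the first failure in index order.
	for i, err := range errs {
		if err != nil {
			return nil, fmt.Errorf("download wal index %d: %w", lo+i, err)
		}
	}
	return results, nil
}

func main() {
	out, err := downloadOrdered(3, 6, 2, func(index int) ([]byte, error) {
		return []byte(fmt.Sprintf("wal-%d", index)), nil
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, b := range out {
		fmt.Println(string(b))
	}
}
```

Keeping the ordering logic in one function (or type) means it can be tested with a fake `fetch` callback, which matches the testing motivation given in the commit message.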
Bumps [github.com/pierrec/lz4/v4](https://github.com/pierrec/lz4) from 4.1.3 to 4.1.12.
- [Release notes](https://github.com/pierrec/lz4/releases)
- [Commits](pierrec/lz4@v4.1.3...v4.1.12)

---
updated-dependencies:
- dependency-name: github.com/pierrec/lz4/v4
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Bumps [github.com/aws/aws-sdk-go](https://github.com/aws/aws-sdk-go) from 1.27.0 to 1.42.39.
- [Release notes](https://github.com/aws/aws-sdk-go/releases)
- [Changelog](https://github.com/aws/aws-sdk-go/blob/main/CHANGELOG.md)
- [Commits](aws/aws-sdk-go@v1.27.0...v1.42.39)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Bumps [cloud.google.com/go/storage](https://github.com/googleapis/google-cloud-go) from 1.15.0 to 1.18.2.
- [Release notes](https://github.com/googleapis/google-cloud-go/releases)
- [Changelog](https://github.com/googleapis/google-cloud-go/blob/main/CHANGES.md)
- [Commits](googleapis/google-cloud-go@pubsub/v1.15.0...storage/v1.18.2)

---
updated-dependencies:
- dependency-name: cloud.google.com/go/storage
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Bumps [google.golang.org/api](https://github.com/googleapis/google-api-go-client) from 0.45.0 to 0.65.0.
- [Release notes](https://github.com/googleapis/google-api-go-client/releases)
- [Changelog](https://github.com/googleapis/google-api-go-client/blob/main/CHANGES.md)
- [Commits](googleapis/google-api-go-client@v0.45.0...v0.65.0)

---
updated-dependencies:
- dependency-name: google.golang.org/api
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Bumps [github.com/aws/aws-sdk-go](https://github.com/aws/aws-sdk-go) from 1.42.39 to 1.42.40.
- [Release notes](https://github.com/aws/aws-sdk-go/releases)
- [Changelog](https://github.com/aws/aws-sdk-go/blob/main/CHANGELOG.md)
- [Commits](aws/aws-sdk-go@v1.42.39...v1.42.40)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
benbjohnson and others added 27 commits April 3, 2022 11:55
export LITESTREAM_ACCESS_KEY_ID=your_key_id
export LITESTREAM_SECRET_ACCESS_KEY=your_access_key
export LITESTREAM_ENDPOINT=your_endpoint
export LITESTREAM_REGION=your_region
litestream replicate fruits.db s3://mybkt/fruits.db
Signed-off-by: Ryan Russell <[email protected]>
I recently noticed that the cost for ListBucket calls was increasing for an
application that was using Litestream. After investigating it seemed that the
bucket had retained the entire history of data, while Litestream was
continually logging that it was deleting the same data:

```
2022-10-30T12:00:27Z (s3): wal segmented deleted before 0792d3393bf79ced/00000233: n=1428
<snip>
2022-10-30T13:00:24Z (s3): wal segmented deleted before 0792d3393bf79ced/00000233: n=1428
```

This is occurring because the DeleteObjects call is a batch operation that returns
the individual object deletion errors in the response[1]. The S3 replica client
discards the response, and only handles errors in the original API call. I had
a misconfigured IAM policy that meant all deletes were failing, but this never
actually bubbled up as a real error.

To fix this, I added a check for the response body to handle any errors the
operation might have encountered. Because this may include a large number of
errors (in this case 1428 of them), the output is summarized to avoid an overly
large error message. When items are not found, they will not return an error[2];
they will still be marked as deleted, so this change should be in line with
the original intentions of this code.

1: https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html#API_DeleteObjects_Example_2
2: https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html
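
The check described in this commit message can be sketched roughly as below. This is not the code from the commit: `deleteError` is a local stand-in mirroring the key, code, and message fields S3 reports per object in a DeleteObjects response, and `summarizeDeleteErrors` and its message format are hypothetical. Using a local type keeps the sketch free of an AWS SDK dependency.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// deleteError is a local stand-in for one per-object error entry in a
// DeleteObjects response.
type deleteError struct {
	Key, Code, Message string
}

// summarizeDeleteErrors returns nil when the response reported no per-object
// errors. Otherwise it returns a single error that counts failures by error
// code, so that thousands of failures do not produce a huge message.
func summarizeDeleteErrors(errs []deleteError) error {
	if len(errs) == 0 {
		return nil
	}

	counts := make(map[string]int)
	for _, e := range errs {
		counts[e.Code]++
	}

	// Sort the codes so the message is deterministic.
	codes := make([]string, 0, len(counts))
	for code := range counts {
		codes = append(codes, code)
	}
	sort.Strings(codes)

	parts := make([]string, 0, len(codes))
	for _, code := range codes {
		parts = append(parts, fmt.Sprintf("%s=%d", code, counts[code]))
	}
	return fmt.Errorf("failed to delete %d objects (%s)", len(errs), strings.Join(parts, ", "))
}

func main() {
	err := summarizeDeleteErrors([]deleteError{
		{Key: "gen/00000001.wal", Code: "AccessDenied", Message: "Access Denied"},
		{Key: "gen/00000002.wal", Code: "AccessDenied", Message: "Access Denied"},
	})
	fmt.Println(err)
}
```

A caller would run this on the response of a batch delete even when the API call itself returned no error, since a misconfigured IAM policy like the one described above only shows up in the per-object entries.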
Collaborator

hifi commented Dec 25, 2023

Hi,

I know it's been a while but would you still be interested in rebasing this PR? The intention seems good and the old SDK is indeed deprecated by now.

Thanks!
