
Inconsistent Upload 500 error #171

Closed · 2 of 6 tasks
bmc-msft opened this issue Feb 5, 2021 · 8 comments
Labels: bug (Something isn't working)

Comments

bmc-msft commented Feb 5, 2021

Describe the bug
On some of my pipelines, I inconsistently get a 500 error on upload-artifact.

Version

  • V1
  • V2

Environment

  • self-hosted
  • Linux
  • Windows
  • Mac

Screenshots

With the provided path, there will be 1 file uploaded
Error: Unexpected response. Unable to upload chunk to https://pipelines.actions.githubusercontent.com/5VR8pVAJWOch570pQRgQ7EctEDmSm2LFZLvggBFEZdahLnzvWU/_apis/resources/Containers/6955550?itemPath=build-artifacts%2Fproxy%2Fonefuzz-proxy-manager
##### Begin Diagnostic HTTP information #####
Status Code: 500
Status Message: Internal Server Error
Header Information: {
  "cache-control": "no-store,no-cache",
  "pragma": "no-cache",
  "content-length": "254",
  "content-type": "application/json; charset=utf-8",
  "strict-transport-security": "max-age=2592000",
  "x-tfs-processid": "c4ac2f30-d5f3-47e6-ad17-8b0bddb290cc",
  "activityid": "7d535a7f-215d-472e-8894-fc729cefd82f",
  "x-tfs-session": "7d535a7f-215d-472e-8894-fc729cefd82f",
  "x-vss-e2eid": "7d535a7f-215d-472e-8894-fc729cefd82f",
  "x-vss-senderdeploymentid": "13a19993-c6bc-326c-afb4-32c5519f46f0",
  "x-frame-options": "SAMEORIGIN",
  "x-msedge-ref": "Ref A: 38A48C9593734CD082F8D6D1A7BC4991 Ref B: BN3EDGE0619 Ref C: 2021-02-05T01:00:31Z",
  "date": "Fri, 05 Feb 2021 01:00:31 GMT"
}

Run/Repo Url
https://github.com/microsoft/onefuzz/runs/1835434343

How to reproduce
N/A

Additional context
N/A

bmc-msft added the bug label on Feb 5, 2021
@joshuapinter

I'm seeing a similar thing:

With the provided path, there will be 10 files uploaded
A 500 status code has been received, will attempt to retry the upload
Exponential backoff for retry #1. Waiting for 5548.578148597127 milliseconds before continuing the upload at offset 0
An error has been caught http-client index 1, retrying the upload
Error: Client has already been disposed.
    at HttpClient.request (.../_work/_actions/actions/upload-artifact/v2/dist/index.js:5694:19)
    at HttpClient.sendStream (.../_work/_actions/actions/upload-artifact/v2/dist/index.js:5655:21)
    at UploadHttpClient.<anonymous> (.../_work/_actions/actions/upload-artifact/v2/dist/index.js:7104:37)
    at Generator.next (<anonymous>)
    at .../_work/_actions/actions/upload-artifact/v2/dist/index.js:6834:71
    at new Promise (<anonymous>)
    at module.exports.608.__awaiter (.../_work/_actions/actions/upload-artifact/v2/dist/index.js:6830:12)
    at uploadChunkRequest (.../_work/_actions/actions/upload-artifact/v2/dist/index.js:7102:46)
    at UploadHttpClient.<anonymous> (.../_work/_actions/actions/upload-artifact/v2/dist/index.js:7139:38)
    at Generator.next (<anonymous>)
    Exponential backoff for retry #1. Waiting for 5773.927291417539 milliseconds before continuing the upload at offset 0
Finished backoff for retry #1, continuing with upload
A 500 status code has been received, will attempt to retry the upload
Exponential backoff for retry #2. Waiting for 9362.643214894475 milliseconds before continuing the upload at offset 0
Finished backoff for retry #1, continuing with upload
Total file count: 10 ---- Processed file #9 (90.0%)
Finished backoff for retry #2, continuing with upload
A 500 status code has been received, will attempt to retry the upload
Exponential backoff for retry #3. Waiting for 14276.414696905516 milliseconds before continuing the upload at offset 0
Total file count: 10 ---- Processed file #9 (90.0%)
Total file count: 10 ---- Processed file #9 (90.0%)
Finished backoff for retry #3, continuing with upload
A 500 status code has been received, will attempt to retry the upload
Exponential backoff for retry #4. Waiting for 20561.212693027403 milliseconds before continuing the upload at offset 0
Total file count: 10 ---- Processed file #9 (90.0%)
Total file count: 10 ---- Processed file #9 (90.0%)
Finished backoff for retry #4, continuing with upload
A 500 status code has been received, will attempt to retry the upload
Exponential backoff for retry #5. Waiting for 31215.480894221528 milliseconds before continuing the upload at offset 0
Total file count: 10 ---- Processed file #9 (90.0%)
Total file count: 10 ---- Processed file #9 (90.0%)
Total file count: 10 ---- Processed file #9 (90.0%)
Finished backoff for retry #5, continuing with upload
A 500 status code has been received, will attempt to retry the upload
##### Begin Diagnostic HTTP information #####
Status Code: 500
Status Message: Internal Server Error
Header Information: {
  "cache-control": "no-store,no-cache",
  "pragma": "no-cache",
  "content-length": "328",
  "content-type": "application/json; charset=utf-8",
  "strict-transport-security": "max-age=2592000",
  "x-tfs-processid": "...",
  "activityid": "...",
  "x-tfs-session": "...",
  "x-vss-e2eid": "...",
  "x-vss-senderdeploymentid": "...",
  "x-frame-options": "SAMEORIGIN",
  "x-cache": "CONFIG_NOCACHE",
  "x-msedge-ref": "Ref A: ... Ref B: ... Ref C: 2021-08-05T01:33:28Z",
  "date": "Thu, 05 Aug 2021 01:33:28 GMT"
}
###### End Diagnostic HTTP information ######
Retry limit has been reached for chunk at offset 0 to https://pipelines.actions.githubusercontent.com/.../_apis/resources/Containers/...?itemPath=...
Warning: Aborting upload for ... due to failure
Error: aborting artifact upload
Total size of all the files uploaded is 329038 bytes
Finished uploading artifact .... Reported size is 329038 bytes. There were 1 items that failed to upload
Error: An error was encountered when uploading .... There were 1 items that failed to upload.

(Redacted for privacy.)

to-s commented Nov 30, 2021

Got the same today:

Run actions/upload-artifact@v2
  with:
    name: ...
    path: ...
    retention-days: 5
    if-no-files-found: warn
  env:
    pythonLocation: /opt/hostedtoolcache/Python/3.8.12/x64
With the provided path, there will be 1 file uploaded
A 500 status code has been received, will attempt to retry the upload
Exponential backoff for retry #1. Waiting for 5155.795555228613 milliseconds before continuing the upload at offset 0
Finished backoff for retry #1, continuing with upload
A 500 status code has been received, will attempt to retry the upload
Exponential backoff for retry #2. Waiting for 11167.809749650718 milliseconds before continuing the upload at offset 0
Total file count: 1 ---- Processed file #0 (0.0%)
Finished backoff for retry #2, continuing with upload
A 500 status code has been received, will attempt to retry the upload
Exponential backoff for retry #3. Waiting for 16192.940004900629 milliseconds before continuing the upload at offset 0
Total file count: 1 ---- Processed file #0 (0.0%)
Total file count: 1 ---- Processed file #0 (0.0%)
Finished backoff for retry #3, continuing with upload
A 500 status code has been received, will attempt to retry the upload
Exponential backoff for retry #4. Waiting for 21995.75374447502 milliseconds before continuing the upload at offset 0
Total file count: 1 ---- Processed file #0 (0.0%)
Total file count: 1 ---- Processed file #0 (0.0%)
Finished backoff for retry #4, continuing with upload
A 500 status code has been received, will attempt to retry the upload
Exponential backoff for retry #5. Waiting for 26428.731199363385 milliseconds before continuing the upload at offset 0
Total file count: 1 ---- Processed file #0 (0.0%)
Total file count: 1 ---- Processed file #0 (0.0%)
Total file count: 1 ---- Processed file #0 (0.0%)
Finished backoff for retry #5, continuing with upload
A 500 status code has been received, will attempt to retry the upload
##### Begin Diagnostic HTTP information #####
Status Code: 500
Status Message: Internal Server Error
Header Information: {
  "cache-control": "no-store,no-cache",
  "pragma": "no-cache",
  "content-length": "328",
  "content-type": "application/json; charset=utf-8",
  "strict-transport-security": "max-age=2592000",
  "x-tfs-processid": "...",
  "activityid": "...",
  "x-tfs-session": "...",
  "x-vss-e2eid": "...",
  "x-vss-senderdeploymentid": "...",
  "x-frame-options": "SAMEORIGIN",
  "x-cache": "CONFIG_NOCACHE",
  "x-msedge-ref": "Ref A: ... Ref B: ... Ref C: 2021-11-30T13:08:04Z",
  "date": "Tue, 30 Nov 2021 13:08:04 GMT"
}
###### End Diagnostic HTTP information ######
Retry limit has been reached for chunk at offset 0 to https://pipelines.actions.githubusercontent.com/..._apis/resources/Containers/...?itemPath=...
Warning: Aborting upload for ... due to failure
Error: aborting artifact upload
Total size of all the files uploaded is 0 bytes
Finished uploading artifact eni-os-output. Reported size is 0 bytes. There were 1 items that failed to upload
Error: An error was encountered when uploading .... There were 1 items that failed to upload.

On our side, this currently seems to be a very rare issue (<0.1%).

@solvaholic

👋 In case it helps y'all isolate the cause in your case: one way these 50x errors can occur is if your running workflow jobs separately upload to the same artifact name and path.

The risk of that is described in this project's README under "Uploading to the same artifact":

Each artifact behaves as a file share. Uploading to the same artifact multiple times in the same workflow can overwrite and append already uploaded files.
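
For example, a layout like this sketch (job names, build scripts, and artifact names are all made up) has two concurrently-running jobs writing to the same artifact:

  on: push
  jobs:
    build-linux:
      runs-on: ubuntu-latest
      steps:
        - uses: actions/checkout@v2
        - run: ./build.sh > build.log          # hypothetical build script
        - uses: actions/upload-artifact@v2
          with:
            name: logs-${{ github.run_id }}    # same name in every job
            path: build.log
    build-windows:
      runs-on: windows-latest
      steps:
        - uses: actions/checkout@v2
        - run: ./build.ps1 > build.log         # hypothetical build script
        - uses: actions/upload-artifact@v2
          with:
            name: logs-${{ github.run_id }}    # collides with build-linux
            path: build.log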

@joshuapinter

@solvaholic Yoooooo! I think that is our exact issue. We were uploading logs from multiple jobs with an artifact name that used github.run_id, which is shared amongst all of the jobs, so if two or more jobs uploaded artifacts, they uploaded them under the same artifact name. When done non-concurrently this seemed to be just fine (I think), but when done concurrently it was perhaps causing the 50x issues, because the artifacts were becoming corrupt or conflicting when two jobs wrote to the same artifact/temp file.

I'm not 100% sure about this, but your comment made me look at it again and think this could be the issue. We're going to test a solution where we use the per-job id (github.job) instead of github.run_id to scope our log files per job, and see if the issue reoccurs. I'll try to remember to post the results back here.
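
Roughly this shape (a sketch; the real names and paths in our workflow differ):

  - uses: actions/upload-artifact@v2
    with:
      # scope the artifact per job instead of per run, so concurrent
      # jobs never write to the same artifact
      name: logs-${{ github.run_id }}-${{ github.job }}
      path: logs/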

Thanks for commenting this and linking to that warning in the README. 🙏

ncdc commented May 3, 2022

@konradpabjan I just found your comment at #84 (comment). We're getting 500 errors fairly regularly. Up until today, we did have 3 jobs that were all sharing the same artifact name, which we realized was wrong. We have since fixed that bug in our workflow, but we are continuing to encounter 500 errors, such as in https://github.com/kcp-dev/kcp/runs/6264716798?check_suite_focus=true. Any chance you could take a look? Thanks!

juhhov commented May 9, 2022

We are also suffering from these errors. Ping @konradpabjan.

##### Begin Diagnostic HTTP information #####
Status Code: 500
Status Message: Internal Server Error
Header Information: {
"cache-control": "no-store,no-cache",
"pragma": "no-cache",
"content-length": "328",
"content-type": "application/json; charset=utf-8",
"strict-transport-security": "max-age=2592000",
"x-tfs-processid": "4139b173-84e2-4a2b-bee5-a6122834584d",
"activityid": "5bf8c2b6-1ee9-42ab-812c-9f2a9d4f57a5",
"x-tfs-session": "5bf8c2b6-1ee9-42ab-812c-9f2a9d4f57a5",
"x-vss-e2eid": "5bf8c2b6-1ee9-42ab-812c-9f2a9d4f57a5",
"x-vss-senderdeploymentid": "d624195d-30e0-1768-06a5-b10a7879c7db",
"x-frame-options": "SAMEORIGIN",
"x-cache": "CONFIG_NOCACHE",
"x-msedge-ref": "Ref A: F040C85679224F9294956B424D0ED853 Ref B: VIEEDGE2608 Ref C: 2022-05-06T10:56:59Z",
"date": "Fri, 06 May 2022 10:57:00 GMT"
}
###### End Diagnostic HTTP information ######

edit:
Another process had one of the files being uploaded open at the same time. If that is not the root cause, it at least makes the issue occur more frequently. At a minimum, the error message should be improved.
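
As a workaround on our side, a guard step along these lines (file names are made up; assumes lsof is available on the runner) makes sure nothing still has the file open before the upload runs:

  - name: Wait for writers to finish
    run: |
      # block until no process has the log file open anymore
      while lsof logs/output.log >/dev/null 2>&1; do sleep 1; done
  - uses: actions/upload-artifact@v2
    with:
      name: test-logs
      path: logs/output.log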

sushraju added a commit to muxinc/clickhouse-backup that referenced this issue Jun 27, 2022
(The referenced commit is a large squash merge of unrelated clickhouse-backup changes; the parts that touch artifact uploads are:)

* remove upload `tesflows-clickhouse-logs` artifacts to avoid 500 error

* try to reduce upload artifact jobs, look actions/upload-artifact#171 and https://github.com/Altinity/clickhouse-backup/runs/5229552384?check_suite_focus=true
shymega added a commit to shymega/input-leap that referenced this issue Jan 11, 2023
This PR fixes the artifact upload error from Azure (HTTP 503) that was causing the check to fail.

We now upload artifacts in the format `windows-{ver}-Release`. This makes the artifact name *unique*.

The fix is inspired by: actions/upload-artifact#171 (comment)
@konradpabjan (Collaborator)

v4 shipped today: https://github.blog/changelog/2023-12-14-github-actions-artifacts-v4-is-now-generally-available/

I recommend switching over, as this class of issues should no longer happen with that release.
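
For most workflows the switch is just a version bump; note that v4 artifacts are immutable, so each job still needs its own artifact name (a sketch, adjust names and paths to your workflow):

  - uses: actions/upload-artifact@v4
    with:
      # v4 does not allow uploading to the same artifact name twice in a run
      name: logs-${{ github.run_id }}-${{ github.job }}
      path: logs/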

@nicola-lunghi

> v4 has shipped today https://github.blog/changelog/2023-12-14-github-actions-artifacts-v4-is-now-generally-available/
>
> I recommend switching over as these classes of issues should no longer happen with the release

It would be nice to be able to do that, but it's impossible on GitHub Enterprise...
