Single broken entity breaks the whole batch #104

Closed
linki opened this issue Mar 17, 2017 · 11 comments
Labels
  • help wanted: Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
  • lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@linki (Member) commented Mar 17, 2017

If a single DNS record cannot be created, e.g. when a desired DNS name doesn't match the managed zone, the creation will fail and the whole batch will be rolled back. We should tolerate individual records failing.

@linki modified the milestones: v1.0, v0.4 on Apr 3, 2017
@linki added the help wanted label on Apr 10, 2017
@linki modified the milestones: post-v0.3-stabilization, v1.0 on Jun 12, 2017
@joshrosso

Hi @linki

Some folks I work with (or I myself) may have an interest in solving this problem, as it's something we run into.

I'm curious about a few points, if you know offhand:

  • Does this issue exist for all providers, or just AWS?
  • Is our preferred approach still to use batches, but fall back to single records if a batch fails? Or perhaps to reattempt batches with the 'tainted' record removed?

@linki (Member, Author) commented Oct 2, 2018

Hi @joshrosso,

It depends on whether the provider implementation in ExternalDNS uses batch logic (e.g. AWS and Google do, DigitalOcean doesn't, afaik) and then on how an individual failure is handled (e.g. ignore the failure, stop any further processing, roll back everything).

If we can identify the broken record from the error message, we should, as you say, ignore it and optionally taint it so it doesn't get retried in the next iteration.

If we cannot identify the record, we could use a more generic strategy, as you also suggest: stop using batches and apply records individually, ignoring the failing ones, or, a bit more efficiently, split the batches in half recursively and apply them until only the smallest failing ones are left. A sketch of the combined idea follows below.
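
A minimal sketch (not actual ExternalDNS code) of these two fallback ideas, assuming a hypothetical `Record` type and an all-or-nothing `applyBatch` standing in for a provider's batch API:

```go
package main

import (
	"errors"
	"fmt"
)

// Record stands in for a single DNS change; real providers use richer types.
type Record struct {
	Name  string
	Value string
}

// tainted remembers records that failed on their own, so they are not
// retried (and do not break the batch again) in the next iteration.
var tainted = map[string]bool{}

// applyBatch simulates a provider batch call that rejects the whole batch
// if any record in it is invalid.
func applyBatch(records []Record) error {
	for _, r := range records {
		if r.Value == "" {
			return errors.New("batch rejected: invalid record " + r.Name)
		}
	}
	fmt.Printf("applied batch of %d record(s)\n", len(records))
	return nil
}

// applyResilient tries the whole batch first. On failure it splits the batch
// in half recursively; once a failing batch is down to a single record, that
// record is tainted and skipped instead of rolling back everything else.
func applyResilient(records []Record) {
	if len(records) == 0 {
		return
	}
	if err := applyBatch(records); err == nil {
		return
	}
	if len(records) == 1 {
		fmt.Printf("tainting broken record %s\n", records[0].Name)
		tainted[records[0].Name] = true
		return
	}
	mid := len(records) / 2
	applyResilient(records[:mid])
	applyResilient(records[mid:])
}

func main() {
	desired := []Record{
		{Name: "a.example.org", Value: "1.2.3.4"},
		{Name: "broken.example.org", Value: ""}, // simulated bad record
		{Name: "b.example.org", Value: "5.6.7.8"},
	}
	// Drop records already tainted in earlier iterations before batching.
	var batch []Record
	for _, r := range desired {
		if !tainted[r.Name] {
			batch = append(batch, r)
		}
	}
	applyResilient(batch)
}
```

In the worst case the recursive split costs on the order of k·log n batch calls for k broken records out of n, which is still far cheaper than always applying records one by one when failures are rare.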

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label on Apr 25, 2019
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on May 25, 2019
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot (Contributor)

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@linki (Member, Author) commented Jun 27, 2019

/reopen

@k8s-ci-robot reopened this on Jun 27, 2019
@k8s-ci-robot (Contributor)

@linki: Reopened this issue.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@linki (Member, Author) commented Jun 28, 2019

@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot (Contributor)

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
