Single broken entity breaks the whole batch #104

Closed
linki opened this issue Mar 17, 2017 · 11 comments
Labels
  • help wanted: Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
  • lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@linki (Member) commented Mar 17, 2017

If a single DNS record cannot be created, e.g. when a desired DNS name doesn't match the managed zone, the creation will fail and the whole batch will be rolled back. We should tolerate individual records failing.

@linki modified the milestones: v1.0, v0.4 on Apr 3, 2017
@linki added the help wanted label on Apr 10, 2017
@linki modified the milestones: post-v0.3-stabilization, v1.0 on Jun 12, 2017
@joshrosso

Hi @linki

Some folks I work with (or I myself) may have an interest in solving this problem, as it's something we run into.

I'm curious about a few points, if you know offhand:

  • Does this issue exist for all providers, or just AWS?
  • Is our preferred approach still to use batches, but fall back to single records if a batch fails? Or perhaps to reattempt batches with the 'tainted' record removed?

@linki (Member, Author) commented Oct 2, 2018

Hi @joshrosso,

It depends on whether the provider implementation in ExternalDNS uses batch logic (e.g. AWS and Google do, DigitalOcean doesn't, afaik) and then on how an individual failure is handled (e.g. ignore the failure, stop any further processing, roll back everything).

If we can identify the broken record from the error message, we should, as you say, ignore it and optionally taint it so it doesn't get retried in the next iteration.

If we cannot identify the record, we could use a more generic strategy, as you also suggest: stop using batches and apply records individually, ignoring the failing ones, or, a bit more efficiently, split the batches in half recursively and apply them until only the smallest failing ones are left. A sketch of the combined idea follows below.
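
A minimal sketch (not actual ExternalDNS code) of these two fallback ideas, assuming a hypothetical `Record` type and an all-or-nothing `applyBatch` standing in for a provider's batch API:

```go
package main

import (
	"errors"
	"fmt"
)

// Record stands in for a single DNS change; real providers use richer types.
type Record struct {
	Name  string
	Value string
}

// tainted remembers records that failed on their own, so they are not
// retried (and do not break the batch again) in the next iteration.
var tainted = map[string]bool{}

// applyBatch simulates a provider batch call that rejects the whole batch
// if any record in it is invalid.
func applyBatch(records []Record) error {
	for _, r := range records {
		if r.Value == "" {
			return errors.New("batch rejected: invalid record " + r.Name)
		}
	}
	fmt.Printf("applied batch of %d record(s)\n", len(records))
	return nil
}

// applyResilient tries the whole batch first. On failure it splits the batch
// in half recursively; once a failing batch is down to a single record, that
// record is tainted and skipped instead of rolling back everything else.
func applyResilient(records []Record) {
	if len(records) == 0 {
		return
	}
	if err := applyBatch(records); err == nil {
		return
	}
	if len(records) == 1 {
		fmt.Printf("tainting broken record %s\n", records[0].Name)
		tainted[records[0].Name] = true
		return
	}
	mid := len(records) / 2
	applyResilient(records[:mid])
	applyResilient(records[mid:])
}

func main() {
	desired := []Record{
		{Name: "a.example.org", Value: "1.2.3.4"},
		{Name: "broken.example.org", Value: ""}, // simulated bad record
		{Name: "b.example.org", Value: "5.6.7.8"},
	}
	// Drop records already tainted in earlier iterations before batching.
	var batch []Record
	for _, r := range desired {
		if !tainted[r.Name] {
			batch = append(batch, r)
		}
	}
	applyResilient(batch)
}
```

In the worst case the recursive split costs on the order of k·log n batch calls for k broken records out of n, which is still far cheaper than always applying records one by one when failures are rare.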

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label on Apr 25, 2019
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on May 25, 2019
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot (Contributor)

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@linki (Member, Author) commented Jun 27, 2019

/reopen

@k8s-ci-robot reopened this on Jun 27, 2019
@k8s-ci-robot (Contributor)

@linki: Reopened this issue.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@linki (Member, Author) commented Jun 28, 2019

@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot (Contributor)

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
