Skip to content

Fix broken update handling of path-based routes#21877

Merged
openshift-merge-robot merged 1 commit intoopenshift:release-3.11from
ironcladlou:bz-1670072
Jan 29, 2019
Merged

Fix broken update handling of path-based routes#21877
openshift-merge-robot merged 1 commit intoopenshift:release-3.11from
ironcladlou:bz-1670072

Conversation

@ironcladlou
Copy link
Contributor

When an existing route's path is updated to a new value not previously observed,
the new route state is correctly activated, but the previous route is not
replaced in the hostRules internal active list. Instead, the updated route
state (with a different path but duplicate UID) is appended to the active list.
A hostRules active list with more than one route instance sharing a UID and
host is an invalid internal state. The net effect is some unintended confusing
behavior which manifests when updating the route's path, including:

  • Updating from path A->B->A preventing future transitions to B

  • Updating from path A->B->A->B leaves an orphaned claim on the host in the
    namespace, preventing the recreation of the route with either of the previously
    observed paths until the router is restarted (causing the internal state to be
    rebuilt)

Routes updates which are affected by this bug will be rejected with a status
like:

message: route foo already exposes foo-ns.os.example.com and is older
reason: HostAlreadyClaimed

Where foo refers to itself. This indicates the route update was rejected due
to a claim made by the same route being updated.

To fix the problem, ensure that when a path change is detected on the same route
the prior existing state is removed from the index, eliminating the possibility
of keeping an orphaned claim. Then add the new route as usual, which handles
acceptance/rejection normally.

Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1660598

When an existing route's path is updated to a new value not previously observed,
the new route state is correctly activated, but the previous route is not
replaced in the `hostRules` internal active list. Instead, the updated route
state (with a different path but duplicate UID) is appended to the active list.
A `hostRules` active list with more than one route instance sharing a UID and
host is an invalid internal state. The net effect is some unintended confusing
behavior which manifests when updating the route's path, including:

  * Updating from path A->B->A preventing future transitions to B

  * Updating from path A->B->A->B leaves an orphaned claim on the host in the
namespace, preventing the recreation of the route with either of the previously
observed paths until the router is restarted (causing the internal state to be
rebuilt)

Routes updates which are affected by this bug will be rejected with a status
like:

    message: route foo already exposes foo-ns.os.example.com and is older
    reason: HostAlreadyClaimed

Where `foo` refers to itself. This indicates the route update was rejected due
to a claim made by the same route being updated.

To fix the problem, ensure that when a path change is detected on the same route
the prior existing state is removed from the index, eliminating the possibility
of keeping an orphaned claim. Then add the new route as usual, which handles
acceptance/rejection normally.
@openshift-ci-robot openshift-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jan 28, 2019
@ironcladlou
Copy link
Contributor Author

/cc @openshift/sig-network-edge @smarterclayton

@knobunc
Copy link
Contributor

knobunc commented Jan 28, 2019

/lgtm
/approve

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jan 28, 2019
@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ironcladlou, knobunc

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 28, 2019
@ironcladlou
Copy link
Contributor Author

/cherry-pick 3.10

@ironcladlou
Copy link
Contributor Author

/cherry-pick release-3.10

@ironcladlou
Copy link
Contributor Author

/test extended_conformance_install

@ironcladlou
Copy link
Contributor Author

/retest

@ironcladlou
Copy link
Contributor Author

TASK [Run variable sanity checks] **********************************************
task path: /usr/share/ansible/openshift-ansible/playbooks/init/sanity_checks.yml:14
fatal: [localhost]: FAILED! => {
    "generated_timestamp": "2019-01-28 16:16:08.680131", 
    "msg": "last_checked_host: localhost, last_checked_var: openshift_master_identity_providers;Found removed variables: openshift_node_labels is replaced by openshift_node_groups[<item>].labels; "
}
 [WARNING]: Could not create retry file '/usr/share/ansible/openshift-
ansible/playbooks/prerequisites.retry'.         [Errno 13] Permission denied:
u'/usr/share/ansible/openshift-ansible/playbooks/prerequisites.retry'

No idea on this one...

/retest

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@ironcladlou
Copy link
Contributor Author

/retest

1 similar comment
@ironcladlou
Copy link
Contributor Author

/retest

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@ironcladlou
Copy link
Contributor Author

/retest

3 similar comments
@ironcladlou
Copy link
Contributor Author

/retest

@ironcladlou
Copy link
Contributor Author

/retest

@ironcladlou
Copy link
Contributor Author

/retest

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

4 similar comments
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

1 similar comment
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@ironcladlou
Copy link
Contributor Author

CI errors possibly fixed by openshift-eng/aos-cd-jobs#1688

@sdodson
Copy link
Member

sdodson commented Jan 29, 2019

/retest

@openshift-merge-robot openshift-merge-robot merged commit b750162 into openshift:release-3.11 Jan 29, 2019
@stevekuznetsov
Copy link
Contributor

/cherrypick release-3.10

@openshift-cherrypick-robot

@stevekuznetsov: new pull request created: #21901

Details

In response to this:

/cherrypick release-3.10

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. sig/network-edge size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants