Skip to content

[common] Fix integer overflow error in JitteredBackOffStrategy found by fuzzer.#10417

Merged
mattklein123 merged 3 commits intoenvoyproxy:masterfrom
antoniovicente:backoff_overflow
Mar 18, 2020
Merged

[common] Fix integer overflow error in JitteredBackOffStrategy found by fuzzer.#10417
mattklein123 merged 3 commits intoenvoyproxy:masterfrom
antoniovicente:backoff_overflow

Conversation

@antoniovicente
Copy link
Contributor

Description: Fix integer overflow in JitteredBackOffStrategy::nextBackOffMs found by fuzzing. After 32 retries, multiplier becomes negative. Of course, this is unlikely to ever happen in production, since it would take about 100 days before we'ld hit 32 retries with base_interval of 1ms and max_interval of uint64_t max.
Risk Level: n/a
Testing: unit
Docs Changes: n/a
Release Notes: n/a

Signed-off-by: Antonio Vicente <avd@google.com>
@mattklein123 mattklein123 self-assigned this Mar 17, 2020
…ng the exponential backoff. The multiplers used to be 2**N-1 while now the multiplers are 2**(N-1)

Signed-off-by: Antonio Vicente <avd@google.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix. 1 question.

/wait-any

retry_timer_ = new Event::MockTimer(&dispatcher_);
EXPECT_CALL(*retry_timer_, enableTimer(std::chrono::milliseconds(24), _));
EXPECT_CALL(random_, random()).WillOnce(Return(190));
retry_timer_ = new testing::StrictMock<Event::MockTimer>(&dispatcher_);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI we have StrictMock as default in our test runs, so this shouldn't be needed AFAIK

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, this was left over from earlier attempts to debug the test failures.

EXPECT_CALL(random_, random()).WillOnce(Return(49));
retry_timer_ = new Event::MockTimer(&dispatcher_);
EXPECT_CALL(*retry_timer_, enableTimer(std::chrono::milliseconds(24), _));
EXPECT_CALL(random_, random()).WillOnce(Return(190));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

qq: why does this test need to change as well as the HDS test? It seems like this fix shouldn't have any behavior change in the normal case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is related to this part of the original implementation that traces back to the first revision of Envoy in github:
uint32_t multiplier = (1 << current_retry_) - 1;

This is not increasing backoff by a factor of 2 each time. I think that what was really intended was:
uint32_t multiplier = 1 << (current_retry_ - 1);

Current backoff sequence: 1, 3, 7, 15...
New backoff sequence: 1, 2, 4, 8, 16...
The biggest difference is the jump from increase from 1 to 3x instead of 2x at the first step. Replicating the current behavior is possible, but the code would be slightly more complicated.

Some discussion about it here:
#3791 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah OK got it, makes sense, thanks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course, thanks for digging deeper on unexplained code changes.

Signed-off-by: Antonio Vicente <avd@google.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@mattklein123 mattklein123 merged commit 6488977 into envoyproxy:master Mar 18, 2020
@antoniovicente antoniovicente deleted the backoff_overflow branch March 18, 2020 20:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants