etcdserver: don't activate alarm w/missing AlarmType #13084
@@ -366,6 +366,21 @@ func testV3CurlResignMissiongLeaderKey(cx ctlCtx) {
}
}

func TestV3CurlMaintenanceAlarmMissiongAlarm(t *testing.T) {
"Missiong" is how it's spelled in all the other tests in this file, so I decided to be consistent rather than correct 🤷
Codecov Report
@@ Coverage Diff @@
## main #13084 +/- ##
==========================================
+ Coverage 53.33% 61.86% +8.53%
==========================================
Files 420 415 -5
Lines 33368 32911 -457
==========================================
+ Hits 17796 20360 +2564
+ Misses 13739 10411 -3328
- Partials 1833 2140 +307
Nice! Was there any tooling involved for fuzzing?
@dlowe Thanks for the fix. Two things to address before merging this.
lg.Warn("alarm raised", zap.String("alarm", m.Alarm.String()), zap.String("from", types.ID(m.MemberID).String()))
switch m.Alarm {
case pb.AlarmType_CORRUPT:
a.s.applyV3 = newApplierV3Corrupt(a)
case pb.AlarmType_NOSPACE:
a.s.applyV3 = newApplierV3Capped(a)
default:
lg.Warn("unimplemented alarm activation", zap.String("alarm", fmt.Sprintf("%+v", m)))
}
to lg.Panic("unimplemented alarm activation", zap.String("alarm", fmt.Sprintf("%+v", m)))
Narrowly prevent etcd from crashing when given a bad ACTIVATE payload, e.g.:
$ curl -d "{\"action\":\"ACTIVATE\"}" ${ETCD}/v3/maintenance/alarm
curl: (52) Empty reply from server
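To illustrate why the payload above is "bad", here is a minimal Go sketch of what happens when a JSON body omits a field: the decoded value is simply the type's zero value. The `alarmRequest` struct and `decodeAlarmRequest` helper are hypothetical stand-ins, not etcd's types. The real request message is generated from protobuf, and its alarm field is an enum rather than a string.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// alarmRequest is a hypothetical stand-in for the alarm request message.
// It is NOT etcd's generated type; it only mirrors the two fields involved.
type alarmRequest struct {
	Action string `json:"action"`
	Alarm  string `json:"alarm"`
}

// decodeAlarmRequest parses a JSON body into an alarmRequest.
// Fields missing from the body keep their zero value.
func decodeAlarmRequest(body []byte) (alarmRequest, error) {
	var req alarmRequest
	err := json.Unmarshal(body, &req)
	return req, err
}

func main() {
	// Same body as the curl example: an action, but no alarm type.
	req, err := decodeAlarmRequest([]byte(`{"action":"ACTIVATE"}`))
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("action=%q alarm=%q\n", req.Action, req.Alarm)
}
```

Decoding succeeds without error, so any rejection of the empty alarm type has to happen after parsing, in the code that acts on the request.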
eeb3402 to a26fa0c
Mayhem for API itself is a command-line tool ( Other than that, nope! I just ran
Yup! Both done.
This is a crashing bug I found while fuzzing etcd under Mayhem for API. Full disclosure: working on this is my day job!
On a fresh-out-of-the-box bin/etcd:
The server has crashed hard, with:
(And in fact fails with the same exception on every subsequent restart until the offending alarm is cleaned out of the data directory!)
The test I've added is at the same level (an HTTP request from the outside world) where I discovered the bug, because I'm not super familiar with the etcd codebase. Let me know if it'd be more appropriate to add a narrower unit test.
I tried to keep my fix in line with the existing behavior of applierV3backend.Alarm(), which is to say it turns bad requests into silent no-ops.
Finally: I've found at least one other crashing bug through fuzzing, as well as other, less severe issues. Please let me know if you don't want further PRs based on these findings!
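The "silent no-op" approach described above can be sketched roughly as follows. This is a simplified illustration, not the actual diff in this PR: `AlarmType`, `alarmStore`, and `activate` are hypothetical names, and the only assumption carried over from etcd is that the zero value of the alarm enum means "no alarm type given".

```go
package main

import "fmt"

// AlarmType is a hypothetical enum; its zero value stands for a request
// that did not specify an alarm type.
type AlarmType int

const (
	AlarmNone AlarmType = iota
	AlarmNoSpace
	AlarmCorrupt
)

// alarmStore is a hypothetical stand-in for the server's alarm storage.
type alarmStore struct {
	active map[AlarmType]bool
}

// activate records the alarm and reports whether anything changed.
// A missing alarm type is ignored instead of being stored, so it can
// never reach the code that switches on the alarm type later.
func (s *alarmStore) activate(a AlarmType) bool {
	if a == AlarmNone {
		return false // bad request: silent no-op
	}
	if s.active[a] {
		return false // already active: nothing to do
	}
	s.active[a] = true
	return true
}

func main() {
	s := &alarmStore{active: map[AlarmType]bool{}}
	fmt.Println(s.activate(AlarmNone))    // ignored
	fmt.Println(s.activate(AlarmNoSpace)) // stored
	fmt.Println(len(s.active))
}
```

Rejecting the value before it is persisted matters here because, as noted above, a stored bad alarm is replayed on every restart.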