etcd3 cluster filled disk and didn't gc old revisions #7944

Closed
Spindel opened this issue May 17, 2017 · 34 comments

Spindel commented May 17, 2017

Bug report

Our Kubernetes cluster stopped accepting commands because etcd3 ran out of space and raised a NOSPACE alarm.

endpoint status, as json:
[{"Endpoint":"coreos01.kub.modio.se:2379","Status":{"header":{"cluster_id":1416696078359407448,"member_id":5715034692686022789,"revision":2190654,"raft_term":5829},"version":"3.0.10","dbSize":952512512,"leader":12583913577278567278,"raftIndex":5817555,"raftTerm":5829}},{"Endpoint":"coreos01.kub.do.modio.se:2379","Status":{"header":{"cluster_id":1416696078359407448,"member_id":9911998787147328850,"revision":2190654,"raft_term":5829},"version":"3.0.10","dbSize":356352,"leader":12583913577278567278,"raftIndex":5817555,"raftTerm":5829}},{"Endpoint":"coreos03.kub.do.modio.se:2379","Status":{"header":{"cluster_id":1416696078359407448,"member_id":12583913577278567278,"revision":2190654,"raft_term":5829},"version":"3.0.10","dbSize":3777458176,"leader":12583913577278567278,"raftIndex":5817555,"raftTerm":5829}}]

After that we did a compaction:

compaction 2190654
compacted revision 2190654

after compact, before defrag:

coreos01.kub.modio.se:2379, 4f4fe1c600958885, 3.0.10, 952 MB, false, 5829, 5817819
coreos01.kub.do.modio.se:2379, 898e7e618204ed52, 3.0.10, 369 kB, false, 5829, 5817819
coreos03.kub.do.modio.se:2379, aea30ab9e1e7736e, 3.0.10, 3.8 GB, true, 5829, 5817821

after compact, after defrag:

coreos01.kub.modio.se:2379, 4f4fe1c600958885, 3.0.10, 299 kB, false, 5829, 5817955
coreos01.kub.do.modio.se:2379, 898e7e618204ed52, 3.0.10, 299 kB, false, 5829, 5817956
coreos03.kub.do.modio.se:2379, aea30ab9e1e7736e, 3.0.10, 299 kB, true, 5829, 5817957

After clearing the alarms, we could use the Kubernetes cluster again.
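For reference, this is roughly the sequence we ran to recover (a sketch from memory: --endpoints and TLS flags are omitted, and jq is only assumed here to pull the revision out of the JSON):

export ETCDCTL_API=3

# read the current revision from one endpoint (jq assumed available)
rev=$(etcdctl endpoint status --write-out=json | jq -r '.[0].Status.header.revision')

# drop all key history older than that revision
etcdctl compaction "$rev"

# defragment the members listed in --endpoints so the freed space is actually released
etcdctl defrag

# check and clear the NOSPACE alarm so writes are accepted again
etcdctl alarm list
etcdctl alarm disarm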

Spindel commented May 17, 2017

Note that in the output above, we had already manually compacted one of the members before this, but not all of them.
The difference in size between 369 kB and 952 MB is also quite stark.

xiang90 commented May 17, 2017

@Spindel Do you take periodic external backups? How often does this happen?

xiang90 commented May 17, 2017

Is this a cluster migrated from etcd2, or is it a fresh cluster?

Spindel commented May 17, 2017

No external backups taken on this system, and this is the first time.

It's powering a small Kubernetes cluster (a handful of pods only), and this is the first time it's filled up.

I'm honestly no longer sure whether it was migrated or not. It's been reset quite a few times while I've been working with k8s.

xiang90 commented May 17, 2017

No external backups taken on this system, and this is the first time.

good to know.

It's powering a small Kubernetes cluster (a handful of pods only), and this is the first time it's filled up.

Can you keep monitoring the database size growth? In theory it should not grow to even 100 MB for your workload.

Do you keep all of your etcd logs? I am interested in the etcd server log from coreos03.kub.do.modio.se:2379 for the 48 hours before it hit its db size quota.
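If you don't have metrics collection handy, even a crude loop is enough to watch the growth (just a sketch: adjust --endpoints and TLS flags for your setup, and it assumes jq is installed):

export ETCDCTL_API=3

# log each endpoint's reported dbSize once a minute
while true; do
    echo -n "$(date -Is) "
    etcdctl endpoint status --write-out=json | jq -c '[.[] | {ep: .Endpoint, dbSize: .Status.dbSize}]'
    sleep 60
done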

Spindel commented May 17, 2017

I can keep monitoring it, yes.
And I kept that machine (and only that machine) running without a reboot. Attaching a log.

etcd-member.zip

xiang90 commented May 17, 2017

@Spindel Thanks a lot! We will check the log and keep you posted.

xiang90 commented May 17, 2017

### DB size is 4.2MB on 5-01

"MESSAGE" : "2017-05-01 16:33:37.593953 I | etcdserver: wrote database snapshot out [total bytes: 4210688]"
"MESSAGE" : "2017-05-01 16:33:37.715981 I | rafthttp: database snapshot [index: 897664, to: 4f4fe1c600958885] sent out successfully"

### DB size is 16.8 MB on 5-14

"MESSAGE" : "2017-05-14 18:33:09.985720 I | etcdserver: wrote database snapshot out [total bytes: 16801792]"
"MESSAGE" : "2017-05-14 18:33:10.044068 I | rafthttp: database snapshot [index: 4891320, to: 864e9ff97f31db34] sent out successfully"

### A bunch of failed db snapshot sends happened... We do not know the DB size, but on the same day the db grew to 3.8 GB

"MESSAGE" : "2017-05-17 14:52:16.507272 I | rafthttp: start to send database snapshot [index: 5786911, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 14:54:31.768820 W | rafthttp: database snapshot [index: 5786911, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"
"MESSAGE" : "2017-05-17 14:54:32.214706 I | rafthttp: start to send database snapshot [index: 5787406, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:08:01.517050 W | rafthttp: database snapshot [index: 5787406, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"
"MESSAGE" : "2017-05-17 15:08:40.996441 I | rafthttp: start to send database snapshot [index: 5791076, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:09:30.765033 W | rafthttp: database snapshot [index: 5791076, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"
"MESSAGE" : "2017-05-17 15:09:31.044130 I | rafthttp: start to send database snapshot [index: 5791358, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:14:07.135746 W | rafthttp: database snapshot [index: 5791358, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"
"MESSAGE" : "2017-05-17 15:17:37.918353 I | rafthttp: start to send database snapshot [index: 5793924, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:19:02.336803 W | rafthttp: database snapshot [index: 5793924, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"
"MESSAGE" : "2017-05-17 15:19:02.457658 I | rafthttp: start to send database snapshot [index: 5794375, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:19:39.992919 W | rafthttp: database snapshot [index: 5794375, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"

@heyitsanthony I am not sure whether the snapshot send failures triggered the huge db size or the huge db size triggered the snapshot send failures. Probably we should check how etcd handles a failed snapshot send first.

xiang90 commented May 17, 2017

@Spindel do you know what happened to peer "https://coreos02.kub.do.modio.se"?

Spindel commented May 17, 2017

@xiang90: Yeah, it hit OOM and the OOM killer started reaping random processes.
Sadly, I don't have any logs left from that node (it's been destroyed and re-provisioned since).

xiang90 commented May 17, 2017

Yeah, it hit OOM

Is it possible that etcd takes a lot of RAM because of the large snapshot? What is your RAM limit?

Spindel commented May 17, 2017

On that node? I think it was 1 GiB, but there were some misbehaving, auto-restarting pods on that node that ate RAM pretty aggressively.

xiang90 commented May 17, 2017

@Spindel

coreos01.kub.modio.se:2379, 4f4fe1c600958885, 3.0.10, 952 MB, false, 5829, 5817819

Do you still have the log from this member too?

Spindel commented May 17, 2017

Here you go:
coreos01.kub.modio.se.zip

Spindel commented May 17, 2017

Here's the monitoring graph of available memory on the coreos02 node.

coreos02-memory

As you can see, it was quite stable at around 200 MiB free until it wasn't, and then took a big dive.
I'm not sure what caused it, and I wasn't intending to dig further right now, but the dip should coincide with other events in the logs, I'd guess.

xiang90 commented May 17, 2017

@Spindel Ah... Glad that you are monitoring RSS. Can you also provide the RSS usage on machines
coreos01.kub.modio.se:2379 and coreos03.kub.do.modio.se:2379? Thanks!
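If it isn't graphed anywhere, either of these quick checks gives a rough number (a sketch: the process name, CA path and URL are assumptions about your setup):

# RSS of the etcd process as seen by the kernel, in MiB
ps -o rss= -C etcd | awk '{printf "%.1f MiB\n", $1/1024}'

# or the standard Go process metric that etcd exposes on /metrics
curl -s --cacert /etc/ssl/etcd/ca.pem https://coreos01.kub.modio.se:2379/metrics \
    | grep '^process_resident_memory_bytes'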

xiang90 commented May 17, 2017

From another member

### DB size is 5.2MB on 5-10

"MESSAGE" : "2017-05-10 02:50:27.585453 I | rafthttp: receiving database snapshot [index:3443235, from 864e9ff97f31db34] ..."
"MESSAGE" : "2017-05-10 02:50:27.848332 I | snap: saved database snapshot to disk [total bytes: 5287936]"
"MESSAGE" : "2017-05-10 02:50:27.848383 I | rafthttp: received and saved database snapshot [index: 3443235, from: 864e9ff97f31db34] successfully"


### A bunch of failed db snapshot sends happened... We do not know the DB size, but on the same day the db grew to 952 MB

"MESSAGE" : "2017-05-17 15:07:44.038917 I | rafthttp: start to send database snapshot [index: 5790785, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:08:45.918368 W | rafthttp: database snapshot [index: 5790785, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"
"MESSAGE" : "2017-05-17 15:19:36.991778 I | rafthttp: start to send database snapshot [index: 5794558, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:20:04.962841 W | rafthttp: database snapshot [index: 5794558, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"
"MESSAGE" : "2017-05-17 15:22:04.264811 I | rafthttp: start to send database snapshot [index: 5795340, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:22:29.757255 I | rafthttp: start to send database snapshot [index: 5795455, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:22:55.727348 W | rafthttp: database snapshot [index: 5795455, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"
"MESSAGE" : "2017-05-17 15:22:56.020457 I | rafthttp: start to send database snapshot [index: 5795598, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:23:19.094247 I | rafthttp: start to send database snapshot [index: 5795715, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:24:33.002411 W | rafthttp: database snapshot [index: 5795598, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"
"MESSAGE" : "2017-05-17 15:24:33.016531 W | rafthttp: database snapshot [index: 5795340, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"
"MESSAGE" : "2017-05-17 15:24:41.389751 I | rafthttp: start to send database snapshot [index: 5796157, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:24:46.281904 W | rafthttp: database snapshot [index: 5795715, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"
"MESSAGE" : "2017-05-17 15:24:46.371952 I | rafthttp: start to send database snapshot [index: 5796178, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:44:34.083107 W | rafthttp: database snapshot [index: 5796157, to: 864e9ff97f31db34] failed to be sent out (stopped)"
"MESSAGE" : "2017-05-17 15:44:34.083139 W | rafthttp: database snapshot [index: 5796178, to: 864e9ff97f31db34] failed to be sent out (stopped)"

@heyitsanthony Interesting... Both nodes with large db sizes tried to send snapshots to 864e9ff97f31db34 but failed. It seems to me the root cause is the snapshot sending.

@Spindel Being greedy... Logs from coreos01.kub.do.modio.se:2379?

Spindel commented May 17, 2017

Sadly, it seems I didn't have RSS monitoring set up on coreos01.kub.modio.se (only ping and uptime).

Attaching graph for coreos03
coreos03

Spindel commented May 17, 2017

And here are the logs for coreos01.kub.do.modio.se
coreos01.kub.do.modio.se.zip

Spindel commented May 17, 2017

Sadly, our monitoring doesn't capture proper per-process CPU and memory data inside the cluster (no proper process counts, etc.) since it runs as a DaemonSet pod.

A better way of running that would be nice.

Anyhow, I'm hitting the sack, it's way too late in my timezone. Hope the info helps!

xiang90 commented May 17, 2017

@heyitsanthony

Strangely enough... the good member (coreos01.kub.do.modio.se:2379) also failed to send a few snapshots on the same day.

"MESSAGE" : "2017-05-17 15:08:20.434193 I | rafthttp: start to send database snapshot [index: 5790966, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:08:25.710143 I | rafthttp: start to send database snapshot [index: 5790995, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:08:35.739152 W | rafthttp: database snapshot [index: 5790966, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"
"MESSAGE" : "2017-05-17 15:08:35.741566 W | rafthttp: database snapshot [index: 5790995, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"
"MESSAGE" : "2017-05-17 15:11:01.644063 I | rafthttp: start to send database snapshot [index: 5791831, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:12:55.774031 W | rafthttp: database snapshot [index: 5791831, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"
"MESSAGE" : "2017-05-17 15:12:56.042411 I | rafthttp: start to send database snapshot [index: 5792449, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:13:56.535221 I | rafthttp: start to send database snapshot [index: 5792767, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:14:07.144550 W | rafthttp: database snapshot [index: 5792767, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"
"MESSAGE" : "2017-05-17 15:14:08.049084 I | rafthttp: start to send database snapshot [index: 5792829, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:14:18.592428 W | rafthttp: database snapshot [index: 5792829, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"
"MESSAGE" : "2017-05-17 15:14:18.853030 I | rafthttp: start to send database snapshot [index: 5792885, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:14:40.492684 W | rafthttp: database snapshot [index: 5792449, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"
"MESSAGE" : "2017-05-17 15:14:41.587673 I | rafthttp: start to send database snapshot [index: 5793010, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:15:02.155622 W | rafthttp: database snapshot [index: 5792885, to: 864e9ff97f31db34] failed to be sent out (write tcp [2a03:b0c0:3:d0::43bb:3001]:53990->[2a03:b0c0:3:d0::43bb:4001]:2380: write: connection reset by peer)"
"MESSAGE" : "2017-05-17 15:15:02.984109 I | rafthttp: start to send database snapshot [index: 5793123, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:15:38.859376 I | rafthttp: start to send database snapshot [index: 5793301, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:15:48.974015 W | rafthttp: database snapshot [index: 5793010, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"
"MESSAGE" : "2017-05-17 15:15:49.190946 I | rafthttp: start to send database snapshot [index: 5793357, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:16:10.765930 W | rafthttp: database snapshot [index: 5793123, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"
"MESSAGE" : "2017-05-17 15:16:11.682791 I | rafthttp: start to send database snapshot [index: 5793474, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:16:12.616413 W | rafthttp: database snapshot [index: 5793357, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"
"MESSAGE" : "2017-05-17 15:16:12.825122 I | rafthttp: start to send database snapshot [index: 5793480, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:16:54.197420 W | rafthttp: database snapshot [index: 5793474, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"
"MESSAGE" : "2017-05-17 15:16:54.335781 I | rafthttp: start to send database snapshot [index: 5793696, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:16:55.585107 W | rafthttp: database snapshot [index: 5793301, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"
"MESSAGE" : "2017-05-17 15:16:55.757656 I | rafthttp: start to send database snapshot [index: 5793702, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:17:52.729687 W | rafthttp: database snapshot [index: 5793696, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"
"MESSAGE" : "2017-05-17 15:17:58.294753 W | rafthttp: database snapshot [index: 5793480, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"
"MESSAGE" : "2017-05-17 15:18:20.395389 W | rafthttp: database snapshot [index: 5793702, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"
"MESSAGE" : "2017-05-17 15:19:58.508986 I | rafthttp: start to send database snapshot [index: 5794668, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:20:22.196328 I | rafthttp: start to send database snapshot [index: 5794790, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:20:31.653539 W | rafthttp: database snapshot [index: 5794668, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"
"MESSAGE" : "2017-05-17 15:20:32.068761 I | rafthttp: start to send database snapshot [index: 5794838, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:20:48.079427 W | rafthttp: database snapshot [index: 5794838, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"
"MESSAGE" : "2017-05-17 15:20:48.206504 I | rafthttp: start to send database snapshot [index: 5794941, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:20:53.611956 W | rafthttp: database snapshot [index: 5794790, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"
"MESSAGE" : "2017-05-17 15:20:53.720563 I | rafthttp: start to send database snapshot [index: 5794966, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:20:58.019918 W | rafthttp: database snapshot [index: 5794941, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"
"MESSAGE" : "2017-05-17 15:20:58.217332 I | rafthttp: start to send database snapshot [index: 5794995, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:21:31.398398 W | rafthttp: database snapshot [index: 5794966, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"
"MESSAGE" : "2017-05-17 15:21:32.911986 I | rafthttp: start to send database snapshot [index: 5795176, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:21:36.629722 W | rafthttp: database snapshot [index: 5794995, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"
"MESSAGE" : "2017-05-17 15:21:37.447187 I | rafthttp: start to send database snapshot [index: 5795203, to 864e9ff97f31db34]..."
"MESSAGE" : "2017-05-17 15:21:58.303013 W | rafthttp: database snapshot [index: 5795176, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"
"MESSAGE" : "2017-05-17 15:22:29.421087 W | rafthttp: database snapshot [index: 5795203, to: 864e9ff97f31db34] failed to be sent out (unexpected http status Internal Server Error while posting to \"https://coreos02.kub.do.modio.se:2380/raft/snapshot\")"

Spindel commented May 17, 2017

Note that before I ran the status commands and so on, I had manually compacted coreos01.kub.do.modio.se, as it was also quite large.

I'm not sure it should be considered a "good" node in this case; it's just that I didn't keep all the logs from that part.

xiang90 commented May 17, 2017

@Spindel Ah, right. I looked into the log and found that coreos01.kub.do.modio.se also had a HUGE snapshot at some point. So the snapshot failure theory is still possible.

Spindel commented May 18, 2017

I'd think it's very plausible.

To reproduce, you can probably start a three-node cluster where one of the nodes has < 500 MiB RAM, and then put various memory hogs on the low-RAM node, as sketched below.
Not a single big RAM hog, because then the OOM killer would reap it; you want to create memory pressure, not kill things.
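Something like this on the low-RAM node should generate the right kind of pressure (a sketch, assuming the stress tool is available; worker counts and sizes are guesses and will need tuning):

# several modest allocators instead of one big one, so the OOM killer only
# reaps individual workers and the pressure keeps coming back
while true; do
    stress --vm 4 --vm-bytes 64M --vm-hang 10 --timeout 60s
    sleep 5
done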

@wojtek-t

/subscribe

xiang90 commented May 22, 2017

@Spindel Is the db size increasing again?

Spindel commented May 23, 2017

Since I permanently removed the suspected problematic host from the cluster, it doesn't appear to grow uncontrollably anymore:

endpoint status gives the following:

coreos03.kub.do.modio.se:2379, aea30ab9e1e7736e, 3.0.10, 5.6 MB, false, 8278, 7713102
coreos01.kub.modio.se:2379, 4f4fe1c600958885, 3.0.10, 5.6 MB, false, 8278, 7713102
coreos01.kub.do.modio.se:2379, 898e7e618204ed52, 3.0.10, 5.7 MB, false, 8278, 7713102
coreos05.kub.modio.se:2379, c1bd63d6fb4a10ea, 3.0.10, 5.7 MB, true, 8278, 7713104

Which, while still appearing to be more than a 100x bloat factor, is acceptable.

xiang90 commented May 23, 2017

@Spindel

I discussed this issue with @heyitsanthony yesterday. We can see how snapshot issues contribute to the huge db size, but they should not bring it up to 1 GB+ in that short amount of time.

So I dug into the logs again and found the root cause for this case.

"MESSAGE" : "2017-05-14 16:07:28.319813 I | mvcc: finished scheduled compaction at 1815026 (took 1.573214ms)"
"MESSAGE" : "2017-05-14 16:12:28.454798 I | mvcc: finished scheduled compaction at 1815435 (took 912.595µs)"
"MESSAGE" : "2017-05-14 16:17:28.522506 I | mvcc: finished scheduled compaction at 1815827 (took 1.691996ms)"
"MESSAGE" : "2017-05-14 16:22:28.571924 I | mvcc: finished scheduled compaction at 1816216 (took 1.743819ms)"

// No compaction for ~3 days; 300K+ new revisions * ~4 kB each ~= 1 GB+ of data.

"MESSAGE" : "2017-05-17 11:36:45.675060 I | mvcc: finished scheduled compaction at 2169745 (took 20.935680091s)"
"MESSAGE" : "2017-05-17 11:41:24.829663 I | mvcc: finished scheduled compaction at 2170257 (took 33.872468ms)"
"MESSAGE" : "2017-05-17 11:46:24.824949 I | mvcc: finished scheduled compaction at 2170744 (took 2.118377ms)"
"MESSAGE" : "2017-05-17 11:51:24.838707 I | mvcc: finished scheduled compaction at 2171255 (took 1.376061ms)"
"MESSAGE" : "2017-05-17 11:56:24.856919 I | mvcc: finished scheduled compaction at 2171767 (took 2.451828ms)"
"MESSAGE" : "2017-05-17 12:01:24.896190 I | mvcc: finished scheduled compaction at 2172276 (took 1.796448ms)"
"MESSAGE" : "2017-05-17 12:06:24.967299 I | mvcc: finished scheduled compaction at 2172788 (took 1.330718ms)"
"MESSAGE" : "2017-05-17 12:11:25.009440 I | mvcc: finished scheduled compaction at 2173296 (took 2.343877ms)"

The compaction somehow stopped for 3 days. The same thing happened on all the other members in the cluster. The Kubernetes master should compact etcd every 5 minutes or so.

Can you check your API server log to see if anything happened to it?
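As a stopgap until the apiserver side is sorted out, you could compact periodically yourself (a sketch: assumes etcdctl v3 and jq, with --endpoints/TLS flags omitted; etcd's --auto-compaction-retention flag in newer etcd versions, or the apiserver's --etcd-compaction-interval flag if your version has it, would be the cleaner fix):

export ETCDCTL_API=3

# compact up to the current revision every 5 minutes, roughly what the apiserver is expected to do
while true; do
    rev=$(etcdctl endpoint status --write-out=json | jq -r '.[0].Status.header.revision')
    etcdctl compaction "$rev"
    sleep 300
done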

Spindel commented May 23, 2017 via email

xiang90 commented May 31, 2017

@Spindel

Let's track the issue here: #7944.

I am going to create an issue on the k8s repo about the missing compactions (which I believe is the main cause of your db growth).

xiang90 closed this as completed May 31, 2017
Spindel commented May 31, 2017

@xiang90 Uh. You just linked issue #7944 to issue #7944 and closed #7944...

Did you mean #8009?

xiang90 commented May 31, 2017

@Spindel sorry. yea. #8009

Spindel commented May 31, 2017

ack. subscribing!

xiang90 commented May 31, 2017

@Spindel if you ever meet the same problem again, please do comment on #8009. Thanks!
