
sometimes a pod doesn't become ready, I think because of a problem with the .State attribute #947

Closed
Dieterbe opened this issue Jun 20, 2018 · 2 comments · Fixed by #948

@Dieterbe
Contributor

Sometimes metrictank pods that use the kafka-mdm plugin remain not-ready because they return 503 on /, despite having replayed all of their backlog:

2018/06/20 19:53:15 [DEBUG] memberlist: Initiating push/pull sync with: 172.22.128.7:7946
[Macaron] 2018-06-20 19:53:18: Started GET / for 172.19.64.0
[Macaron] 2018-06-20 19:53:18: Completed / 503 Service Unavailable in 867.431µs
[Macaron] 2018-06-20 19:53:18: Started GET /node for 172.19.64.0
[Macaron] 2018-06-20 19:53:18: Completed /node 200 OK in 906.048µs
[Macaron] 2018-06-20 19:53:28: Started GET / for 172.19.64.0
[Macaron] 2018-06-20 19:53:28: Started GET /node for 172.19.64.0
[Macaron] 2018-06-20 19:53:28: Completed / 503 Service Unavailable in 809.084µs
[Macaron] 2018-06-20 19:53:28: Completed /node 200 OK in 854.825µs

However, their priority is good: metrictank.stats.$environment.$instance.cluster.self.priority.gauge32 is at 0, and:

curl http://localhost:6060/priority | jsonpp
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   490  100   490    0     0    703      0 --:--:-- --:--:-- --:--:--   702
[
    {
        "Title": "kafka-mdm:",
        "Explanation": {
            "Explanation": {
                "Status": {
                    "120": {
                        "Lag": 1,
                        "Rate": 1907,
                        "Priority": 0
                    },
                    "121": {
                        "Lag": 1,
                        "Rate": 1910,
                        "Priority": 0
                    },
                    "122": {
                        "Lag": 1,
                        "Rate": 1870,
                        "Priority": 0
                    },
                    "123": {
                        "Lag": 1,
                        "Rate": 1911,
                        "Priority": 0
                    },
                    "124": {
                        "Lag": 1,
                        "Rate": 1923,
                        "Priority": 0
                    },
                    "125": {
                        "Lag": 1,
                        "Rate": 1950,
                        "Priority": 0
                    },
                    "126": {
                        "Lag": 1,
                        "Rate": 2278,
                        "Priority": 0
                    },
                    "127": {
                        "Lag": 2,
                        "Rate": 2249,
                        "Priority": 0
                    }
                },
                "Priority": 0,
                "Updated": "2018-06-20T19:59:35.159761495Z"
            },
            "Now": "2018-06-20T19:59:40.430106864Z"
        }
    }
]
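
Side note on those numbers: the per-partition Priority appears to be the lag expressed in seconds, i.e. messages still to consume divided by the consumption rate, which is why Lag of 1-2 at ~1900-2300 msg/s yields Priority 0. A hedged sketch of that relationship (partitionPriority is a hypothetical name; the real calculation lives in the kafka-mdm input plugin and may differ):

import "math"

// partitionPriority estimates how many seconds a partition is behind:
// messages still to be consumed divided by the consumption rate.
// E.g. Lag=1 and Rate=1907 gives 1/1907 -> 0 seconds behind.
func partitionPriority(lag, rate int) int {
	if rate <= 0 {
		// Hypothetical guard: with no measured rate we can't estimate
		// seconds of lag, so treat the partition as maximally behind.
		return math.MaxInt32
	}
	return lag / rate
}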
curl http://localhost:6060/node | jsonpp
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   436  100   436    0     0    698      0 --:--:-- --:--:-- --:--:--   697
{
    "name": "mt-read15-75291-large-ttd-b-553865555-b44b3",
    "version": "0.9.0-99-g0847450",
    "primary": false,
    "primaryChange": "2018-06-20T16:04:10.804909693Z",
    "state": "NodeNotReady",
    "priority": 0,
    "started": "2018-06-20T16:04:10.803162723Z",
    "stateChange": "2018-06-20T16:04:10.804909929Z",
    "partitions": [
        120,
        121,
        122,
        123,
        124,
        125,
        126,
        127
    ],
    "apiPort": 6060,
    "apiScheme": "http",
    "updated": "2018-06-20T16:43:25.157847604Z",
    "remoteAddr": "172.19.64.38"
}
curl -v http://localhost:6060/
*   Trying ::1...
* TCP_NODELAY set
* Connected to localhost (::1) port 6060 (#0)
> GET / HTTP/1.1
> Host: localhost:6060
> User-Agent: curl/7.60.0
> Accept: */*
> 
< HTTP/1.1 503 Service Unavailable
< Content-Type: text/plain
< Trace-Id: 6327f0bdd793cb50
< Vary: Origin
< Date: Wed, 20 Jun 2018 20:03:29 GMT
< Content-Length: 14
< 
* Connection #0 to host localhost left intact

curl doesn't show the body, but:

sudo ngrep -W byline -d any port 6060
interface: any
filter: ( port 6060 ) and (ip || ip6)
####
T ::1:45294 -> ::1:6060 [AP] #4
GET / HTTP/1.1.
User-Agent: Wget/1.19.5 (linux-gnu).
Accept: */*.
Accept-Encoding: identity.
Host: localhost:6060.
Connection: Keep-Alive.
.

##
T ::1:6060 -> ::1:45294 [AP] #6
HTTP/1.1 503 Service Unavailable.
Content-Type: text/plain.
Trace-Id: 88f036c9e9fc5a6.
Vary: Origin.
Date: Wed, 20 Jun 2018 20:13:55 GMT.
Content-Length: 14.
.
node not ready <---------------------------------------------------------

So that confirms we trigger the 503 in appStatus:

func (s *Server) appStatus(ctx *middleware.Context) {
	if cluster.Manager.IsReady() {
		ctx.PlainText(200, []byte("OK"))
		return
	}

	response.Write(ctx, response.NewError(http.StatusServiceUnavailable, "node not ready"))
}

which means this returns false:

func (n HTTPNode) IsReady() bool {
	return n.State == NodeReady && n.Priority <= maxPrio
}

As we can see in the curl http://localhost:6060/node output above, the problem is .State.

However, metrictank.stats.$environment.$instance.cluster.self.state.ready.gauge1 becomes 1 shortly after startup, and from looking at the code, that metric (nodeReady) should always be in sync with the .State property on the node.
The one exception I can see is in MemberlistManager.NotifyUpdate(): an update may come in for a node and we'll update the state of that node, but if that node has the same name as our nodeName, we could set State to something else without updating the nodeReady metric.
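
To illustrate the suspected path, here is a simplified, hypothetical sketch (the real MemberlistManager keeps more state and decodes updates from memberlist gossip metadata; these types are stand-ins for the real ones):

package cluster

import (
	"sync"
	"time"
)

// Simplified stand-ins for the real types.
type NodeState int

const (
	NodeNotReady NodeState = iota
	NodeReady
)

type HTTPNode struct {
	Name    string
	State   NodeState
	Updated time.Time
}

type MemberlistManager struct {
	sync.Mutex
	nodeName string
	members  map[string]HTTPNode
}

// NotifyUpdate applies any incoming gossip update, including one about
// the local node. A stale update that still says NodeNotReady can thus
// overwrite State after we already marked ourselves ready, and since
// this path never touches the nodeReady metric, the metric stays at 1
// while IsReady() starts returning false.
func (c *MemberlistManager) NotifyUpdate(update HTTPNode) {
	c.Lock()
	defer c.Unlock()
	c.members[update.Name] = update // no special-casing of update.Name == c.nodeName
}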

@Dieterbe
Contributor Author

@woodsaj since you're much more familiar with that code, can you think of scenarios where MemberlistManager.NotifyUpdate() may receive an update about itself (the local node) being non-ready? It seems like a race condition where the node marks itself as ready, but shortly afterwards gets a stale update from when it was not-ready.

@woodsaj
Member

woodsaj commented Jun 22, 2018

A race condition seems like the most likely issue. We can solve this easily enough by using the "updated" timestamp to drop updates that are older than the state we already have.
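
A minimal sketch of that fix, reusing the hypothetical types from the sketch above (the actual change landed in #948 and may differ):

// NotifyUpdate drops updates that carry an older Updated timestamp
// than the state we already hold, so a stale gossip message can no
// longer roll the local node back to NodeNotReady.
func (c *MemberlistManager) NotifyUpdate(update HTTPNode) {
	c.Lock()
	defer c.Unlock()
	if existing, ok := c.members[update.Name]; ok && !update.Updated.After(existing.Updated) {
		return // stale or duplicate update: keep the state we already have
	}
	c.members[update.Name] = update
}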
