when requesting via xinetd+curl, "Recv failure" #2

Open
ceejayoz opened this issue Mar 21, 2013 · 9 comments

@ceejayoz

$ curl localhost:9200
Percona XtraDB Cluster Node is synced.

curl: (56) Recv failure: Connection reset by peer

This breaks Amazon ELB, as it sees a 200 response of this nature as a failure.

I tweaked the script to add a Content-Length: 0 header, which appears to make Amazon happy, but I'm not entirely clear on the implications of this, or if there's a better way.

olafz added a commit that referenced this issue Mar 21, 2013
@olafz
Owner

olafz commented Mar 21, 2013

Adding Content-Length: 0 is not really a solution, since a client will ignore the content. I modified the script in such a way that it reports the content length correctly (and as such, curl exits gracefully).
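One way to keep the header and the body in step is to compute the length from the body instead of hard-coding it. The following is only a minimal sketch: `http_reply` is a hypothetical helper name, not the code from the referenced commit.

```bash
#!/bin/bash
# Hypothetical helper (not from the repository): send an HTTP response whose
# Content-Length is derived from the body that is actually sent.
http_reply() {
    local status="$1" body="$2"
    # With LC_ALL=C, ${#body} counts bytes rather than multibyte characters.
    local LC_ALL=C
    printf 'HTTP/1.1 %s\r\n' "$status"
    printf 'Content-Type: text/plain\r\n'
    printf 'Connection: close\r\n'
    printf 'Content-Length: %d\r\n' "${#body}"
    printf '\r\n'
    printf '%s' "$body"
}
```

Called as `http_reply "200 OK" $'Percona XtraDB Cluster Node is synced.\r\n'`, the helper counts the trailing `\r\n` as part of the body, so the header cannot drift out of date when the message text changes.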

@olafz olafz closed this as completed Mar 21, 2013
@bradbakerdx

I'm experiencing the same issue myself, and my /usr/bin/clustercheck contains

echo -en "Content-Length: 40\r\n"
and
echo -en "Content-Length: 44\r\n"

The value depends on whether the check succeeds or fails. In my case, setting echo -en "Content-Length: 0\r\n" did not help.

See more details here: http://serverfault.com/questions/504756/curl-failure-when-receiving-data-from-peer-using-percona-xtradb-cluster-check

I should clarify what I mean by "the same issue": I get the same error when I use curl to hit clustercheck.

Oddly, it only happens when I hit clustercheck remotely; hitting it locally seems to work.

In my case I'm using hardware load balancers, not AWS load balancers.
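Because the failure is intermittent, it helps to count failures over many requests rather than judge from a single one. A rough sketch, assuming a hypothetical helper named `count_failures` and a 5-second per-request timeout chosen only for illustration:

```bash
#!/bin/bash
# Hypothetical helper: request a URL repeatedly and print how many requests failed.
# $1: URL to check, $2: number of attempts (defaults to 100).
count_failures() {
    local url="$1" attempts="${2:-100}" failures=0 i
    for ((i = 0; i < attempts; i++)); do
        # curl exits non-zero on transport errors such as (56) Recv failure.
        curl --silent --output /dev/null --max-time 5 "$url" || failures=$((failures + 1))
    done
    echo "$failures"
}
```

Running `count_failures http://your-server:9200 1000` from a remote machine and again on the node itself gives two numbers that can be compared directly.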

@bradbakerdx

Here is a packet capture containing some successes and some failures: https://www.dropbox.com/s/u2b9asn1p5vyh0r/data.pcap

In the successful cases there is an HTTP payload, but in the failing cases there is none.


@lucalvr

lucalvr commented Dec 18, 2013

I have exactly the same issue

@homeyjd

homeyjd commented Apr 29, 2014

I have exactly the same issue.

@bradbakerdx

If it helps anyone, here's the solution we ended up using (it's not pretty, but it's been working for us for about a year):

#!/bin/bash
#
# Script to make a proxy (ie HAProxy) capable of monitoring Percona XtraDB Cluster nodes properly
#
# Author: Olaf van Zandwijk 
# Documentation and download: https://github.com/olafz/percona-clustercheck
#
# Based on the original script from Unai Rodriguez
# Modified by Brad Baker 5/7/2013
#
# This cluster check script is provided by the percona packages under
# /usr/bin/clustercheck. I've made a copy of it to /our-custom-location because I had
# to customize it to get it to work reliably  and I don't want YUM overwriting
# our customized version.
#
# For some reason the percona provided version of this script will
# intermittently fail when accessed remotely using curl or our load balancer
# health check. To test this for yourself remotely run the following command
# for i in {1..1000}; do curl http://your-server:9200; sleep 2; date;  done
#
# After extensive debugging one of the Percona devs had me add sleep statements.  
# After doing so the intermittent issue stopped - WHY?! I have no idea. 
# But with those in place it works reliably. 
if [[ $1 == '-h' || $1 == '--help' ]];then
    echo "Usage: $0 [user] [password] [available_when_donor=0|1] [err_file]"
    exit
fi
MYSQL_USERNAME="${1:-clustercheckuser}"
MYSQL_PASSWORD="${2:-clustercheckpassword!}"
AVAILABLE_WHEN_DONOR=${3:-0}
ERR_FILE="${4:-/dev/null}"
#Timeout exists for instances where mysqld may be hung
TIMEOUT=10
#
# Perform the query to check the wsrep_local_state
#
WSREP_STATUS=$(mysql -nNE --connect-timeout="${TIMEOUT}" --user="${MYSQL_USERNAME}" --password="${MYSQL_PASSWORD}" \
-e "SHOW STATUS LIKE 'wsrep_local_state';" 2>"${ERR_FILE}" | tail -1 2>>"${ERR_FILE}")
if [[ "${WSREP_STATUS}" == "4" ]] || [[ "${WSREP_STATUS}" == "2" && ${AVAILABLE_WHEN_DONOR} == 1 ]]
then
    # Percona XtraDB Cluster node local state is 'Synced' => return HTTP 200
    # Shell return-code is 0
    echo -en "HTTP/1.1 200 OK\r\n"
    sleep 0.1
    echo -en "Content-Type: text/plain\r\n"
    sleep 0.1
    echo -en "Connection: close\r\n"
    sleep 0.1
    echo -en "Content-Length: 40\r\n"
    sleep 0.1
    echo -en "\r\n"
    sleep 0.1
    echo -en "Percona XtraDB Cluster Node is synced.\r\n"
    sleep 0.1
    exit 0
else
    # Percona XtraDB Cluster node local state is not 'Synced' => return HTTP 503
    # Shell return-code is 1
    echo -en "HTTP/1.1 503 Service Unavailable\r\n"
    sleep 0.1
    echo -en "Content-Type: text/plain\r\n"
    sleep 0.1
    echo -en "Connection: close\r\n"
    sleep 0.1
    echo -en "Content-Length: 44\r\n"
    sleep 0.1
    echo -en "\r\n"
    sleep 0.1
    echo -en "Percona XtraDB Cluster Node is not synced.\r\n"
    exit 1
fi
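
The availability decision in the script above can be isolated so it can be exercised without a running MySQL server. This is a sketch only; `node_is_available` is a hypothetical name, and the state values (4 for synced, 2 accepted when donors are allowed) are taken from the script's own condition.

```bash
#!/bin/bash
# Hypothetical helper isolating the availability test from the script above.
# $1: value of wsrep_local_state
# $2: 1 if a node in state 2 should still count as available (defaults to 0)
node_is_available() {
    local state="$1" available_when_donor="${2:-0}"
    # State 4: the node is synced.
    [[ "$state" == "4" ]] && return 0
    # State 2: accepted only when the caller opted in.
    [[ "$state" == "2" && "$available_when_donor" == "1" ]] && return 0
    return 1
}
```

For example, `node_is_available "$WSREP_STATUS" "$AVAILABLE_WHEN_DONOR"` could replace the inline condition, and an empty status (mysql hung or unreachable) falls through to the unavailable case.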

@leoleovich

Hello my dear friends.
Today I ran into the same problem.
I spent some time figuring out what causes this issue, so let me explain why it fails (the sleeps do not really help):

  1. When curl, a browser, keepalived, or any other proper client sends GET / HTTP/1.1, it expects you to respect the HTTP protocol. That means actually reading the request headers and body from the client.
    The implementation here does not read anything from the client; it just sends the reply.
    This happens to work for haproxy only because haproxy also ignores the HTTP protocol and sends just GET / HTTP/1.0 without headers. With some configurations it does send headers, but they are always shorter than the reply from the sh script, which gives you a chance that generating the reply takes a bit longer than sending that one line.

So why did the sleeps not help for every client?

  2. Another "good" thing is the RST flag. After you exit 0, xinetd immediately resets the connection without closing it properly. This is no problem for a browser or curl, but it badly confuses, for example, C++ bufferevent_socket_connect, which expects the connection to be closed properly.

Anyway, the solution is easy: either you properly read the HTTP headers from stdin, wait for \r\n, and only then send the result with the real Content-Length, or you stop using xinetd (its manual says the REUSE flag is deprecated) and use an HTTP server plus a MySQL connector, which you can easily write in any language within 2 hours. I did this: https://github.com/innogames/galeraht

I hope this helps people like me who ran into the same problem.
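
The first option, reading the request before replying, could look roughly like this in bash. It is a sketch under assumptions: `drain_request` and `REQUEST_LINES` are hypothetical names, and the timeout is added only so that a client that sends nothing cannot hang the check.

```bash
#!/bin/bash
# Hypothetical helper: consume the request line and headers from stdin.
# Stops at the blank line that ends the headers, at EOF, or when a read times out.
# $1: per-line read timeout in seconds (defaults to 5).
drain_request() {
    local timeout="${1:-5}" line count=0
    while IFS= read -r -t "$timeout" line; do
        line="${line%$'\r'}"        # drop the CR of a CRLF line ending
        [[ -z "$line" ]] && break   # blank line: end of headers
        count=$((count + 1))
    done
    # Number of non-blank lines consumed, kept for debugging.
    REQUEST_LINES=$count
}
```

A check script would call `drain_request` first and write its response only afterwards, so the client's request has been read before the reply goes out.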

@fspv

fspv commented Jul 24, 2016

Got to this ticket from Google. Here is one more solution: we wait for a line containing only \r in the input and only then return the response.

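#!/bin/bash

# Consume the request; the blank line that ends the headers arrives as a lone \r.
while IFS= read -r line
do
  test "$line" = $'\r' && break
done

# Reply only after the request has been read. HTTP header lines end in \r\n.
printf 'HTTP/1.1 200 OK\r\n'
printf 'Content-Type: text/plain\r\n'
printf 'Connection: close\r\n'
printf 'Content-Length: 3\r\n'
printf '\r\n'
# The body is "OK" plus a newline: 3 bytes, matching Content-Length.
printf 'OK\n'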

@dgeo

dgeo commented Jun 10, 2022

just use #18
