Multiple connections being made/dropped using polling (express) #3
Comments
I wanted to note that I have a similar setup to #2, and possibly similar errors. I am running a single EC2 instance behind an ALB. I run this behind some other managed services, so I will have to see if NLB is an option. The thing I noticed with that, though, is that his socket and master were running on a different port than the workers. If I set it up to run master and client on the same port (I tried it a while ago), I get an address-in-use warning. Any guidance on how to get that configured, if that is possibly the error? (Basically, I am not listening on the master, and hence the "stickiness" is not even given the chance to work?) @sagarjoshi-sct (sorry to tag you.) |
I did a bit more logging/digging around and tried to run the system (web side) with polling turned on and off on the client. When polling was turned off, the logs were mostly unremarkable. After about 9 minutes of idle I got a ping timeout, which might be interesting, or not.
It appears, though, that the polling version is getting a number of 400 errors.
and
I checked my CORS configuration for the server (I don't explicitly state anything for the socket.io side, since all the requests should be coming from my domain... in theory...), but it seemed to be OK. I will dig into other reasons behind the 400 errors. But if there is anything that I am obviously doing wrong, please let me know. Thanks again! |
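For reference, the "polling on" and "polling off" client modes compared above usually differ only in the client's `transports` option. The snippet below is a minimal sketch of the websocket-only variant, not the poster's actual client code: the function name, the options variable, and the URL in the usage note are hypothetical, and `io` stands for the function exported by socket.io-client. The Socket.IO documentation notes that sticky sessions are not required when HTTP long-polling is disabled, which may explain why the logs looked cleaner in that mode.

```javascript
// Client options that skip HTTP long-polling and open a WebSocket directly.
// With the default transports, the client starts with polling requests,
// and every one of them must reach the worker that owns the session.
const websocketOnlyOptions = {
  transports: ["websocket"],
};

// `io` is the function exported by socket.io-client. It is passed in as a
// parameter so this sketch does not depend on how the library is loaded.
function connectWebsocketOnly(io, url) {
  return io(url, websocketOnlyOptions);
}
```

Usage would look like `connectWebsocketOnly(require("socket.io-client"), "https://example.com")`. The trade-off is that clients which cannot establish a WebSocket connection have no fallback.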
You shouldn't call The HTTP server of the master process handles all the incoming requests: Lines 59 to 74 in a367d1b
And then forwards the handle to the right worker, which manually injects the connection into its own HTTP server (which is not actually listening): Lines 102 to 110 in a367d1b
Usage with Express should be quite straightforward:
const cluster = require("cluster");
const http = require("http");
const { Server } = require("socket.io");
const redisAdapter = require("socket.io-redis");
const numCPUs = require("os").cpus().length;
const { setupMaster, setupWorker } = require("@socket.io/sticky");

if (cluster.isMaster) {
  console.log(`Master ${process.pid} is running`);

  const httpServer = http.createServer();
  setupMaster(httpServer, {
    loadBalancingMethod: "least-connection", // either "random", "round-robin" or "least-connection"
  });
  httpServer.listen(3000);

  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  cluster.on("exit", (worker) => {
    console.log(`Worker ${worker.process.pid} died`);
    cluster.fork();
  });
} else {
  console.log(`Worker ${process.pid} started`);

-  const httpServer = http.createServer();
+  const app = require("express")();
+  const httpServer = http.createServer(app);
  const io = new Server(httpServer);
  io.adapter(redisAdapter({ host: "localhost", port: 6379 }));
  setupWorker(io);

  io.on("connection", (socket) => {
    console.log("connect", socket.id);
  });

+  app.get("/test", (req, res) => {
+    res.send("OK");
+  });
} |
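To make the routing idea concrete, here is a simplified sketch of how a master process might decide which worker receives a request: requests that carry a known `sid` query parameter stay on the worker that owns the session, and everything else goes through the configured load-balancing method. This is an illustration under stated assumptions, not the `@socket.io/sticky` source. `createStickyRouter`, `route` and `registerSession` are hypothetical names, and the sketch assumes the master is told which worker created each session.

```javascript
// Simplified sketch of sticky routing in a master process.
// All names are hypothetical; this is not the @socket.io/sticky implementation.
function createStickyRouter(workerIds, method = "least-connection") {
  const sidToWorker = new Map(); // session ID -> worker ID
  const connections = new Map(workerIds.map((id) => [id, 0]));
  let cursor = 0; // next index for round-robin

  function pickWorker() {
    switch (method) {
      case "random":
        return workerIds[Math.floor(Math.random() * workerIds.length)];
      case "round-robin": {
        const id = workerIds[cursor];
        cursor = (cursor + 1) % workerIds.length;
        return id;
      }
      case "least-connection": {
        let best = workerIds[0];
        for (const id of workerIds) {
          if (connections.get(id) < connections.get(best)) best = id;
        }
        return best;
      }
      default:
        throw new Error(`Unknown load balancing method: ${method}`);
    }
  }

  return {
    // Decide which worker should receive a request, given its URL.
    route(requestUrl) {
      const sid = new URL(requestUrl, "http://localhost").searchParams.get("sid");
      if (sid && sidToWorker.has(sid)) {
        return sidToWorker.get(sid); // known session: stay on the same worker
      }
      const id = pickWorker();
      connections.set(id, connections.get(id) + 1);
      return id;
    },
    // Record which worker owns a session once the handshake has produced a sid.
    registerSession(sid, workerId) {
      sidToWorker.set(sid, workerId);
    },
  };
}
```

If a polling request with a valid `sid` reaches a worker that does not own that session, the worker cannot recognize it, which is one common source of HTTP 400 responses in multi-process setups.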
@darrachequesne, thank you so much, that worked so much better and was seemingly much more stable. When polling is on with the web client I am still seeing the 400 response (client side), and on the server side a rapid alternation between ping timeout and transport close.
Sometimes I get both. Sometimes the wss error is missing. Sometimes I just get the GET or the POST. Could there be something amiss with my connection configuration?
Client side
And I still call my socketManager code (which initializes the socket etc...) before calling setupWorker(io). I can separate them.
Thoughts. Again, I truly appreciate your answer above. |
Hmm, your configuration looks good to me. Unfortunately, I am unable to reproduce the HTTP 400 errors, I've created a running example: https://github.com/socketio/socket.io-fiddle/tree/sticky Could it be linked to your HTTPS setup? |
OK, so I must have messed up something in my deployment, so I am going to have to rescind my earlier statement about it working (not about my gratitude). I went to play around with the https setup this morning and started getting a 504 Gateway Timeout error. I put together a small (https) server with a few other modifications (I had to use socket.io 2.1.1 syntax) and continued to get the same error (I did try the http version, but my ALB blocks all traffic). So I tried your "exact" implementation. When I added in the https configuration, I continued to get the Gateway Timeout error. If I switched it to http and ran curl on localhost, I did get the proper response. Here is the http version.
However in this version I was getting the following error on my console:
I did not get this error in the https version. What I did see there (when I did the curl call locally) was that it had an issue with the self-signed certificate. If I did I wonder if the error is due to the 2.1.1. But the https issue is the bigger concern for me. |
And for what it is worth, the single server version with the https did work externally and did not get the Gateway Timeout error. And if I took out the "options" (the certificates) from the master createServer in the master/client version I got a Bad Gateway, which is not unexpected... but thought I would try. |
So, I have tried a variety of things. (Just to make sure it was not a 2.1.1 issue, I upgraded a version to 3.X.) Whenever I do NOT have a listener in the worker process (whether I am using sticky or not), I get this timeout. But from your explanation above, this is the way it should work: the master is the pass-through. But here is perhaps my misunderstanding, and if so I apologize for wasting precious time. The listener is only for the socket connections, and for whatever reason I was presuming that it would work for the other app/express requests as well. So, if this is the case, I apologize. (I am tired, overworked, and have a three year old who sleeps less than I do... but we could all say that, I suppose. Well, maybe except for the 3 year old.) Any thoughts on a library/structure that would maintain stickiness on the socket for polling and allow for listening to the express traffic? |
Also, running the sticky fiddle locally under http alone seems to work just fine. (so my pass through theory was wrong...) |
Arf, I could indeed reproduce the issue with an HTTPS server: https://github.com/socketio/socket.io-fiddle/tree/sticky-https
Line 60 in a367d1b
Note: during my testing, I also encountered an exception with a plain HTTP server:
Let's dig into this... |
I ran into the "primordials" issue I referenced above on a plain http server running on my AWS system. The phrasing and source were slightly different. I can see if I can recreate it again if you would like.
The https version I was using last night was created from a modified version of your sticky fiddle. Mine ended up looking almost identical to the version you just posted. I had my compliance and AWS asset group dig into my network configuration last night, but they didn't uncover anything notably wrong with the configuration. |
@darrachequesne, just thought I would check in with you on this. What can I do to assist? Thanks! |
According to this and this, one can't directly send a TLSSocket object from the master process to the worker process, but creating a basic net.Socket and piping the TLSSocket stream into it should work. I wasn't able to achieve it though:
const fs = require("fs");
const { Socket } = require("net");

const httpsServer = require("https").createServer({
  key: fs.readFileSync("./key.pem"),
  cert: fs.readFileSync("./cert.pem"),
});

const httpServer = require("http").createServer((req, res) => {
  // never called
});

httpsServer.on("secureConnection", (socket) => {
  const plainSocket = new Socket();
  socket.pipe(plainSocket);
  httpServer.emit("connection", plainSocket);
});

httpsServer.listen(3000); |
It didn't work in my setup either. There did not seem to be any discernible difference in the outcome (Time out). Probably a very naive question but is "plainSocket" opened by default? (Sorry, I am a bit out of my element here.) |
Unfortunately, I was not able to find a solution to this. If anyone has a suggestion, please ping me. |
My apologies for the long delay on responding to this. Thanks @darrachequesne I appreciate you looking into it. Feel free to close the issue (or I can). |
Update: it seems we could reuse the logic there: https://github.com/coder/code-server/blob/main/src/node/socket.ts |
Hi there,
For starters, I am using Socket.io 2.1.1 (old, I know but it worked with React-Native).
I have a server that I am not sure is effectively displaying sticky behavior when polling is turned on. Here is a log entry from a single client:
process: 6680, 2021-06-01T15:48:02.941Z, socketManager, io.sockets.on(connect), Will broadcast HELLO JfLCgyTZmvMcpMgzAAFZ process: 6674, 2021-06-01T15:48:05.127Z, socketManager, io.sockets.on(connect), Will broadcast HELLO MCAoSsuLNuiGe2o3AAED process: 6680, 2021-06-01T15:48:06.989Z, socketManager, io.sockets.on(connect), Will broadcast HELLO LzxAW4_1Fhk3D4cdAAFa
On the initial connect, the system seems to jump between different instances of the server (in this case, the servers running on processes 6680 and 6674). My fear is that I do not have my server master/worker set up properly, so this is more of a question than an issue. I am using cluster and express to serve dynamic and static routes. I do have a Redis client on the back end, but that is receiving events properly. Here are the relevant parts of my server setup:
socketManager here encapsulates the socket connection creation and all of the events tied to it. (redisCache is simply a user management cache.)
The module simply exports the socket configuration code:
The two things that I am wondering about in particular:
Cannot GET /
. When I shift the app.use content to be configured with the master, the page comes up, but I never see the socket connections being made and the DB connection times out even though it is initialized. (The "Hello" logs from above are never issued.) The one thing missing from the socket.io site example is express. Is this a case of a misconfiguration in sequencing? Is there some way to shift my master/worker configuration around such that it is using the sticky functionality and uses express, but can also get the worker instances to respond on a client connection?
Is my express/https server/socket.io implementation misconfigured in some way? Is there a way to get express/https to listen at both the master and the worker level? It would seem to me that while I am connecting through the master, it is not being passed to a worker.
Thanks Joe
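A quick way to check a log like the one above for process hopping is to count connect events per process. The sketch below assumes the exact log format shown in the post; `summarizeConnects` is a hypothetical helper, not part of socket.io. Several connects from a single client within a few seconds, spread over more than one process, would suggest the client is reconnecting and that each new attempt lands wherever the load balancer happens to send it.

```javascript
// Summarize connect events from log text in the format shown in the post:
// "process: <pid>, <timestamp>, socketManager, io.sockets.on(connect), Will broadcast HELLO <socket id>"
function summarizeConnects(logText) {
  const pattern =
    /process: (\d+), ([^,\s]+), socketManager, io\.sockets\.on\(connect\), Will broadcast HELLO (\S+)/g;
  const byProcess = new Map(); // pid -> list of socket IDs seen on that process
  let connects = 0;

  for (const match of logText.matchAll(pattern)) {
    const pid = match[1];
    const socketId = match[3];
    if (!byProcess.has(pid)) byProcess.set(pid, []);
    byProcess.get(pid).push(socketId);
    connects += 1;
  }

  return { connects, processCount: byProcess.size, byProcess };
}

// The three entries quoted in the post.
const sampleLog =
  "process: 6680, 2021-06-01T15:48:02.941Z, socketManager, io.sockets.on(connect), Will broadcast HELLO JfLCgyTZmvMcpMgzAAFZ " +
  "process: 6674, 2021-06-01T15:48:05.127Z, socketManager, io.sockets.on(connect), Will broadcast HELLO MCAoSsuLNuiGe2o3AAED " +
  "process: 6680, 2021-06-01T15:48:06.989Z, socketManager, io.sockets.on(connect), Will broadcast HELLO LzxAW4_1Fhk3D4cdAAFa";

const summary = summarizeConnects(sampleLog);
console.log(`${summary.connects} connects across ${summary.processCount} processes`);
```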