Node/Docker: failed to bind listening socket #2985

Open
davefojtik opened this issue Jan 21, 2025 · 8 comments
Labels
bug Something isn't working Rust core redis-rs/glide-core matter User issue Issue openned by users
davefojtik commented Jan 21, 2025

Describe the bug

I encountered issues when using the ValkeyDB client on Docker.

The client fails to connect to the Valkey database, even though the database container is accessible via other methods (e.g., successful ping, open port, and iovalkey connecting without issues).

Despite hours of troubleshooting, I could not resolve these problems.

Expected Behavior

The package should install via npm and function as described in the documentation. It should allow successful connections to the Valkey database.

Current Behavior

  • The client fails to connect to the database, while other methods (e.g., iovalkey) work as expected.
    Edit: This doesn't happen the first time after the container is built, but on every subsequent attempt.

Error message:

listen_on_socket - Error failed to bind listening socket: Address in use (os error 98)

Reproduction Steps

  • Create a basic Node.js app with the dependency @valkey/valkey-glide@1.2.1, using the node:lts-alpine Docker image.
  • Load the package using:
import { GlideClient } from "@valkey/valkey-glide";
  • Attempt to connect to the database following the examples in the documentation.

Possible Solution

No response

Additional Information/Context

No response

Client version used

1.2.1

Engine type and version

Valkey 8.0.2

OS

Alpine running in Docker Desktop

Language

JavaScript

Language Version

node:lts-alpine

Other information

I appreciate the project’s vision and features—it’s clear there’s significant potential here. However, I’m concerned that the "one client for all languages" approach may create challenges in ensuring basic compatibility, especially for common setups like Docker.

@davefojtik davefojtik added the bug Something isn't working label Jan 21, 2025
@avifenesh avifenesh added Users Pain An issue known to cause users pain, generaly open by the user. Unatriaged user issue Issue open by user and wasn't triaged yet User issue Issue openned by users labels Jan 21, 2025
@avifenesh
Collaborator

avifenesh commented Jan 21, 2025

Hi @davefojtik :)
Thanks for reaching out!

We have CI/CD testing all permutations, including the one you mentioned. We are testing LTS on alpine as well, and this is the reason we mark it as supported.

I will reproduce the issue and help.

I use Glide on Alpine LTS myself for a side project and don't have issues, so I might need additional information to make sure I reproduce the same scenario.

Can you please share the Dockerfile, the compose file, and a slightly longer code snippet?

Based on the error message, it appears that the socket is occupied by something, so I need a little more code to understand what is happening.

I appreciate the project’s vision and features—it’s clear there’s significant potential here. However, I’m concerned that the "one client for all languages" approach may create challenges in ensuring basic compatibility, especially for common setups like Docker.

Your concerns are valid, but we have extensive CI testing for all permutations, including all platforms, Docker images, language versions, and Valkey versions. We announce full support only for what we fully test.

@avifenesh avifenesh self-assigned this Jan 21, 2025
@avifenesh avifenesh added node Node.js wrapper and removed Unatriaged user issue Issue open by user and wasn't triaged yet labels Jan 21, 2025
@davefojtik
Author

As I mentioned, it's happening even in the most basic setup. But here it is for clarity:

compose.yml

services:
  valkeydb:
    build: Valkey
    container_name: valkeydb
    ports:
      - "6379:6379"
  test:
    build: Test
    container_name: test
    depends_on:
      - valkeydb
    ports:
      - "3000:3000"

/Valkey
A standard valkey/valkey:8-alpine container, with only the JSON module and an ACL user added, in case that matters. Tested working with both iovalkey and Glide (for Glide, on the first launch only) using pings.

/Test
package.json

{
	"name": "test",
	"version": "1",
	"dependencies": {
		"@valkey/valkey-glide": "1.2.1"
	},
	"type": "module"
}

Dockerfile

FROM node:20-alpine
ENV NODE_ENV=production

COPY package.json .
RUN npm install

WORKDIR /main
COPY . .

EXPOSE 3000
CMD ["node", "app.js"]

app.js

import { GlideClient } from "@valkey/valkey-glide";

const addresses = [{ host: "valkeydb", port: 6379 }];
const credentials = { username: "worker", password: "password" };

const client = await GlideClient.createClient({
    addresses: addresses,
    credentials: credentials
});

const pong = await client.customCommand(["PING"]);
console.log(pong);

This works only the first time after the image is built. After stopping and starting the Node.js container, trying to connect again results in listen_on_socket - Error failed to bind listening socket: Address in use (os error 98). Restarting the Valkey container or the whole Docker engine doesn't help either.

@avifenesh
Collaborator

avifenesh commented Jan 22, 2025

@davefojtik I think I understand the issue, but I need to validate it.

Since we use a Unix domain socket (UDS) to communicate, and there is no close and cleanup of the process, nor a crash that triggers cleanup, and Docker doesn't clean up the volume, I suspect that the UDS file is not being removed from the file system, so the OS thinks the address is occupied.
We don't close the UDS connection while the program is running, since obviously we want the connection to stay up until exit.

The Node program stays up if there's no exit: your code is not running inside a main function or similar with an implicit end, but as top-level code, so there is no signal to close and clean up the process.
As for Docker, stopping a container doesn't clean its volume, and I'm not sure it shuts processes down gracefully in the sense of guaranteeing that UDS files are cleaned up. That last part is purely an assumption on my side.

It's late here so I'll validate it tomorrow, it's just a hunch.
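To illustrate the hunch above, here is a minimal sketch using Node's built-in net module rather than Glide itself. It assumes a Unix-like system; the helper names (tryBind, bindOverLeftoverFile) and the temporary path are hypothetical, and an empty file stands in for a socket file left behind by an earlier run. It is written with require so it runs standalone; an ESM project would use import instead.

```javascript
const fs = require("node:fs");
const net = require("node:net");
const os = require("node:os");
const path = require("node:path");

// Try to bind a Unix domain socket at socketPath.
// Resolves to "bound" on success, or to the error code on failure.
function tryBind(socketPath) {
  return new Promise((resolve) => {
    const server = net.createServer();
    server.once("error", (err) => resolve(err.code));
    server.listen(socketPath, () => server.close(() => resolve("bound")));
  });
}

// Hypothetical demo: bind over a path that already exists on disk.
async function bindOverLeftoverFile() {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "uds-demo-"));
  const socketPath = path.join(dir, "demo.sock");
  try {
    // Stand-in for a socket file left behind by an earlier run.
    fs.writeFileSync(socketPath, "");
    return await tryBind(socketPath);
  } finally {
    fs.rmSync(dir, { recursive: true, force: true });
  }
}
```

On Linux the bind is refused with EADDRINUSE because the path already exists, even though no process is listening on it. That matches the "Address in use (os error 98)" message reported in this issue.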

Meanwhile, if you want to solve it yourself or help reach a conclusion: if my hunch is right, each of the following steps should solve the issue.
Can you please try adding client.close() at the end of the program?
Or run the container with --rm.
Or end the Node program with process.exit(0).
Preferably try them separately so I can be sure, or, if they don't help, get a better direction.

We should think about this scenario anyway and find a better way to handle it, but these steps would serve as a temporary solution and help with triage.
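One way to wire the client.close() suggestion into a long-running container is to close the client when a termination signal arrives (docker stop sends SIGTERM first). This is only a sketch: closeOnShutdown is a hypothetical helper, client is any object with a close() method as named in this thread, and whether closing the client actually removes the socket file is exactly what is being tested here.

```javascript
// Register handlers that close the client before the process exits.
// `client` is any object exposing close(); `exit` is injectable only so
// the sketch can be exercised without terminating the process.
function closeOnShutdown(client, exit = process.exit) {
  const shutdown = () => {
    client.close();
    exit(0);
  };
  for (const signal of ["SIGTERM", "SIGINT"]) {
    process.once(signal, shutdown);
  }
  return shutdown;
}
```

In app.js this would be called once, right after GlideClient.createClient(...) resolves, as closeOnShutdown(client).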

@Yury-Fridlyand Yury-Fridlyand mentioned this issue Jan 22, 2025
6 tasks
@davefojtik
Author

Sure. I want to use this client in the future, so I am determined to help you find a solution. Thank you for your time devoted to this problem.

Let's see:

  • client.close(); doesn't change the behaviour
  • removing the node container does connect it successfully on the next run
  • process.exit(0); doesn't change the behaviour

But based on your info, I troubleshooted a bit more and discovered a "glide-socket-1" file being stored in /tmp. Adding this entry point cleanup script into the Dockerfile solves it, and now the container connects successfully on each run:

ENTRYPOINT ["sh", "-c", "rm -f /tmp/glide-socket-1 && node app.js"]

So, I guess the solution could be the client removing this file on exit or when a user calls the client.close(). Perhaps also randomizing the name since I saw an issue with this error when using multiple instances. Or is there a reason for it to persist?
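The same cleanup as the ENTRYPOINT workaround could, in principle, be done from Node before the client is created. The sketch below is an assumption-laden equivalent: removeStaleSocket is a hypothetical helper, and the default path is simply the one observed in this thread, which may differ in other setups or versions.

```javascript
const fs = require("node:fs");

// Path observed in this thread; treat it as an assumption elsewhere.
const GLIDE_SOCKET_PATH = "/tmp/glide-socket-1";

// Remove a leftover socket file, if any.
// Returns true if something existed at the path before removal.
function removeStaleSocket(socketPath = GLIDE_SOCKET_PATH) {
  const existed = fs.existsSync(socketPath);
  fs.rmSync(socketPath, { force: true }); // force: a missing file is not an error
  return existed;
}
```

It would have to run before GlideClient.createClient(...). Like the shell version, it is only safe when no other live process in the same container is still using that socket.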

@davefojtik davefojtik changed the title from "Node/Docker: Package not working" to "Node/Docker: failed to bind listening socket" Jan 22, 2025
@avifenesh
Collaborator

avifenesh commented Jan 22, 2025

Thanks for the help, appreciated! @davefojtik
Interesting.
The thing is that the client has no problem running time after time on the machines I work with, which are Ubuntu aarch, Ubuntu x86, Mac M2, Mac Intel, Amazon Linux and Linux Mint (which is Ubuntu-based).
I work with Alpine and slim images as well, but I always clean them up afterwards.
I have a slim Docker container for a project that stays up, receives code, and executes it; each run is separate. No issue there either.
There's no problem with running multiple clients; that was on Mac only, and we resolved it.
For some reason it fails here, so it is an interesting case. What makes it special?

I think the solution is to use abstract UDS addresses rather than socket files. Generally, it's a better approach, with or without this issue.
I need to check the implications.
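For reference, this is roughly what an abstract socket looks like from Node, whose net documentation describes Linux abstract sockets as paths beginning with a NUL byte. It is a sketch of the concept only, not of how the Rust core would implement it; listenAbstract and the socket name are hypothetical, and abstract sockets are Linux-only.

```javascript
const net = require("node:net");

// Listen on a Linux abstract socket: the leading "\0" means the address
// lives in the kernel's abstract namespace, with no file on disk.
function listenAbstract(name) {
  return new Promise((resolve, reject) => {
    const server = net.createServer();
    server.once("error", reject);
    server.listen("\0" + name, () => resolve(server));
  });
}
```

Because nothing is written to the file system, there is no file to go stale: the address disappears once every reference to the socket is closed, including when the process is killed.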

I will also check whether something similar happens with other types of Docker images or older versions of Alpine.
When Alpine released 3.20, it had an issue with Python that created problems with Node, and we needed to downgrade to 3.19 for a while. But that affected Node in general and was solved a week or two later. Maybe there's another issue in the new 3.21; I'll give it a try.

But just so I understand the urgency, since there's a workaround, it is not a blocker for you, right?

@davefojtik
Author

davefojtik commented Jan 22, 2025

Yes, I see it's a problem with Docker and the specific environment. With the multiple clients, I meant this issue, which still seems open and planned for 1.4.

Currently, it's not a blocker for me. With this workaround, I can work with the client completely fine, which wouldn't be possible without the information about the UDS. Thank you very much for that. I can now focus back on the project, so this issue is basically solved and can be closed, but I'll leave that up to you and I am happy to provide more information or testing if needed.

@avifenesh
Collaborator

@davefojtik
Ah, you meant that case. Yes, that is also a unique issue: they use many packages on the same machine.
We designed it so that the Rust core is not created again and again if you use many clients, so we use the same file.
The Rust core is a multiplexer and works multithreaded, so there is no need to recreate the core process.
You might want more than one client for some reason, e.g. if you are using blocking commands, but creating another core has no value.

In their case, they have many separate packages that run unaware of each other as different programs, and each one tries to create the core.

We didn't think about this case, but since they have a workaround for now, we are planning it for 1.4 in our priorities.

I will leave it open since even though you have it working, I still want to have it solved completely :)

Thanks for your help, and thanks for reaching out! It is very helpful.
If you need anything else, don't hesitate, and if you have any feedback at a later point, good or bad, please share!

@avifenesh avifenesh removed the Users Pain An issue known to cause users pain, generaly open by the user. label Jan 22, 2025
@avifenesh avifenesh added this to the 1.4 milestone Jan 22, 2025
@ikolomi ikolomi assigned ikolomi and unassigned ikolomi Jan 23, 2025
@eifrah-aws
Contributor

@avifenesh a better solution would be to unlink it in the code before we attempt to bind it (this is what apps usually do), as we can't rely on a graceful exit of the process.

Adding the below line:

let _ = std::fs::remove_file(&socket_path_cloned);

Just before this line:

https://github.com/valkey-io/valkey-glide/blob/main/glide-core/src/socket_listener.rs#L865

should fix the issue
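The unlink-before-bind pattern suggested above can be sketched in JavaScript with Node's net module. This is an illustration of the pattern, not the Rust core code; listenAfterUnlink is a hypothetical helper.

```javascript
const fs = require("node:fs");
const net = require("node:net");

// Unlink whatever is at socketPath, then bind a listener there.
// Resolves to the listening server.
function listenAfterUnlink(socketPath) {
  return new Promise((resolve, reject) => {
    // force: true makes a missing file a no-op instead of an error.
    fs.rmSync(socketPath, { force: true });
    const server = net.createServer();
    server.once("error", reject);
    server.listen(socketPath, () => resolve(server));
  });
}
```

One design consideration: an unconditional unlink also removes a socket file that another live process is still listening on, which matters if several programs are meant to share one socket.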

@eifrah-aws eifrah-aws added Rust core redis-rs/glide-core matter and removed node Node.js wrapper labels Jan 23, 2025
Development

No branches or pull requests

4 participants