DNS resolution for sysrepo backed by MongoDB on Docker #15

Closed
roc-ops opened this issue Nov 20, 2024 · 12 comments

@roc-ops

roc-ops commented Nov 20, 2024

I am trying to set up rousette with docker-compose, with sysrepo backed by a MongoDB server in a separate container, but rousette seems to have trouble with DNS resolution on Docker.

When starting rousette:
(.venv) root@ee22530c790b:/build/rousette/yang# rousette
[2024-11-20 22:39:14.591] [rousette] [debug] NACM config validation: no rule-list entries
[2024-11-20 22:39:14.592] [rousette] [info] NACM config validation: Anonymous user access disabled
[2024-11-20 22:39:14.594] [rousette] [warning] Telemetry disabled. No CzechLight YANG modules found.
terminate called after throwing an instance of 'std::runtime_error'
what(): Server error: Host not found (authoritative)
Aborted (core dumped)

If I change my DNS server to 8.8.8.8 (or any public resolver):
(.venv) root@ee22530c790b:/build/rousette/yang# rousette
2024/11/20 22:40:51.0414: [12045]: DEBUG: monitor: [mongo:27017] command or network error occurred: Failed to resolve mongo

mongo is the name of my MongoDB service in my docker compose file.
It is reachable via ICMP:
(.venv) root@ee22530c790b:/build/rousette/yang# ping mongo
PING mongo (172.28.0.2) 56(84) bytes of data.
64 bytes from sysrepo-mongo.sysrepo-mongo_mongo (172.28.0.2): icmp_seq=1 ttl=64 time=0.150 ms

However, nslookup shows that Docker's DNS is giving a non-authoritative answer:
(.venv) root@ee22530c790b:/build/rousette/yang# nslookup mongo
Server: 127.0.0.11
Address: 127.0.0.11#53

Non-authoritative answer:
Name: mongo
Address: 172.28.0.2

Netopeer2 works on the same machine, so I know sysrepo is working properly.

It seems that the DNS resolution for the sysrepo mongo instance, in rousette, is only accepting authoritative DNS responses via Boost.

If I add mongo to /etc/hosts and set resolv.conf to 8.8.8.8, I can ping mongo, but I still get the same error when starting rousette:

(.venv) root@ee22530c790b:/build/rousette/yang# ping mongo
PING mongo (172.28.0.2) 56(84) bytes of data.
64 bytes from mongo (172.28.0.2): icmp_seq=1 ttl=64 time=0.156 ms
64 bytes from mongo (172.28.0.2): icmp_seq=2 ttl=64 time=0.114 ms
^C
--- mongo ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1013ms
rtt min/avg/max/mdev = 0.114/0.135/0.156/0.021 ms
(.venv) root@ee22530c790b:/build/rousette/yang# nslookup mongo
Server: 8.8.8.8
Address: 8.8.8.8#53

** server can't find mongo: NXDOMAIN

(.venv) root@ee22530c790b:/build/rousette/yang# rousette
[2024-11-20 22:48:02.452] [rousette] [debug] NACM config validation: no rule-list entries
[2024-11-20 22:48:02.452] [rousette] [info] NACM config validation: Anonymous user access disabled
[2024-11-20 22:48:02.457] [rousette] [warning] Telemetry disabled. No CzechLight YANG modules found.
terminate called after throwing an instance of 'std::runtime_error'
what(): Server error: Host not found (authoritative)
Aborted (core dumped)
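
A note on the error text: to my understanding, "Host not found (authoritative)" is the fixed message Boost.Asio attaches to its host_not_found resolver error, which is what a getaddrinfo() failure of the EAI_NONAME kind is reported as. On its own it does not show that a non-authoritative DNS answer was rejected. The sketch below, assuming a POSIX system, prints the raw getaddrinfo() status for a name so the lookup can be checked independently of rousette. The function names lookup_status and lookup_message are hypothetical and not part of rousette, sysrepo, or Boost.

```cpp
#include <netdb.h>
#include <sys/socket.h>
#include <string>

// Hypothetical helper: returns the raw getaddrinfo() status for `host`.
// 0 means success, anything else is an EAI_* code.
// `flags` is passed through as ai_flags (e.g. AI_NUMERICHOST, AI_ADDRCONFIG).
int lookup_status(const std::string& host, int flags)
{
    addrinfo hints{};
    hints.ai_family = AF_UNSPEC;     // accept both IPv4 and IPv6 results
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = flags;

    addrinfo* result = nullptr;
    const int rc = getaddrinfo(host.c_str(), nullptr, &hints, &result);
    if (rc == 0) {
        freeaddrinfo(result);        // only valid to free on success
    }
    return rc;
}

// Hypothetical helper: the same status as text, using the system's message table.
std::string lookup_message(const std::string& host, int flags)
{
    const int rc = lookup_status(host, flags);
    return rc == 0 ? "resolved" : gai_strerror(rc);
}
```

Calling lookup_message("mongo", 0) inside the container shows what the C library resolver returns for the service name; calling it with "::1" and AI_ADDRCONFIG may be worth comparing on a host where IPv6 is disabled.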

@jktjkt
Contributor

jktjkt commented Nov 20, 2024

Could you please attach a full backtrace when that exception is thrown?

@roc-ops
Author

roc-ops commented Nov 20, 2024

I'm not too familiar with generating backtraces; I used the instructions from here: https://wiki.ubuntu.com/Backtrace
If you need something different, let me know.
gdb-rousette.txt

@jktjkt
Contributor

jktjkt commented Nov 21, 2024

Thanks. Unfortunately, the backtrace only says that it's "somewhere from the constructor". There's a lot going on in there, and I'm missing some more detailed info. There's nothing directly in rousette which would attempt to resolve hostnames, but there are plenty of bug reports around the web which mention Docker's network setup, problems with DNS resolution, and this very same error message. Also, the SW stack that we're using, which is based on Boost.Asio, might be relevant here.

> The Netopeer2 works on same machine, so I know sysrepo is working properly

We don't know that yet. Rousette is creating a different set of internal subscriptions. It is "possible" that some of these subscriptions refer to a module whose datastore uses the MongoDB plugin, which attempts to resolve stuff, which breaks under your specific network setup.

Can you please ensure that you've built rousette with -ggdb, run it under gdb interactively (e.g., gdb /path/to/rousette), set a breakpoint (via break __cxa_throw when in GDB), then run, then backtrace (or thread apply all backtrace)?
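
For readers unfamiliar with this failure mode: the "terminate called after throwing an instance of 'std::runtime_error'" line appears when an exception leaves main() without being caught, and the what() text alone does not say which call threw it. A breakpoint on __cxa_throw stops at every throw site instead. The following is only a minimal sketch of the pattern, not rousette's code; FakeServer and try_start are hypothetical names invented for illustration.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an object whose constructor can fail during setup.
struct FakeServer {
    explicit FakeServer(const std::string& address)
    {
        if (address.empty()) {
            throw std::runtime_error("Server error: no listen address");
        }
    }
};

// Constructs the object and returns the exception text instead of letting the
// exception escape. An exception that escapes main() ends in std::terminate(),
// which is what produces the "Aborted (core dumped)" line.
std::string try_start(const std::string& address)
{
    try {
        FakeServer server{address};
        (void)server; // silence unused-variable warnings in this sketch
        return "started";
    } catch (const std::exception& e) {
        return e.what();
    }
}
```

Catching the exception like this reports the same message as the abort does, which is why the debugger breakpoint, rather than the message, is what identifies the throwing call.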

@roc-ops
Author

roc-ops commented Nov 21, 2024

What I was implying with the Netopeer comment was that sysrepo and netopeer2 both seem to be connecting to the datastores backed by MongoDB in another container (running and operational, in my case), while startup uses the non-MongoDB (JSON) backend.

The -ggdb flag did not work with cmake. After a little googling I found this:
cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo ..
which is how I built it.

gdb-rousette2.txt
Here is the backtrace, both the full one and the one requested.

Since the error seems to occur in sysrepo-cpp, I also enabled debug symbols for that package and ran another gdb backtrace; I'm not sure whether that helps.
gdb-rousette3.txt

@jktjkt
Contributor

jktjkt commented Nov 21, 2024

Thanks, and just confirming that the CMake flags are correct; that should get you going.

The error message that's shown now is a different one: you're getting an SR_ERR_OPERATION_FAILED when establishing the initial connection (or starting a sysrepo session; the C++ bindings use a wrong error message). Are you sure your environment has not changed? If you run it without gdb (but with the exact same build, etc.), what error message do you get now?

@roc-ops
Author

roc-ops commented Nov 21, 2024

gdb-rousette4.txt
Sorry, when I restarted Docker, some preliminary commands did not get run. Now it is the same error again.

@jktjkt
Contributor

jktjkt commented Nov 21, 2024

Sorry for yet another round of instructions. There are some internal exceptions which are usually ignored. Do something like ignore 1 4, or just continue four times when the exceptions are thrown. The first four exceptions are "harmless" and are handled immediately.

@roc-ops
Author

roc-ops commented Nov 21, 2024

gdb-rousette5.txt
OK, here you go.

@jktjkt
Contributor

jktjkt commented Nov 21, 2024

That was continue entered five times. You need four, I'm afraid.

Also, consider picking this patch and running rousette with --sysrepo-log-level=3 for extra debug messages. If this is from a sysrepo plugin (which I suspect it is), this might help shed some light on the specific details.

@roc-ops
Author

roc-ops commented Nov 21, 2024

Here is the output with four continues before the patch:
gdb-rousette6.txt
and with four continues after the patch:
gdb-rousette7.txt

jktjkt added a commit to sysrepo/sysrepo-cpp that referenced this issue Nov 25, 2024
Thanks to Jason Patterson for sending a bugreport; its investigation led
to discovering this bug.

See-also: CESNET/rousette#15
Change-Id: I082987d60380f860e08f26b436d0f9db4a231bec
@jktjkt
Contributor

jktjkt commented Nov 25, 2024

We looked into this with @peckato1, and we were able to reproduce this even without MongoDB when the IPv6 network stack was completely disabled. You mentioned Docker and, unfortunately, it seems that they still disable IPv6 by default. Could you please check whether the daemon starts up properly when you enable IPv6? You don't need any particular settings; just a ::1/128 on the lo interface is sufficient.

Alternatively, you might try patching the server like this:

diff --git a/src/restconf/main.cpp b/src/restconf/main.cpp
index 0c31f08..b8c6e12 100644
--- a/src/restconf/main.cpp
+++ b/src/restconf/main.cpp
@@ -160,7 +160,7 @@ int main(int argc, char* argv [])
     }
 
     auto conn = sysrepo::Connection{};
-    auto server = rousette::restconf::Server{conn, "::1", "10080", timeout};
+    auto server = rousette::restconf::Server{conn, "127.0.0.1", "10080", timeout};
     signal(SIGTERM, [](int) {});
     signal(SIGINT, [](int) {});
     pause();

Please be advised that we do require IPv6; the service is designed to run behind a reverse proxy, and we might rely on IPv6-features for a proper, secure TLS setup in future.

For those reading this bugreport in future: the suggestion for hardcoding 127.0.0.1 is meant only as a temporary step during debugging to verify the root cause of this particular issue.
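
To check the IPv6 hypothesis independently of rousette, a small probe along these lines might help. It is a sketch assuming a POSIX system: it only tests whether a TCP socket can be bound to the loopback address of a given family, and it does not reproduce rousette's own startup path. The function name can_bind_loopback is hypothetical.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical probe: tries to bind a TCP socket to the loopback address of the
// given family (AF_INET -> 127.0.0.1, AF_INET6 -> ::1) on an ephemeral port.
// Returns true if the bind succeeds, false otherwise (including other families).
bool can_bind_loopback(int family)
{
    if (family != AF_INET && family != AF_INET6) {
        return false;
    }
    const int fd = socket(family, SOCK_STREAM, 0);
    if (fd < 0) {
        return false; // e.g. the address family is not supported at all
    }

    bool ok = false;
    if (family == AF_INET6) {
        sockaddr_in6 addr{};
        addr.sin6_family = AF_INET6;
        addr.sin6_addr = in6addr_loopback; // ::1
        addr.sin6_port = 0;                // let the kernel choose a port
        ok = bind(fd, reinterpret_cast<const sockaddr*>(&addr), sizeof(addr)) == 0;
    } else {
        sockaddr_in addr{};
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK); // 127.0.0.1
        addr.sin_port = 0;
        ok = bind(fd, reinterpret_cast<const sockaddr*>(&addr), sizeof(addr)) == 0;
    }
    close(fd);
    return ok;
}
```

If can_bind_loopback(AF_INET6) returns false inside the container while can_bind_loopback(AF_INET) returns true, that would be consistent with ::1 being unavailable there.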

@jktjkt
Contributor

jktjkt commented Dec 18, 2024

No reply, closing.

@jktjkt closed this as not planned (won't fix, can't repro, duplicate, stale) on Dec 18, 2024