Skip to content

Conversation

@m-horky
Copy link
Contributor

@m-horky m-horky commented Jan 6, 2023

When IPv6 is disabled, this code crashed Python with segfault, instead of rising an exception. This patch iterates over interfaces found by generic 'getaddrinfo()' call instead of querying v4 and v6 individually.

@m-horky m-horky requested review from jirihnidek and ptoscano January 6, 2023 12:46
@cnsnyder cnsnyder requested a review from a team January 6, 2023 12:46
@m-horky m-horky marked this pull request as ready for review January 19, 2023 16:25
Copy link
Contributor

@jirihnidek jirihnidek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. I miss some unit tests for this case and instructions for disabling/enabling IPv6 in the commit message or comment of PR. Something like:

# Disable IPv6
sudo sysctl -w net.ipv6.conf.all.disable_ipv6=1
sudo sysctl -w net.ipv6.conf.default.disable_ipv6=1
# Enable IPv6 again
sudo sysctl -w net.ipv6.conf.all.disable_ipv6=0
sudo sysctl -w net.ipv6.conf.default.disable_ipv6=0

And I have some other requests. More details in comments.

if ipv6_addresses:
net_info["network.ipv6_address"] = ", ".join(ipv6_addresses)
else:
log.debug(f"Error getting IPv6 addresses of {hostname} using socket inspection.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not scare users with any message containing word: "Error"

net_info["network.ipv6_address"] = ", ".join(ipv6_addresses)
else:
log.debug(f"Error getting IPv6 addresses of {hostname} using socket inspection.")
net_info["network.ipv6_address"] = ", ".join(self._get_ipv6_addr_list())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When IPv6 is completely disabled, then the list of addresses still contains ::1 address, which is not correct. When IPv6 is disabled, then there should not be any IPv6 address in the list.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't reproduce this. When I disable IPv6, no v6 addresses are listed, on both bare metal and in container.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is strange. I can see: network.ipv6_address: ::1 in system facts, because of this:

https://github.com/candlepin/subscription-manager/blob/main/src/rhsmlib/facts/hwprobe.py#L747

It is strange, because loopback interface contains only IPv4 addresses:

net.interface.lo.ipv4_address: 127.0.0.1
net.interface.lo.ipv4_address_list: 127.0.0.1
net.interface.lo.ipv4_broadcast: Unknown
net.interface.lo.ipv4_broadcast_list: Unknown
net.interface.lo.ipv4_netmask: 8
net.interface.lo.ipv4_netmask_list: 8

It was probably me, who has implemented that, but I think that it is wrong now, because we should not fabricate any non-existent facts in the code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could fix it in other PR. It is only little bit related to this case. When I was talking about unit testing, then I had this network.ipv6_address: ::1 in my mind.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I discovered it as well, the two paths could produce different facts (one with loopback, one without). I'd like to fix this later, either in following PR or in the upcoming re-implementation using ip.

When IPv6 is disabled, this code crashed Python with segfault, instead
of rising an exception. This patch iterates over interfaces found by
generic 'getaddrinfo()' call instead of querying v4 and v6 individually.

To test this, you can disable IPv6 by calling

 # sysctl -w net.ipv6.conf.all.disable_ipv6=1
 # sysctl -w net.ipv6.conf.default.disable_ipv6=1

(and after testing, enable it by passing 0s instead of 1s), or, when
using containers, add
 
 --sysctl net.ipv6.conf.all.disable_ipv6=1

to podman/docker arguments.
@m-horky
Copy link
Contributor Author

m-horky commented Feb 2, 2023

I'm not sure how to unittest this. This PR targets some weird not-easily-reproducible issues we had that were causing Python to crash instead of raising OSError; you can't mock that. This is rather a workaround: proper solution is to switch all of this for ip based collector which is being created already. The behavior stayed exactly the same as before, the only thing that changed is when the addresses are collected. Let me know if you have some specific unit test in mind that would help.

@m-horky m-horky force-pushed the mhorky/ipv6-address-collection branch from 2e5e5f9 to 112880f Compare February 2, 2023 15:10
@github-actions
Copy link

github-actions bot commented Feb 2, 2023

Coverage

Coverage (computed on Fedora latest) •
FileStmtsMissCoverMissing
rhsmlib/facts
   hwprobe.py54913575%39–41, 52, 94, 205–207, 231–232, 248–249, 271, 276, 281–283, 285–288, 290, 335–339, 341–343, 346–349, 351–353, 380, 397, 416, 418, 450, 476, 594, 612–614, 623, 631–634, 644, 647, 649–650, 652, 659, 663–665, 671–673, 677–679, 690–691, 695, 724–732, 745–746, 772–774, 784, 793, 802–803, 806, 811–812, 845, 911, 917–919, 921–923, 932–933, 951, 953–954, 956, 958, 960, 962–965, 969–971, 973–974, 977, 987–991, 993–994, 996–997, 999–1005, 1007–1008
TOTAL17899434075% 

Tests Skipped Failures Errors Time
2620 6 💤 0 ❌ 0 🔥 54.653s ⏱️

@ptoscano
Copy link
Contributor

ptoscano commented Feb 6, 2023

TL;DR: I don't think we should include this patch.

I did some testing of this patch, and I noticed the log message Could not obtain IPv6 addresses using socket inspection. Using ethtool instead., whereas its "current equivalent" (Error during resolving IPv6 address of hostname: ...) is not shown. Digging a bit more, I found why, and it's due to how getaddrinfo() works. The initial getaddrinfo() (the one used to find out the FQDN) is called with socket.AF_UNSPEC as the family type; according to the documentation [1]:

The value AF_UNSPEC indicates that getaddrinfo() should return socket addresses for any address family (either IPv4 or IPv6, for example) that can be used with node and service.

The "either ... or" part is the key here: on a system that has both IPv4 and IPv6 set up, IPv4 is found as usable, and thus all the results of getaddrinfo() are only socket.AF_INET results. This means that, in practically all the setups, all the IPv4 addresses will be returned by the inital getaddrinfo(), and the ethtool "fallback" will be used for getting the IPv6 addresses (if available).

Hence, I don't think this change makes a lot of difference in terms of "code flow". The only problem here is Python 3.9 crashing, and I think it was fixed in stable branches of Python 3.10+ with python/cpython#101252 -- I will need to check that out. Also, I'm surprised nobody actually had/reported issues when registering a RHEL 9 system without IPv6...

[1] https://man7.org/linux/man-pages/man3/getaddrinfo.3.html

@m-horky
Copy link
Contributor Author

m-horky commented Feb 6, 2023

As a result of the problems, I'm closing this PR in favor of future re-implementation.

@m-horky m-horky closed this Feb 6, 2023
@m-horky m-horky deleted the mhorky/ipv6-address-collection branch February 6, 2023 16:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants