[BUG] shared_database2 spec test has inconsistent results in actions #2078

Closed
svteb opened this issue Jun 19, 2024 · 0 comments
Labels
bug Something isn't working

svteb commented Jun 19, 2024

Describe the bug
On the new GitHub Actions runners, the shared_database2 spec test gives inconsistent results: the same workflow run contains both a passing and a failing job.

Passing run from GitHub Actions:
https://github.com/cnti-testcatalog/testsuite/actions/runs/9439225077/job/26011260848

Failing run from GitHub Actions:
https://github.com/cnti-testcatalog/testsuite/actions/runs/9439225077/job/26009846416

The issue stems from the Netstat::K8s.get_multiple_pods_connected_to_mariadb_violators function, which should return IPs of two WordPress CNFs connected to the shared MariaDB. By searching for violators: in the logs, it can be seen that sometimes one of the WordPress pod IPs is not returned.

Digging into the code, the function self.get_pod_network_info_from_node_via_container_id in lib/k8s_netstat is responsible for detecting the database connections, through this block of code:

# get multiple call for a larger sample
parsed_netstat = (1..10).map {
    sleep 10
    netstat = ClusterTools.exec_by_node("nsenter -t #{pid} -n netstat -n", node_name)
    Log.info { "Container Netstat: #{netstat}" }
    Netstat.parse(netstat["output"])
}.flatten.compact

This code is hit-or-miss. It runs netstat 10 times, 10 seconds apart, and can only report connections that appear in the netstat output at the moment a sample is taken. Nothing guarantees that every database connection shows up in at least one of the 10 samples, and the failing Actions run shows that sometimes one does not.
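To illustrate the sampling behavior, here is a minimal sketch of a loop that accumulates distinct addresses across samples and stops as soon as enough of them have been seen. The helper name poll_distinct, its parameters, and the fake sample data are assumptions made up for this example; they are not part of the testsuite, and the block passed in stands in for the real netstat call.

```crystal
# Hypothetical helper, not part of the testsuite: accumulates distinct
# addresses across samples and stops once `min_distinct` have been seen
# or `max_attempts` samples have been taken.
def poll_distinct(min_distinct : Int32, max_attempts : Int32,
                  interval : Time::Span, &sample : -> Array(String)) : Set(String)
  seen = Set(String).new
  max_attempts.times do |attempt|
    seen.concat(sample.call)
    break if seen.size >= min_distinct
    # Skip the pause after the final attempt.
    sleep interval if attempt < max_attempts - 1
  end
  seen
end

# Stand-in for the netstat call: each element is one fake sample
# (made-up addresses, for illustration only).
samples = [["10.0.0.5"], [] of String, ["10.0.0.5", "10.0.0.7"], ["10.0.0.9"]]
calls = 0

found = poll_distinct(2, 10, 0.seconds) do
  result = samples[calls]? || [] of String
  calls += 1
  result
end

puts found.to_a.sort.join(", ")
puts calls
```

With the fake data above, the loop stops on the third sample, once both addresses have been observed. A loop like this does not make detection certain; it only avoids waiting for the remaining attempts, which leaves room for a higher attempt count or a shorter interval without making every run slower.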

To Reproduce

crystal spec --tag shared_database2

Expected behavior

There should be a more consistent approach to detect that a database connection has been made.

Additional context

Possible solutions:

  1. Increase the netstat attempts from 10 to X.
  2. Do a complete overhaul of the detection code by utilizing MariaDB's connection logging (general query log or something else).

I think that for the time being, the quick hack of increasing the number of netstat attempts could be enough to make the GitHub Actions runs more reliable.
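As a rough way to reason about option 1, assume (purely for illustration) that each netstat sample independently catches a given connection with probability $p$. Real samples are probably not independent, so this is only a back-of-the-envelope model. The chance of missing that connection in all $n$ samples is then:

$$
P(\text{miss}) = (1 - p)^n
$$

With an illustrative $p = 0.3$, this gives $0.7^{10} \approx 0.028$ for $n = 10$ and $0.7^{20} \approx 0.0008$ for $n = 20$. Under this model, more attempts make a miss less likely but never impossible, which is why option 2 is the more robust fix.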
