[Bug] MultiIndex measurement not reloaded on some cases - use of closed network connection #524

Closed
sbengo opened this issue Nov 15, 2022 · 0 comments · Fixed by #525

sbengo commented Nov 15, 2022

After the gather refactor (the gather loop is now run and managed by measurements instead of devices), the load index process (every device FilterFreq or measurement override) tries to refresh the current indexes.

The following log entry appeared on some devices:

mydevice1.log:time="2022-04-06 10:47:01" level=error msg="Error while trying to reload Indexed Labels for baseOid .1.3.6.1.4.1.9.9.166.1.1.1.1.4 : ERROR: set udp XXX.XXX.XXX.XXX:41931: use of closed network connection" device=YY.YY.YY.YY measurement=cisco_qos_cbQosCMStats
mydevice1.log:time="2022-04-06 10:47:01" level=error msg="LOADINDEXEDLABELS - SNMP WALK error: set udp XXX.XXX.XXX.XXX:41931: use of closed network connection" device=YY.YY.YY.YY measurement=cisco_qos_cbQosCMStats

All the following entries refer to the same origin. After a deep analysis to find the root cause, it seems to be related to:

  • MultiIndex Measurements
  • Device totally unavailable (System OIDs) during some intervals prior to the first error message
sbengo pushed a commit that referenced this issue Nov 15, 2022
In the measurement gather loop, the measurement is marked as non-connected
if it becomes unresponsive or its metrics cannot be retrieved because of
a connectivity problem.

When that happens, the gather loop tries to reconnect by querying the
complete sysinfo, and it resets the snmpclient.

This PR tries to fix how the snmpclient is inherited from the
measurement, as it was using a pointer to a copy instead of the original
snmpclient.

In this case, any change to the measurement snmpclient did not propagate
to the multiindex measurement snmpclient, so an already closed
connection was being used. The load index was not performed, and all
the indexes remained as they were the last time they were gathered
(wrong if the process failed at some point).

fix #524
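
The pointer-to-a-copy problem described in the commit message can be reproduced in isolation. The following is a minimal sketch, not the project's actual code: the types `conn`, `snmpClient`, and `measurement`, and the helpers `reconnect`, `childFromCopy`, and `childFromOriginal`, are all hypothetical names chosen for illustration. It assumes the client struct holds a pointer to its connection, so a copy keeps sharing the old connection until the original is reset.

```go
package main

import (
	"errors"
	"fmt"
)

// conn is a hypothetical stand-in for an underlying UDP connection.
type conn struct{ closed bool }

// snmpClient is a hypothetical stand-in for the SNMP client wrapper.
type snmpClient struct{ conn *conn }

// Walk fails when the client's connection is missing or closed.
func (c *snmpClient) Walk() error {
	if c.conn == nil || c.conn.closed {
		return errors.New("use of closed network connection")
	}
	return nil
}

// measurement is a hypothetical parent that owns the client.
type measurement struct{ client snmpClient }

// reconnect mimics a reset: close the old connection, open a new one.
func (m *measurement) reconnect() {
	if m.client.conn != nil {
		m.client.conn.closed = true
	}
	m.client.conn = &conn{}
}

// childFromCopy reproduces the bug: it returns a pointer to a copy of
// the client, which never sees later changes to the original.
func childFromCopy(m *measurement) *snmpClient {
	c := m.client // struct copy; c.conn still points at the old connection
	return &c
}

// childFromOriginal returns a pointer to the measurement's own client.
func childFromOriginal(m *measurement) *snmpClient {
	return &m.client
}

func main() {
	m := &measurement{client: snmpClient{conn: &conn{}}}
	stale := childFromCopy(m)
	shared := childFromOriginal(m)

	// The device was unreachable, so the gather loop resets the client.
	m.reconnect()

	fmt.Println("pointer to copy:    ", stale.Walk())
	fmt.Println("pointer to original:", shared.Walk())
}
```

After the reset, the client obtained through the copy still holds the closed connection and its walk fails, while the client obtained through the original pointer picks up the new connection. Sharing the pointer to the original is what lets a reconnect in the measurement reach every consumer of the client.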