Skip to content

[receiver/filelog] Suppress repeated permission denied errors#44350

Merged
andrzej-stencel merged 17 commits into
open-telemetry:mainfrom
khpeet:main
Jan 23, 2026
Merged

[receiver/filelog] Suppress repeated permission denied errors#44350
andrzej-stencel merged 17 commits into
open-telemetry:mainfrom
khpeet:main

Conversation

@khpeet
Copy link
Copy Markdown
Contributor

@khpeet khpeet commented Nov 18, 2025

Description

Added logic to suppress repeated error logs due to file permission errors within the filelog receiver.

  • Updated makeFingerprint to check for permission errors via os.isPermission
  • Log an error only on the first occurrence for a given file path and add it to an in-memory map, guarded by a mutex to avoid race conditions for concurrent read/writes.
  • Debug log only for subsequent read attempt (to avoid spam)
  • When the file becomes readable, info log a message and remove the path from the stored set

Link to tracking issue

Fixes #39491

Testing

New unit test verifying:

  • First unreadable-open produces exactly one Error-level log.
  • Second attempt does not produce another Error-level log.
  • When permissions are restored, an Info-level Previously unreadable... message appears.

NOTE: Test skips Windows due to unreliable os.Chmod behavior

Documentation

Added changelog yaml entry summarizing the issue and fix.

@khpeet khpeet requested review from a team and andrzej-stencel as code owners November 18, 2025 13:27
@linux-foundation-easycla
Copy link
Copy Markdown

linux-foundation-easycla Bot commented Nov 18, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

@github-actions github-actions Bot added the first-time contributor PRs made by new contributors label Nov 18, 2025
@github-actions
Copy link
Copy Markdown
Contributor

Welcome, contributor! Thank you for your contribution to opentelemetry-collector-contrib.

Important reminders:

A maintainer will review your pull request soon. Thank you for helping make OpenTelemetry better!

Comment thread pkg/stanza/fileconsumer/file.go Outdated
@khpeet khpeet requested a review from atoulme November 19, 2025 21:26
@khpeet
Copy link
Copy Markdown
Contributor Author

khpeet commented Nov 24, 2025

@andrzej-stencel @atoulme - gentle bump on this PR

Copy link
Copy Markdown
Contributor

@VihasMakwana VihasMakwana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need a mutex? AFAIK, calls to makeFingerprint are serialised. Could you confirm this?

Also, @andrzej-stencel is on sick leave until starting of December. It might take some time to get this merged.

@khpeet
Copy link
Copy Markdown
Contributor Author

khpeet commented Nov 25, 2025

Do we need a mutex? AFAIK, calls to makeFingerprint are serialised. Could you confirm this?

Also, @andrzej-stencel is on sick leave until starting of December. It might take some time to get this merged.

Hey @VihasMakwana - You are correct! Taking a second deeper look, I saw that only a single goroutine calls poll() and everything looks sequential. I've removed the mutex.

@github-actions
Copy link
Copy Markdown
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions
Copy link
Copy Markdown
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions Bot added the Stale label Jan 12, 2026
@khpeet
Copy link
Copy Markdown
Contributor Author

khpeet commented Jan 12, 2026

@andrzej-stencel - Happy New Year! Can we get this one approved/merged?

Copy link
Copy Markdown
Contributor

@iblancasa iblancasa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to add tests to:

  • Check the log messages not related to permissions -file not found, for instance- are still logged
  • (maybe) Test concurrent poll cycles accessing unreadable map

Comment thread pkg/stanza/fileconsumer/file.go
Comment thread pkg/stanza/fileconsumer/file.go Outdated
Comment thread pkg/stanza/fileconsumer/file.go Outdated
Comment thread pkg/stanza/fileconsumer/file.go Outdated
Comment thread pkg/stanza/fileconsumer/unreadable_test.go Outdated
@github-actions github-actions Bot removed the Stale label Jan 13, 2026
@khpeet
Copy link
Copy Markdown
Contributor Author

khpeet commented Jan 13, 2026

@iblancasa - Fixed most of the feedback (appreciate the review btw!). I also added another test to verify non-permission errors (i.e - file not found) are still logged + it confirms that those kind of errors are not tracked in the unreadable map.

For the concurrent access test, I don't think its needed since makeFingerprint calls are serialized and only a single goroutine calls poll() (everything looks to be sequential).

Copy link
Copy Markdown
Member

@andrzej-stencel andrzej-stencel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test introduced in this PR seems to be failing: https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/21181607334/job/61157176417?pr=44350

Can you take a look at this? And a couple other rather minor comments.

Comment thread pkg/stanza/fileconsumer/file.go Outdated
Comment thread .chloggen/issue-39491.yaml Outdated
Comment thread pkg/stanza/fileconsumer/file.go Outdated
@khpeet
Copy link
Copy Markdown
Contributor Author

khpeet commented Jan 22, 2026

The test introduced in this PR seems to be failing: https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/21181607334/job/61157176417?pr=44350

Can you take a look at this? And a couple other rather minor comments.

This was due to the upstream logging line being back from merging that to my fork. Updated and tests should work now

@andrzej-stencel andrzej-stencel merged commit 216e339 into open-telemetry:main Jan 23, 2026
191 checks passed
@otelbot
Copy link
Copy Markdown
Contributor

otelbot Bot commented Jan 23, 2026

Thank you for your contribution @khpeet! 🎉 We would like to hear from you about your experience contributing to OpenTelemetry by taking a few minutes to fill out this survey. If you are getting started contributing, you can also join the CNCF Slack channel #opentelemetry-new-contributors to ask for guidance and get help.

cjksplunk added a commit to cjksplunk/opentelemetry-collector-contrib that referenced this pull request Jan 23, 2026
* Adding the query and tracking of stored procedures to events

* Fixed missing process id

* Removing stored procedure type
Updating tests

* Added changelog entry

* Added changelog entry

* reverted changelog entry

* fix changelog entry

* updates to variable naming and usage

* updates to variable naming and usage

* updates to variable naming and usage

* ran make generate

* Moved QuerySampleCount and TopQueryCount ot of params and reference config obj directly

* Added issue to changelog

* Replaced values with vars

* Integrating stored procedure values for samples

* updates to test data

* Backout unintended merge

* updating change log

* Fix for inaccurate query count being reported in postgresql query metrics collection

* change log

* this fix is for sqlserver

* fill issue details

* go generate

* fix lint

* make gogci

* tweak to samples sql

* Revert checks around sending metrics for countInDb=1, since it's addressing a negligable edge case.

* Updates in prep for testing

* fix lint

* cleanup

* Fix invalid casting and update old test

* Updating topN query to use HAVING instead of external WHERE clause

* Query Tuning

* Adding '''WHERE st.text IS NOT NULL; ''' to topx and samples queries

* Backed out changes to samples query

* Adding documentation for default samples count

* Revert "Add stored procedure columns to events (#1)"

This reverts commit 75b17b2.

* Reapply "Add stored procedure columns to events (#1)"

This reverts commit 135c988.

* removing not null from samples again

* [chore][processor/resourcedetection] Add E2E tests for upcloud detector (open-telemetry#45584)

<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description

Add E2E tests for Upcloud detector

<!-- Issue number (e.g. open-telemetry#1234) or full URL to issue, if applicable. -->
#### Link to tracking issue
Relates to open-telemetry#24671

---------

Signed-off-by: Paulo Dias <paulodias.gm@gmail.com>

* [chore][processor/resourcedetection] Add E2E tests for vultr detector (open-telemetry#45583)

<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description

Add E2E tests for Vultr detector

<!-- Issue number (e.g. open-telemetry#1234) or full URL to issue, if applicable. -->
#### Link to tracking issue
Relates to open-telemetry#24671

Signed-off-by: Paulo Dias <paulodias.gm@gmail.com>

* [chore][processor/resourcedetection] Add E2E tests for ec2 detector (open-telemetry#45438)

<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description

Add E2E tests for EC2 detector

<!-- Issue number (e.g. open-telemetry#1234) or full URL to issue, if applicable. -->
#### Link to tracking issue
Relates to open-telemetry#24671

Signed-off-by: Paulo Dias <paulodias.gm@gmail.com>

* [chore][processor/resourcedetection] Add E2E tests for scaleway detector (open-telemetry#45590)

<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description

Add E2E tests for scaleway detector

<!-- Issue number (e.g. open-telemetry#1234) or full URL to issue, if applicable. -->
#### Link to tracking issue
Relates to open-telemetry#24671

---------

Signed-off-by: Paulo Dias <paulodias.gm@gmail.com>

* [chore][processor/resourcedetection] Add E2E tests for azure detector (open-telemetry#45569)

<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description

Add E2E tests for Azure detector

<!-- Issue number (e.g. open-telemetry#1234) or full URL to issue, if applicable. -->
#### Link to tracking issue
Relates to open-telemetry#24671

---------

Signed-off-by: Paulo Dias <paulodias.gm@gmail.com>

* Update Move functions in routing connector to not use pointer. (open-telemetry#45563)

<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description

<!-- Issue number (e.g. open-telemetry#1234) or full URL to issue, if applicable. -->
#### Link to tracking issue
Fixes

<!--Describe what testing was performed and which tests were added.-->
#### Testing

<!--Describe the documentation added.-->
#### Documentation

<!--Please delete paragraphs that you did not use before submitting.-->

* [receiver/filelog] Suppress repeated permission denied errors (open-telemetry#44350)

<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description
Added logic to suppress repeated error logs due to file permission
errors within the `filelog` receiver.

- Updated `makeFingerprint` to check for permission errors via
[os.isPermission](https://pkg.go.dev/os#IsPermission)
- Log an error only on the first occurrence for a given file path and
add it to an in-memory map, guarded by a mutex to avoid race conditions
for concurrent read/writes.
- Debug log only for subsequent read attempt (to avoid spam)
- When the file becomes readable, info log a message and remove the path
from the stored set

<!-- Issue number (e.g. open-telemetry#1234) or full URL to issue, if applicable. -->
#### Link to tracking issue
Fixes
open-telemetry#39491

<!--Describe what testing was performed and which tests were added.-->
#### Testing
New unit test verifying:
- First unreadable-open produces exactly one Error-level log.
- Second attempt does not produce another Error-level log.
- When permissions are restored, an Info-level `Previously
unreadable...` message appears.

NOTE: Test skips Windows due to unreliable os.Chmod behavior

<!--Describe the documentation added.-->
#### Documentation
Added changelog yaml entry summarizing the issue and fix.

<!--Please delete paragraphs that you did not use before submitting.-->

* [pkg/stanza/trim] Improve performance of Trailing and Leading (open-telemetry#45281)

#### Description

Improves the performance of `Trailing` and `Leading` functions. You can
see it in the benchmarks results here:

```
goos: linux
goarch: amd64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/trim
cpu: 11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz
                                    │ Previous code │               This PR                │
                                    │    sec/op     │    sec/op     vs base                │
TrimFuncs/leading/small_clean-16      20.345n ± 23%   2.924n ± 18%  -85.63% (p=0.000 n=10)
TrimFuncs/leading/small_dirty-16       24.65n ± 30%   12.77n ± 10%  -48.22% (p=0.000 n=10)
TrimFuncs/leading/all_space-16         25.91n ± 13%   13.38n ±  8%  -48.35% (p=0.000 n=10)
TrimFuncs/leading/large_dirty-16      21.440n ± 15%   8.457n ±  7%  -60.56% (p=0.000 n=10)
TrimFuncs/trailing/small_clean-16     16.400n ±  8%   2.737n ±  3%  -83.31% (p=0.000 n=10)
TrimFuncs/trailing/small_dirty-16      18.88n ± 14%   10.37n ±  8%  -45.10% (p=0.000 n=10)
TrimFuncs/trailing/all_space-16        22.61n ±  5%   12.10n ±  7%  -46.50% (p=0.000 n=10)
TrimFuncs/trailing/large_dirty-16     19.085n ± 10%   7.271n ± 22%  -61.90% (p=0.000 n=10)
TrimFuncs/whitespace/small_clean-16   29.895n ±  4%   3.788n ±  3%  -87.33% (p=0.000 n=10)
TrimFuncs/whitespace/small_dirty-16    41.10n ± 11%   17.20n ±  3%  -58.16% (p=0.000 n=10)
TrimFuncs/whitespace/all_space-16      24.47n ±  7%   13.22n ± 17%  -45.97% (p=0.000 n=10)
TrimFuncs/whitespace/large_dirty-16    36.59n ± 15%   12.82n ± 15%  -64.95% (p=0.000 n=10)
TrimFuncs/nop/small_clean-16           2.071n ±  8%   1.937n ±  7%        ~ (p=0.052 n=10)
TrimFuncs/nop/small_dirty-16           2.048n ±  7%   1.939n ±  4%        ~ (p=0.063 n=10)
TrimFuncs/nop/all_space-16             2.346n ± 10%   2.022n ±  4%  -13.83% (p=0.000 n=10)
TrimFuncs/nop/large_dirty-16           2.355n ±  9%   2.175n ± 12%   -7.62% (p=0.023 n=10)
WithFunc/baseline_scanlines-16         215.8n ± 11%   219.5n ±  3%        ~ (p=0.755 n=10)
WithFunc/with_whitespace-16            237.3n ± 12%   237.9n ± 13%        ~ (p=0.912 n=10)
ToLength/under_limit-16                241.7n ± 12%   267.3n ± 17%  +10.57% (p=0.002 n=10)
ToLength/over_limit-16                 266.9n ± 12%   247.1n ± 20%        ~ (p=0.393 n=10)
geomean                                23.72n         12.33n        -48.00%

                                    │ Previous code │               This PR               │
                                    │     B/op      │    B/op     vs base                 │
TrimFuncs/leading/small_clean-16       0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
TrimFuncs/leading/small_dirty-16       0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
TrimFuncs/leading/all_space-16         0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
TrimFuncs/leading/large_dirty-16       0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
TrimFuncs/trailing/small_clean-16      0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
TrimFuncs/trailing/small_dirty-16      0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
TrimFuncs/trailing/all_space-16        0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
TrimFuncs/trailing/large_dirty-16      0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
TrimFuncs/whitespace/small_clean-16    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
TrimFuncs/whitespace/small_dirty-16    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
TrimFuncs/whitespace/all_space-16      0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
TrimFuncs/whitespace/large_dirty-16    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
TrimFuncs/nop/small_clean-16           0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
TrimFuncs/nop/small_dirty-16           0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
TrimFuncs/nop/all_space-16             0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
TrimFuncs/nop/large_dirty-16           0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
WithFunc/baseline_scanlines-16         0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
WithFunc/with_whitespace-16            0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
ToLength/under_limit-16                0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
ToLength/over_limit-16                 0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
geomean                                           ²               +0.00%                ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                                    │ Previous code │               This PR               │
                                    │   allocs/op   │ allocs/op   vs base                 │
TrimFuncs/leading/small_clean-16       0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
TrimFuncs/leading/small_dirty-16       0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
TrimFuncs/leading/all_space-16         0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
TrimFuncs/leading/large_dirty-16       0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
TrimFuncs/trailing/small_clean-16      0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
TrimFuncs/trailing/small_dirty-16      0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
TrimFuncs/trailing/all_space-16        0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
TrimFuncs/trailing/large_dirty-16      0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
TrimFuncs/whitespace/small_clean-16    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
TrimFuncs/whitespace/small_dirty-16    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
TrimFuncs/whitespace/all_space-16      0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
TrimFuncs/whitespace/large_dirty-16    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
TrimFuncs/nop/small_clean-16           0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
TrimFuncs/nop/small_dirty-16           0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
TrimFuncs/nop/all_space-16             0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
TrimFuncs/nop/large_dirty-16           0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
WithFunc/baseline_scanlines-16         0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
WithFunc/with_whitespace-16            0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
ToLength/under_limit-16                0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
ToLength/over_limit-16                 0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
geomean                                           ²               +0.00%                ²
¹ all samples are equal
² summaries must be >0 to compute geomean
```

* [ci] Ensure github token is available for github cli (open-telemetry#45602)

<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description

I enabled the usage of the `gh` CLI in
open-telemetry#45477,
but I forgot to ensure the variable `GH_TOKEN` is configured with the
proper value.

This will fix that issue, but the workflow should be triggered manually
after this merge.

<!-- Issue number (e.g. open-telemetry#1234) or full URL to issue, if applicable. -->
#### Link to tracking issue
Fixes open-telemetry#45600

Signed-off-by: Paulo Dias <paulodias.gm@gmail.com>

* Updating documentation

---------

Signed-off-by: Paulo Dias <paulodias.gm@gmail.com>
Co-authored-by: sreenathv <sreenathv@splunk.com>
Co-authored-by: Paulo Dias <44772900+paulojmdias@users.noreply.github.com>
Co-authored-by: Vengal rao <vengalraoguttha@users.noreply.github.com>
Co-authored-by: Keagan Peet <nobrac34@gmail.com>
Co-authored-by: Constança Manteigas <113898685+constanca-m@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Repeated and frequent "Failed to open file" on a file it can't read.

7 participants