Ubuntu-latest Job fails after 1 h and doesn't display logs. #2475

Closed
6 tasks
adamhass opened this issue Jan 18, 2021 · 3 comments
Assignees
Labels
Area: Rust investigate Collect additional information, like space on disk, other tool incompatibilities etc. OS: Ubuntu

Comments

adamhass commented Jan 18, 2021

Description
I work on Kompact, an open-source actor framework in Rust. We recently migrated from Travis to GitHub Actions. We are trying to produce a code-coverage report as part of the CI pipeline, so this is a new job, not one that was working and then suddenly started failing.

The code-coverage job fails after 1 hour. No logs are saved or viewable for the step on which it fails. You can view four similar failing jobs here and see what I'm talking about. We have tried to ensure that it's not failing due to faulty configuration, scripting, or set-up on our end.

I found a similar issue (#1491), and I suspect it is the same problem.

Area for Triage: Rust

Question, Bug, or Feature?: Bug

Virtual environments affected

  • Ubuntu 16.04
  • [✅] Ubuntu 18.04
  • Ubuntu 20.04
  • macOS 10.15
  • macOS 11.0
  • Windows Server 2016 R2
  • Windows Server 2019

Expected behavior
The job should either succeed, or fail and display logs.

Actual behavior
The job fails and no log is available for the failing step.

Repro steps
The Job config is viewable here.

The tests run during that job take a long time. They include networked integration tests in which many of the test cases open and close multiple sockets, and they are run multiple times against a feature matrix. So the issue may be related to resource usage, as the related issue suggests, but it is difficult for us to troubleshoot without access to the logs.

@al-cheb al-cheb added Area: Rust investigate Collect additional information, like space on disk, other tool incompatibilities etc. OS: Ubuntu and removed needs triage labels Jan 18, 2021
@miketimofeev miketimofeev self-assigned this Jan 19, 2021
@miketimofeev
Contributor

Hi @adamhass!
I've checked the provided failing jobs, and 2 of them actually contain some data:

  1. Use source-based code coverage Codecov #2: a lot of "ERRO slog-async: logger dropped messages due to channel overflow" messages
    https://pipelines.actions.githubusercontent.com/HIOnzKPM13Mp2mueOWpEoPDQaglt6dMP2qFrpGOn8g2OyHGIsi/_apis/pipelines/1/runs/17/signedlogcontent/3?urlExpires=2021-01-19T08%3A17%3A51.5954744Z&urlSigningMethod=HMACV1&urlSignature=yl4HUik8kFqjr6yYlr2bqUz12LO3NqMn5UNWByEJzg8%3D
  2. Use source-based code coverage Codecov #3: a pretty straightforward "No space left on device" error
    Could you please check how much disk space the tests usually use? Running out of space could be the reason for this behavior.

In the meantime, I'll try to find out what's wrong with the other 2 runs, which have no logs.
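
One way to answer the disk-space question is to print a short usage report from a CI step before and after the tests. The sketch below is only an illustration, not something taken from the Kompact workflow: the function names `dir_size_bytes` and `disk_report` are made up for this example, and it assumes the build output lives in Cargo's default `target` directory on the same filesystem as the working directory.

```python
import os
import shutil


def dir_size_bytes(root):
    """Sum the sizes of all regular files under root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if not os.path.islink(path):
                    total += os.path.getsize(path)
            except OSError:
                # Files can disappear while a build is running; skip them.
                pass
    return total


def disk_report(build_dir="target", mount="."):
    """Return a one-line summary of build-directory size and free disk space.

    build_dir and mount are assumptions for this sketch; adjust them to
    wherever the job actually writes its artifacts.
    """
    usage = shutil.disk_usage(mount)  # named tuple: total, used, free (bytes)
    gib = 1024 ** 3
    used_by_build = dir_size_bytes(build_dir) if os.path.isdir(build_dir) else 0
    return (
        f"{build_dir}: {used_by_build / gib:.2f} GiB, "
        f"free on disk: {usage.free / gib:.2f} GiB of {usage.total / gib:.2f} GiB"
    )


if __name__ == "__main__":
    print(disk_report())
```

Running this at a few points in the job (for example after the build and after each feature-matrix pass) should show whether free space trends toward zero before the step that loses its logs.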

@miketimofeev
Contributor

I managed to find the backend service logs for the runs without any data. The lack of logs is also related to disk space usage: Failed to upload diagnostics System.IO.IOException: No space left on device

@adamhass
Author

Thanks for the quick help. We can solve our problem with that information.
