Add log upload support for build target failed#587
Conversation
|
The question is how would we test this? Maybe, I can purposely break purposely in the build and see how it behaves. |
|
Initial test is passed. |
There was a problem hiding this comment.
Why create an empty directory and then upload it?
There was a problem hiding this comment.
What if it does not exist? Isn't it possible?
There was a problem hiding this comment.
If there aren't any files to upload, don't upload any files?
Generally, think through the process here and what the state should be at each step:
- Job starts. Do we set up some metadata here and create a folder in S3 with that metadata now? Did a prior job already do that and are we just writing into an already established location?
- Job installs requirements, initializes caches, etc.
- Job starts building. Build logs start to get produced. Build artifacts start to get produced.
- Job finishes building. Artifacts and logs are uploaded.
What should happen if the build fails? Should we upload partial artifacts? What about logs?
What should happen if the job is cancelled? Should we upload partial artifacts? What about logs?
Notice that with log (and artifact) streaming, the answers change.
There was a problem hiding this comment.
What should happen if the build fails? Should we upload partial artifacts? What about logs?
What should happen if the job is cancelled? Should we upload partial artifacts? What about logs?
- So if the build fails in the beginning, there would not be any log to upload - I assume the time even the
build/logsdoes not exist - If the build fails in the middle, there must be some log to upload so
build/logsexists and upload whatever it exist for logs. - If the job is canceled, in the middle of the building - we still should upload what we have as log.
There was a problem hiding this comment.
I am searching the case 3 if it is possible - but @ScottTodd what is your opinion?
There was a problem hiding this comment.
My understanding from my small research, always() does work for the user triggered cancelations If I understand correctly.
Causes the step to always execute, and returns true, even when canceled. The always expression is best used at the step level or on tasks that you expect to run even when a job is canceled. For example, you can use always to send logs even when a job is canceled.
There was a problem hiding this comment.
Tested the canceling case - worked fine and uploaded the logs: https://github.com/ROCm/TheRock/actions/runs/14985473348/job/42098511841
5452f72 to
1ec90cd
Compare
|
Tested the normal working case: |
marbre
left a comment
There was a problem hiding this comment.
There are some issues that need to be addressed before this can be merged. You might want to ask @ScottTodd for approval if you need another review today.
5c12c12 to
87b107d
Compare
|
@ScottTodd @marbre @marbre, currently, |
It does succeed on the main branch on push, see https://github.com/ROCm/TheRock/actions/runs/15031073217/job/42243505016. However, there is obviously a permission issue when manually dispatching the workflow. This should be fixed with #622 but shouldn't block to add Windows specific adjustments to this PR anyway. |
75f2fba to
5a16be1
Compare
|
Aside: I'm seeing lots of formatting commits pushed individually. If you locally set up pre-commit (or the specific formatters manually), that will let you automatically format your commits before you push. |
Yes, I will figure out automatically call black formatting. |
#622 helps a lot - making progress on Windows side. |
bf66621 to
c1bc0a1
Compare
6b64687 to
eacea5b
Compare
ScottTodd
left a comment
There was a problem hiding this comment.
Workflow changes LGTM. Just some Python style comments now.
|
When responding to review feedback, you can batch your commit pushes to avoid re-triggering the CI so many times. Each push starts several hours of build jobs. Pushes to cancel previous jobs, but CI time is not cheap. |
Sure thing I can do that. |
|
Final test after the last changes: |
ScottTodd
left a comment
There was a problem hiding this comment.
Looks good enough to me now. A few lingering comments for future work.
| def normalize_path(p: Path) -> str: | ||
| return str(p).replace("\\", "/") if is_windows() else str(p) |
There was a problem hiding this comment.
You can probably use as_posix() here instead of defining a new helper function, if the indexer script really needs a unix style path. Fine as-is for now.
There was a problem hiding this comment.
Ohh - did not know that.
There was a problem hiding this comment.
Instead of forking indexer.py I propose to rather implement a solution based on boto3 which can then be used in an AWS Lambda.
This PR adds the log upload support when the build target failed within the workflow. It is not using
teatime.pyscript