Show training error msg #495

Merged
merged 3 commits into from
Dec 16, 2020

Conversation

maikia
Contributor

@maikia maikia commented Dec 15, 2020

closes #493

Adds a test checking that the training_error on AWS is returned correctly (from the dispatcher's perspective).
(I am not sure whether the amount of mocking is overkill.)

This should not be merged until #494 is.

@lgtm-com

lgtm-com bot commented Dec 15, 2020

This pull request fixes 1 alert when merging 64bc1f0 into 2e12b8d - view on LGTM.com

fixed alerts:

  • 1 for Variable defined multiple times

@codecov

codecov bot commented Dec 16, 2020

Codecov Report

Merging #495 (88d15bc) into master (f7211c4) will increase coverage by 0.03%.
The diff coverage is 98.00%.

@@            Coverage Diff             @@
##           master     #495      +/-   ##
==========================================
+ Coverage   93.61%   93.65%   +0.03%     
==========================================
  Files          99       99              
  Lines        8587     8633      +46     
==========================================
+ Hits         8039     8085      +46     
  Misses        548      548              

| Impacted Files | Coverage Δ |
|---|---|
| ramp-engine/ramp_engine/dispatcher.py | 97.61% <80.00%> (-0.57%) ⬇️ |
| ramp-engine/ramp_engine/tests/test_aws.py | 86.49% <100.00%> (+0.05%) ⬆️ |
| ramp-engine/ramp_engine/tests/test_dispatcher.py | 100.00% <100.00%> (ø) |
| ramp-engine/ramp_engine/base.py | 93.90% <0.00%> (+1.21%) ⬆️ |

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f7211c4...88d15bc. Read the comment docs.

@lgtm-com

lgtm-com bot commented Dec 16, 2020

This pull request fixes 1 alert when merging 63544d3 into 2e12b8d - view on LGTM.com

fixed alerts:

  • 1 for Variable defined multiple times

@maikia maikia force-pushed the show_training_error_msg_cd branch from 63544d3 to b3a4537 Compare December 16, 2020 11:18
@maikia maikia changed the title WIP Show training error msg Show training error msg Dec 16, 2020
@maikia maikia marked this pull request as ready for review December 16, 2020 11:52
@lgtm-com

lgtm-com bot commented Dec 16, 2020

This pull request fixes 1 alert when merging 88d15bc into f7211c4 - view on LGTM.com

fixed alerts:

  • 1 for Variable defined multiple times

@maikia
Contributor Author

maikia commented Dec 16, 2020

@tomMoral could you also have a look at this one pls?

Collaborator

@tomMoral tomMoral left a comment

LGTM!

else:
    self._logger.info(
        f'Worker {worker} killed due to an error '
        f'during training: {stderr}'
    )
    submission_status = 'training_error'
submission_status = 'training_error'
Collaborator

Ok so basically the issue was that all checking_error were set as training_error?

Contributor Author

No, that was actually another issue. The issue here was that the error message was set to ''.
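
For illustration, a minimal sketch of the behaviour the new test is meant to pin down: when training fails, the worker's stderr should travel with the `training_error` status instead of being replaced by ''. The `collect_status` helper, the `collect_results` method on the mocked worker, and the `'tested'` success status are assumptions made for this example, not the actual ramp_engine code.

```python
from unittest.mock import MagicMock


def collect_status(worker):
    """Return (submission_status, error_msg) for a finished worker.

    Hypothetical helper for illustration only; it is not the real
    dispatcher logic.
    """
    # Assumed interface: the worker reports a return code and its stderr.
    returncode, stderr = worker.collect_results()
    if returncode == 0:
        return 'tested', ''
    # Keep the worker's stderr rather than overwriting it with ''.
    return 'training_error', stderr


# A mocked worker stands in for an AWS worker whose training failed.
failed_worker = MagicMock()
failed_worker.collect_results.return_value = (1, 'ValueError: bad input')

status, error_msg = collect_status(failed_worker)
print(status, repr(error_msg))
```

Mocking the worker keeps the check local: no AWS instance is needed to verify that a non-empty error message reaches the caller.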

@maikia maikia merged commit e9a3c1d into paris-saclay-cds:master Dec 16, 2020
@maikia
Contributor Author

maikia commented Dec 16, 2020

thx @tomMoral

maikia added a commit that referenced this pull request Dec 16, 2020
* adding test and correcting the training error msg

* update the tests

* cleanup
Development

Successfully merging this pull request may close these issues.

BUG AWS missing info on training error
2 participants