Support jobs parameter in train command #512

Merged
merged 4 commits into from
Aug 13, 2021

Conversation

osma
Member

@osma osma commented Aug 13, 2021

This PR adds a new jobs parameter to the annif train command, similar to the one already implemented in the eval and hyperopt commands. The intent is to make it easier to control the number of threads/CPUs used for training, for those backends that can make use of parallel processing during training.

The PR adds this support to the fasttext and Omikuji backends, for which this was simple to implement.
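
The pattern described above can be sketched roughly as follows. This is not the code from this PR: the class names, method names and return values are hypothetical, and treating 0 as "use all CPUs" is an assumption made for illustration. The idea is only that every backend accepts a jobs argument, and a backend without parallel training simply ignores it.

```python
import os


class Backend:
    """Hypothetical base class; not Annif's actual API."""

    def train(self, corpus, jobs=0):
        # Forward the jobs setting to the backend-specific implementation.
        return self._train(corpus, jobs=jobs)

    def _train(self, corpus, jobs=0):
        raise NotImplementedError


class ParallelBackend(Backend):
    """Stands in for a backend that can train on several threads."""

    def _train(self, corpus, jobs=0):
        # Assumption: 0 means "use all available CPUs".
        threads = jobs if jobs > 0 else (os.cpu_count() or 1)
        return {"docs": len(corpus), "threads": threads}


class SerialBackend(Backend):
    """Stands in for a backend without parallel training."""

    def _train(self, corpus, jobs=0):
        # jobs is accepted for a uniform interface but has no effect here.
        return {"docs": len(corpus), "threads": 1}
```

Keeping the parameter on the shared interface means the calling code does not need to know which backends can actually use it.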

Adding as draft PR for now, since this needs more testing and QA tool checks.

This is an offshoot of PR #511, which will benefit from this, but I thought it was cleaner to implement it as a separate PR.

@codecov

codecov bot commented Aug 13, 2021

Codecov Report

Merging #512 (884f01f) into master (483f047) will increase coverage by 0.00%.
The diff coverage is 100.00%.


@@           Coverage Diff           @@
##           master     #512   +/-   ##
=======================================
  Coverage   99.51%   99.51%           
=======================================
  Files          82       82           
  Lines        5774     5786   +12     
=======================================
+ Hits         5746     5758   +12     
  Misses         28       28           
| Impacted Files | Coverage Δ |
|---|---|
| annif/backend/backend.py | 100.00% <100.00%> (ø) |
| annif/backend/ensemble.py | 100.00% <100.00%> (ø) |
| annif/backend/fasttext.py | 97.82% <100.00%> (+0.04%) ⬆️ |
| annif/backend/maui.py | 100.00% <100.00%> (ø) |
| annif/backend/mllm.py | 100.00% <100.00%> (ø) |
| annif/backend/nn_ensemble.py | 99.24% <100.00%> (ø) |
| annif/backend/omikuji.py | 98.76% <100.00%> (ø) |
| annif/backend/pav.py | 98.90% <100.00%> (ø) |
| annif/backend/stwfsa.py | 100.00% <100.00%> (ø) |
| annif/backend/svc.py | 100.00% <100.00%> (ø) |
| ... and 7 more | |


@osma osma self-assigned this Aug 13, 2021
@osma osma added this to the 0.54 milestone Aug 13, 2021
@sonarcloud

sonarcloud bot commented Aug 13, 2021

Kudos, SonarCloud Quality Gate passed!

0 Bugs (rating A)
0 Vulnerabilities (rating A)
0 Security Hotspots (rating A)
0 Code Smells (rating A)

No Coverage information
0.0% Duplication

@osma
Member Author

osma commented Aug 13, 2021

Fixed test coverage. Tested this with the fasttext and Omikuji backends; by default both use all CPUs, and setting the --jobs parameter limits the number of CPUs used, as it should.
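
For illustration, the command-line side of such an option might look like the sketch below. It does not reproduce Annif's real CLI: argparse is used only to keep the example dependency-free, the program name and the -j short form are hypothetical, and the default of 0 (standing for "let the backend use all CPUs") is an assumption.

```python
import argparse


def build_parser():
    """Build a stand-in parser with a --jobs option (hypothetical CLI)."""
    parser = argparse.ArgumentParser(prog="train-sketch")
    parser.add_argument(
        "--jobs", "-j",
        type=int,
        default=0,  # assumption: 0 means "use all available CPUs"
        help="number of parallel jobs (0 means use all CPUs)",
    )
    return parser


# No option given: the default is kept and the backend decides.
default_args = build_parser().parse_args([])

# Explicit setting: the value is passed on to training unchanged.
limited_args = build_parser().parse_args(["--jobs", "4"])
```

Here `default_args.jobs` keeps the default, while `limited_args.jobs` carries the explicit limit through to the training call.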

Ready for review.

@osma osma marked this pull request as ready for review August 13, 2021 07:53
@osma osma requested a review from juhoinkinen August 13, 2021 07:53