Support jobs parameter in train command #512

osma · 2021-08-13T07:15:19Z

This PR adds a new jobs parameter to the annif train command, similar to the one already implemented in the eval and hyperopt commands. The intent is to make it easier to control the number of threads/CPUs used for training in those backends that can make use of parallel processing during training.

The PR adds this support to the fasttext and Omikuji backends, for which this was simple to implement.
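
A minimal sketch of how a backend could honor such a parameter; this is not the code from this PR. It assumes that `jobs=0` means "leave threading to the backend" (the description above does not state the default value) and that the backend takes a thread-count training option named `thread`, modelled on fastText's option of that name. The helper `training_params` and the sample `epoch` value are hypothetical.

```python
def training_params(base_params: dict, jobs: int = 0) -> dict:
    """Return a copy of base_params with a thread count added when jobs is set.

    Assumption: jobs == 0 means "use the backend's own default threading".
    The "thread" key is illustrative and modelled on fastText's option.
    """
    if jobs < 0:
        raise ValueError("jobs must be zero or a positive integer")
    params = dict(base_params)  # copy so the caller's dict is not mutated
    if jobs != 0:
        params["thread"] = jobs
    return params


# Backend default threading: no thread key is added
print(training_params({"epoch": 5}))
# Explicit request for two threads
print(training_params({"epoch": 5}, jobs=2))
```

Leaving the key out when `jobs` is 0, rather than computing a CPU count, keeps whatever default the underlying library already applies.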

Adding as draft PR for now, since this needs more testing and QA tool checks.

This is an offshoot of PR #511, which will benefit from this, but I thought it was cleaner to implement it as a separate PR.

codecov · 2021-08-13T07:15:49Z

Codecov Report

Merging #512 (884f01f) into master (483f047) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master     #512   +/-   ##
=======================================
  Coverage   99.51%   99.51%           
=======================================
  Files          82       82           
  Lines        5774     5786   +12     
=======================================
+ Hits         5746     5758   +12     
  Misses         28       28

Impacted Files	Coverage Δ
annif/backend/backend.py	`100.00% <100.00%> (ø)`
annif/backend/ensemble.py	`100.00% <100.00%> (ø)`
annif/backend/fasttext.py	`97.82% <100.00%> (+0.04%)`	⬆️
annif/backend/maui.py	`100.00% <100.00%> (ø)`
annif/backend/mllm.py	`100.00% <100.00%> (ø)`
annif/backend/nn_ensemble.py	`99.24% <100.00%> (ø)`
annif/backend/omikuji.py	`98.76% <100.00%> (ø)`
annif/backend/pav.py	`98.90% <100.00%> (ø)`
annif/backend/stwfsa.py	`100.00% <100.00%> (ø)`
annif/backend/svc.py	`100.00% <100.00%> (ø)`
... and 7 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

sonarcloud · 2021-08-13T07:47:08Z

Kudos, SonarCloud Quality Gate passed!

0 Bugs
0 Vulnerabilities
0 Security Hotspots
0 Code Smells

No Coverage information
0.0% Duplication

osma · 2021-08-13T07:53:12Z

Fixed test coverage. Tested this with fasttext and omikuji backends; by default both will use all CPUs, but setting the --jobs parameter controls the number of CPUs as it should.
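
The behavior described above (all CPUs by default, an explicit count when `--jobs` is given) can be sketched with the standard library alone. This is an illustration, not Annif's actual command-line code: the parser name `train-sketch`, the short flag `-j`, and the convention that 0 means "all CPUs" are assumptions.

```python
import argparse
import os


def parse_jobs(argv: list) -> int:
    """Parse a hypothetical --jobs/-j option and resolve it to a CPU count."""
    parser = argparse.ArgumentParser(prog="train-sketch")
    parser.add_argument(
        "--jobs", "-j", type=int, default=0,
        help="number of parallel jobs (0 = use all CPUs; assumed convention)")
    args = parser.parse_args(argv)
    if args.jobs < 0:
        parser.error("--jobs must be zero or a positive integer")
    # os.cpu_count() may return None, so fall back to a single CPU
    return args.jobs or (os.cpu_count() or 1)


print(parse_jobs([]))               # resolves to the number of CPUs found
print(parse_jobs(["--jobs", "2"]))  # resolves to the requested count
```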

Ready for review.

osma added 3 commits August 13, 2021 09:56

Add jobs parameter to train command (doesn't do anything yet)

c55eff6

support jobs parameter for train command in fasttext backend

bae53a8

support jobs parameter for train command in omikuji backend

4613fd4

modify fasttext unit test so it also tests a non-default jobs parameter

884f01f

osma self-assigned this Aug 13, 2021

osma added the enhancement label Aug 13, 2021

osma added this to the 0.54 milestone Aug 13, 2021

osma marked this pull request as ready for review August 13, 2021 07:53

osma requested a review from juhoinkinen August 13, 2021 07:53

juhoinkinen approved these changes Aug 13, 2021

osma merged commit b872b47 into master Aug 13, 2021

osma deleted the feature-train-jobs branch August 13, 2021 14:00

osma mentioned this pull request Aug 13, 2021

Process training docs in parallel in MLLM backend #511

Merged

osma mentioned this pull request Feb 4, 2022

Parallelize suggest operations during nn_ensemble training #429

Closed
