Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update dependencies v0.57 #578

Merged
merged 9 commits into from
Mar 14, 2022
Merged

Update dependencies v0.57 #578

merged 9 commits into from
Mar 14, 2022

Conversation

juhoinkinen
Copy link
Member

Dependency updates for Annif v0.57.

Updates to:

I don't think there are other dependencies that can be upgraded (to keep Python 3.6-3.9 support) or are worth upgrading now, or are there some?

Still needs to check that old Omikuji and NN ensemble models keep working, but I would suppose so.

@juhoinkinen juhoinkinen added this to the 0.57 milestone Mar 10, 2022
@codecov
Copy link

codecov bot commented Mar 10, 2022

Codecov Report

Merging #578 (62856fa) into master (b679882) will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #578   +/-   ##
=======================================
  Coverage   99.47%   99.47%           
=======================================
  Files          84       84           
  Lines        5568     5568           
=======================================
  Hits         5539     5539           
  Misses         29       29           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b679882...62856fa. Read the comment docs.

@juhoinkinen
Copy link
Member Author

I checked that Omikuji and NN ensemble models trained on the previous versions (of Annif 0.56) keep working.

@juhoinkinen juhoinkinen marked this pull request as ready for review March 11, 2022 08:39
@osma
Copy link
Member

osma commented Mar 11, 2022

Now that Connexion development seems to have picked up again (great!), would it make sense to pin the version we use, to avoid a future update of Connexion breaking something?

@juhoinkinen
Copy link
Member Author

Now that Connexion development seems to have picked up again (great!), would it make sense to pin the version we use, to avoid a future update of Connexion breaking something?

Yes, could be so. I'll pin Connexion to 2.12.x.

@osma
Copy link
Member

osma commented Mar 11, 2022

There are plans for a new major version (presumably 3.0) of Connexion, which would drop support for Flask 1.x, according to the discussion in this recent PR: spec-first/connexion#1465
That could break things for us, although I'm not sure.
So I suggest we should pin Connexion to 2.12.* or 2.*

@juhoinkinen
Copy link
Member Author

Hmh, there are pip errors in the logs of the install step that do not make the step fail.

On Python 3.8 after spacy install:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
annif 0.57.0.dev0 requires numpy==1.21.*, but you have numpy 1.22.3 which is incompatible.
Successfully installed en-core-web-sm-3.2.0 numpy-1.22.3

This is caused due to the eager upgrade-strategy (checked on my laptop), which I added to make sure old dependencies are not used from pip cache. Better to remove the eager strategy I think.

On Python 3.7:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
flake8 4.0.1 requires importlib-metadata<4.3; python_version < "3.8", but you have importlib-metadata 4.11.2 which is incompatible.

Probably also because of the eager strategy...

@osma
Copy link
Member

osma commented Mar 11, 2022

This is caused due to the eager upgrade-strategy (checked on my laptop), which I added to make sure old dependencies are not used from pip cache. Better to remove the eager strategy I think.

The way I read it, it seems that when spacy is installed, numpy 1.21.x is also for some reason upgraded to 1.22.x and then pip complains about not matching numpy 1.21.x which Annif depends on.

It's not obvious to me why pip decides to upgrade numpy at the time of installing spacy. Spacy only requires numpy>=1.15.0: https://github.com/explosion/spaCy/blob/297dd82c86372c7aa0a181e55dc72512718aafe8/requirements.txt#L16

In particular the lines 428-430 in the Python 3.8 test output you linked to are puzzling:

Requirement already satisfied: numpy>=1.15.0 in /opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages (from spacy<3.3.0,>=3.2.0->en-core-web-sm==3.2.0) (1.21.5)
Collecting numpy>=1.15.0
  Using cached numpy-1.22.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (16.8 MB)

Maybe this has to do with the eager strategy you mentioned? Would it help to adjust the pip install command that installs spacy somehow, so it won't install numpy 1.22.x?

@juhoinkinen
Copy link
Member Author

These conflicts do not fail the GH Actions install step because the exit code of pip install is zero even for conflicting dependencies, but (as mentioned in the thread) after installation pip check could be used to see if there are conflicts and error out if there are (tested here for the current case).

Should have the pip check, it would make the conflicts detectable by the CI pipeline, instead of showing them on the logs only? (Could be another PR.)

@juhoinkinen
Copy link
Member Author

It's not obvious to me why pip decides to upgrade numpy at the time of installing spacy. Spacy only requires numpy>=1.15.0: https://github.com/explosion/spaCy/blob/297dd82c86372c7aa0a181e55dc72512718aafe8/requirements.txt#L16

In particular the lines 428-430 in the Python 3.8 test output you linked to are puzzling:

Requirement already satisfied: numpy>=1.15.0 in /opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages (from spacy<3.3.0,>=3.2.0->en-core-web-sm==3.2.0) (1.21.5)
Collecting numpy>=1.15.0
  Using cached numpy-1.22.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (16.8 MB)

Maybe this has to do with the eager strategy you mentioned? Would it help to adjust the pip install command that installs spacy somehow, so it won't install numpy 1.22.x?

Actually the conflict arised when installing the spacy en_core_web_sm model. Seems that the model depends on numpy, and with the eager upgrade strategy, when the spacy downloader (or direct pip call) installed only the model, also numpy got upgraded (makes sense actually). But now, when the model and spacy itself as Annif's extra module are both installed in one pip install call, pip knows that also Annif's depedencies should be satisfied.

Now, as the spacy model is installed with pip instead of the spacy downloader, the version of the model needs to be fixed. I don't know whether that's good or bad.

On Python 3.7:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
flake8 4.0.1 requires importlib-metadata<4.3; python_version < "3.8", but you have importlib-metadata 4.11.2 which is incompatible.

This other conflict has actually been around from January and is not related to eager strategy.

@osma
Copy link
Member

osma commented Mar 11, 2022

Now, as the spacy model is installed with pip instead of the spacy downloader, the version of the model needs to be fixed. I don't know whether that's good or bad.

That's a bit unfortunate. I think it would be better to use the spacy download command, as it's the official way of installing spaCy models and also what we recommend in Annif documentation.

Would it be possible to disable the eager strategy just for the spacy download command?

@sonarcloud
Copy link

sonarcloud bot commented Mar 11, 2022

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
0.0% 0.0% Duplication

@juhoinkinen
Copy link
Member Author

Now, as the spacy model is installed with pip instead of the spacy downloader, the version of the model needs to be fixed. I don't know whether that's good or bad.

That's a bit unfortunate. I think it would be better to use the spacy download command, as it's the official way of installing spaCy models and also what we recommend in Annif documentation.

Would it be possible to disable the eager strategy just for the spacy download command?

Yes, the --upgrade-strategy only-if-needed option worked for spacy download as for pip install.

@juhoinkinen juhoinkinen merged commit 3629139 into master Mar 14, 2022
@juhoinkinen juhoinkinen deleted the update-dependencies-v0.57 branch March 14, 2022 09:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants