fix(CVE-2024-39705): bump to nltk 3.9.1; correct model download issues #3541

Merged: 9 commits, Aug 19, 2024
Changes from 8 commits
3 changes: 2 additions & 1 deletion CHANGELOG.md
@@ -1,11 +1,12 @@
## 0.15.6-dev0
## 0.15.6

### Enhancements

### Features

### Fixes

* **Bump to NLTK 3.9.x** Bumps to the latest `nltk` version to resolve CVE.
* **Update CI for `ingest-test-fixture-update-pr` to resolve NLTK model download errors.**
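
Review note: the first fix above can be spot-checked in a target environment. A minimal sketch, assuming any `nltk` release below 3.9 should be treated as affected (an assumption drawn from the 3.8.1 to 3.9.1 pin change in this PR, not from an advisory). The helper names `parse_release`, `meets_minimum`, and `installed_nltk_is_patched` are hypothetical and not part of this repository.

```python
from __future__ import annotations

import re
from importlib import metadata


def parse_release(version: str) -> tuple[int, ...]:
    """Return the leading numeric components of a version string."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(version: str, minimum: tuple[int, ...] = (3, 9)) -> bool:
    """True when `version` is at or above `minimum` (assumed threshold: 3.9)."""
    return parse_release(version) >= minimum


def installed_nltk_is_patched() -> bool | None:
    """Check the installed nltk distribution; None means nltk is not installed."""
    try:
        return meets_minimum(metadata.version("nltk"))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("old pin 3.8.1 meets minimum:", meets_minimum("3.8.1"))
    print("new pin 3.9.1 meets minimum:", meets_minimum("3.9.1"))
    print("installed nltk patched:", installed_nltk_is_patched())
```

The comparison uses plain integer tuples to stay dependency-free; pre-release suffixes are ignored, so a stricter check would need a full version parser.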


6 changes: 3 additions & 3 deletions requirements/base.txt
@@ -69,7 +69,7 @@ mypy-extensions==1.0.0
# unstructured-client
nest-asyncio==1.6.0
# via unstructured-client
nltk==3.8.1
nltk==3.9.1
# via -r ./base.in
numpy==1.26.4
# via -r ./base.in
@@ -110,7 +110,7 @@ sniffio==1.3.1
# via
# anyio
# httpx
soupsieve==2.5
soupsieve==2.6
# via beautifulsoup4
tabulate==0.9.0
# via -r ./base.in
@@ -129,7 +129,7 @@ typing-inspect==0.9.0
# via
# dataclasses-json
# unstructured-client
unstructured-client==0.25.4
unstructured-client==0.25.5
# via
# -c ././deps/constraints.txt
# -r ./base.in
3 changes: 3 additions & 0 deletions requirements/deps/constraints.txt
@@ -56,3 +56,6 @@ fsspec==2024.5.0
wrapt>=1.14.0

langchain-community>=0.2.5

grpcio==1.64.3
label-studio-sdk==0.0.34
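
Review note: the new `grpcio==1.64.3` constraint is why the compiled files later in this diff move from `grpcio==1.65.4` back to `1.64.3`. A minimal sketch of checking a compiled requirements file against the constraints, assuming only exact `==` pins matter and that comments, `-c`/`-r` lines, and range specifiers such as `>=` can be skipped. `parse_pins` and `find_violations` are hypothetical helpers, not part of this repository or of pip-tools.

```python
import re

# Matches "name==version" or "name[extra]==version" at the start of a line.
PIN = re.compile(r"^([A-Za-z0-9._-]+)(?:\[[^\]]*\])?==([^\s#;]+)")


def parse_pins(text):
    """Map lowercase package name -> pinned version for every '==' line."""
    pins = {}
    for raw in text.splitlines():
        match = PIN.match(raw.strip())
        if match:
            pins[match.group(1).lower()] = match.group(2)
    return pins


def find_violations(compiled_text, constraints_text):
    """Return {name: (compiled_version, constrained_version)} for mismatched pins."""
    compiled = parse_pins(compiled_text)
    constraints = parse_pins(constraints_text)
    return {
        name: (compiled[name], wanted)
        for name, wanted in constraints.items()
        if name in compiled and compiled[name] != wanted
    }


CONSTRAINTS = """\
wrapt>=1.14.0
langchain-community>=0.2.5
grpcio==1.64.3
label-studio-sdk==0.0.34
"""

BEFORE = "grpcio==1.65.4\n    # via clarifai-grpc\n"
AFTER = "grpcio==1.64.3\n    # via\n    #   -c ./ingest/../deps/constraints.txt\n"

if __name__ == "__main__":
    print("before:", find_violations(BEFORE, CONSTRAINTS))
    print("after:", find_violations(AFTER, CONSTRAINTS))
```

Packages that are constrained but absent from a compiled file (here `label-studio-sdk`) are not reported, since a constraint only applies when the package is actually resolved.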
4 changes: 2 additions & 2 deletions requirements/dev.txt
@@ -310,7 +310,7 @@ pyyaml==6.0.2
# -c ./test.txt
# jupyter-events
# pre-commit
pyzmq==26.1.0
pyzmq==26.1.1
# via
# ipykernel
# jupyter-client
@@ -360,7 +360,7 @@ sniffio==1.3.1
# -c ./base.txt
# anyio
# httpx
soupsieve==2.5
soupsieve==2.6
# via
# -c ./base.txt
# beautifulsoup4
2 changes: 1 addition & 1 deletion requirements/extra-markdown.txt
@@ -6,7 +6,7 @@
#
importlib-metadata==8.2.0
# via markdown
markdown==3.6
markdown==3.7
# via -r ./extra-markdown.in
zipp==3.20.0
# via importlib-metadata
8 changes: 4 additions & 4 deletions requirements/extra-paddleocr.txt
@@ -13,7 +13,7 @@ astor==0.8.1
# via paddlepaddle
attrdict==2.0.1
# via unstructured-paddleocr
cachetools==5.4.0
cachetools==5.5.0
# via premailer
certifi==2024.7.4
# via
@@ -64,13 +64,13 @@ idna==3.7
# anyio
# httpx
# requests
imageio==2.34.2
imageio==2.35.1
# via
# imgaug
# scikit-image
imgaug==0.4.0
# via unstructured-paddleocr
importlib-resources==6.4.0
importlib-resources==6.4.3
# via matplotlib
kiwisolver==1.4.5
# via matplotlib
@@ -83,7 +83,7 @@ lxml==5.3.0
# -c ./base.txt
# premailer
# unstructured-paddleocr
matplotlib==3.9.1.post1
matplotlib==3.9.2
# via imgaug
more-itertools==10.4.0
# via cssutils
17 changes: 9 additions & 8 deletions requirements/extra-pdf-image.txt
@@ -6,7 +6,7 @@
#
antlr4-python3-runtime==4.9.3
# via omegaconf
cachetools==5.4.0
cachetools==5.5.0
# via google-auth
certifi==2024.7.4
# via
@@ -48,7 +48,7 @@ fsspec==2024.5.0
# torch
google-api-core[grpc]==2.19.1
# via google-cloud-vision
google-auth==2.33.0
google-auth==2.34.0
# via
# google-api-core
# google-cloud-vision
@@ -58,13 +58,14 @@ googleapis-common-protos==1.63.2
# via
# google-api-core
# grpcio-status
grpcio==1.65.4
grpcio==1.64.3
# via
# -c ././deps/constraints.txt
# google-api-core
# grpcio-status
grpcio-status==1.62.3
# via google-api-core
huggingface-hub==0.24.5
huggingface-hub==0.24.6
# via
# timm
# tokenizers
@@ -76,7 +77,7 @@ idna==3.7
# via
# -c ./base.txt
# requests
importlib-resources==6.4.0
importlib-resources==6.4.3
# via matplotlib
iopath==0.1.10
# via layoutparser
@@ -92,7 +93,7 @@ lxml==5.3.0
# pikepdf
markupsafe==2.1.5
# via jinja2
matplotlib==3.9.1.post1
matplotlib==3.9.2
# via
# pycocotools
# unstructured-inference
@@ -120,7 +121,7 @@ onnx==1.16.2
# via
# -r ./extra-pdf-image.in
# unstructured-inference
onnxruntime==1.18.1
onnxruntime==1.19.0
# via unstructured-inference
opencv-python==4.8.0.76
# via
@@ -147,7 +148,7 @@ pdfminer-six==20231228
# via
# -r ./extra-pdf-image.in
# pdfplumber
pdfplumber==0.11.3
pdfplumber==0.11.4
# via layoutparser
pikepdf==9.1.1
# via -r ./extra-pdf-image.in
2 changes: 1 addition & 1 deletion requirements/huggingface.txt
@@ -27,7 +27,7 @@ fsspec==2024.5.0
# -c ././deps/constraints.txt
# huggingface-hub
# torch
huggingface-hub==0.24.5
huggingface-hub==0.24.6
# via
# tokenizers
# transformers
4 changes: 2 additions & 2 deletions requirements/ingest/azure.txt
@@ -6,9 +6,9 @@
#
adlfs==2024.7.0
# via -r ./ingest/azure.in
aiohappyeyeballs==2.3.5
aiohappyeyeballs==2.3.7
# via aiohttp
aiohttp==3.10.3
aiohttp==3.10.4
# via adlfs
aiosignal==1.3.1
# via aiohttp
2 changes: 1 addition & 1 deletion requirements/ingest/biomed.txt
@@ -10,7 +10,7 @@ beautifulsoup4==4.12.3
# bs4
bs4==0.0.2
# via -r ./ingest/biomed.in
soupsieve==2.5
soupsieve==2.6
# via
# -c ./ingest/../base.txt
# beautifulsoup4
21 changes: 11 additions & 10 deletions requirements/ingest/chroma.txt
@@ -20,7 +20,7 @@ backoff==2.2.1
# posthog
bcrypt==4.2.0
# via chromadb
cachetools==5.4.0
cachetools==5.5.0
# via google-auth
certifi==2024.7.4
# via
@@ -51,7 +51,7 @@ exceptiongroup==1.2.2
# via
# -c ./ingest/../base.txt
# anyio
fastapi==0.112.0
fastapi==0.112.1
# via chromadb
filelock==3.15.4
# via huggingface-hub
@@ -61,12 +61,13 @@ fsspec==2024.5.0
# via
# -c ./ingest/../deps/constraints.txt
# huggingface-hub
google-auth==2.33.0
google-auth==2.34.0
# via kubernetes
googleapis-common-protos==1.63.2
# via opentelemetry-exporter-otlp-proto-grpc
grpcio==1.65.4
grpcio==1.64.3
# via
# -c ./ingest/../deps/constraints.txt
# chromadb
# opentelemetry-exporter-otlp-proto-grpc
h11==0.14.0
@@ -76,7 +77,7 @@ h11==0.14.0
# uvicorn
httptools==0.6.1
# via uvicorn
huggingface-hub==0.24.5
huggingface-hub==0.24.6
# via tokenizers
humanfriendly==10.0
# via coloredlogs
@@ -88,7 +89,7 @@ idna==3.7
# requests
importlib-metadata==8.2.0
# via -r ./ingest/chroma.in
importlib-resources==6.4.0
importlib-resources==6.4.3
# via chromadb
kubernetes==30.1.0
# via chromadb
@@ -106,7 +107,7 @@ oauthlib==3.2.2
# via
# kubernetes
# requests-oauthlib
onnxruntime==1.18.1
onnxruntime==1.19.0
# via chromadb
opentelemetry-api==1.16.0
# via
@@ -192,7 +193,7 @@ sniffio==1.3.1
# -c ./ingest/../base.txt
# anyio
# httpx
starlette==0.37.2
starlette==0.38.2
# via fastapi
sympy==1.13.2
# via onnxruntime
@@ -231,9 +232,9 @@ urllib3==1.26.19
# -c ./ingest/../deps/constraints.txt
# kubernetes
# requests
uvicorn[standard]==0.30.5
uvicorn[standard]==0.30.6
# via chromadb
uvloop==0.19.0
uvloop==0.20.0
# via uvicorn
watchfiles==0.23.0
# via uvicorn
8 changes: 5 additions & 3 deletions requirements/ingest/clarifai.txt
@@ -15,14 +15,16 @@ charset-normalizer==3.3.2
# requests
clarifai==10.7.0
# via -r ./ingest/clarifai.in
clarifai-grpc==10.7.1
clarifai-grpc==10.7.2
# via clarifai
contextlib2==21.6.0
# via schema
googleapis-common-protos==1.63.2
# via clarifai-grpc
grpcio==1.65.4
# via clarifai-grpc
grpcio==1.64.3
# via
# -c ./ingest/../deps/constraints.txt
# clarifai-grpc
idna==3.7
# via
# -c ./ingest/../base.txt
2 changes: 1 addition & 1 deletion requirements/ingest/confluence.txt
@@ -42,7 +42,7 @@ six==1.16.0
# via
# -c ./ingest/../base.txt
# atlassian-python-api
soupsieve==2.5
soupsieve==2.6
# via
# -c ./ingest/../base.txt
# beautifulsoup4
6 changes: 3 additions & 3 deletions requirements/ingest/databricks-volumes.txt
@@ -4,7 +4,7 @@
#
# pip-compile ./ingest/databricks-volumes.in
#
cachetools==5.4.0
cachetools==5.5.0
# via google-auth
certifi==2024.7.4
# via
@@ -15,9 +15,9 @@ charset-normalizer==3.3.2
# via
# -c ./ingest/../base.txt
# requests
databricks-sdk==0.29.0
databricks-sdk==0.30.0
# via -r ./ingest/databricks-volumes.in
google-auth==2.33.0
google-auth==2.34.0
# via databricks-sdk
idna==3.7
# via
4 changes: 1 addition & 3 deletions requirements/ingest/delta-table.txt
@@ -4,7 +4,7 @@
#
# pip-compile ./ingest/delta-table.in
#
deltalake==0.18.2
deltalake==0.19.0
# via -r ./ingest/delta-table.in
fsspec==2024.5.0
# via
@@ -16,5 +16,3 @@ numpy==1.26.4
# pyarrow
pyarrow==17.0.0
# via deltalake
pyarrow-hotfix==0.6
# via deltalake
4 changes: 2 additions & 2 deletions requirements/ingest/discord.txt
@@ -4,9 +4,9 @@
#
# pip-compile ./ingest/discord.in
#
aiohappyeyeballs==2.3.5
aiohappyeyeballs==2.3.7
# via aiohttp
aiohttp==3.10.3
aiohttp==3.10.4
# via discord-py
aiosignal==1.3.1
# via aiohttp
6 changes: 3 additions & 3 deletions requirements/ingest/elasticsearch.txt
@@ -4,9 +4,9 @@
#
# pip-compile ./ingest/elasticsearch.in
#
aiohappyeyeballs==2.3.5
aiohappyeyeballs==2.3.7
# via aiohttp
aiohttp==3.10.3
aiohttp==3.10.4
# via elasticsearch
aiosignal==1.3.1
# via aiohttp
@@ -21,7 +21,7 @@ certifi==2024.7.4
# elastic-transport
elastic-transport==8.15.0
# via elasticsearch
elasticsearch[async]==8.14.0
elasticsearch[async]==8.15.0
# via -r ./ingest/elasticsearch.in
frozenlist==1.4.1
# via