Fix past CI #20967
Conversation
```python
past_versions_testing = {
    "pytorch": {
        "1.12": {
```
For building the Past CI docker image with torch 1.12.x
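As a hedged sketch of how such a version table could drive the image build (the field names and the `install_command` helper below are hypothetical; the actual keys used by the Past CI tooling may differ):

```python
# Hypothetical sketch: a past-versions table mapping framework -> version -> build info.
# Field names ("torch", "python") are illustrative, not the project's actual schema.
past_versions_testing = {
    "pytorch": {
        "1.12": {
            "torch": "1.12.1",
            "python": "3.9",
        },
    },
}

def install_command(framework: str, version: str) -> str:
    """Render the pip install line for a given framework/version entry."""
    info = past_versions_testing[framework][version]
    return f"pip install torch=={info['torch']}"

print(install_command("pytorch", "1.12"))  # pip install torch==1.12.1
```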
```python
self.assertAlmostEqual(b, b1, delta=1e-5)

@slow
@require_accelerate
```
This test requires accelerate, which is installed in the daily CI image but not in past CI images.
(So far the past CI avoids installing many other dependencies, but this could change in the future.)
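The dependency-gated decorators used here (`require_accelerate`, `require_pytesseract`, `require_detectron2`) live in `transformers.testing_utils`; a minimal sketch of the pattern behind them, built on `unittest.skipUnless` (not the project's actual implementation), looks like this:

```python
import importlib.util
import unittest

def require_package(name):
    """Skip the decorated test when `name` cannot be imported.
    Sketch of the pattern behind decorators like require_accelerate;
    the real helpers live in transformers.testing_utils."""
    return unittest.skipUnless(
        importlib.util.find_spec(name) is not None,
        f"test requires {name}",
    )

class ExampleTest(unittest.TestCase):
    @require_package("accelerate")
    def test_needs_accelerate(self):
        import accelerate  # safe: only runs when accelerate is installed
        self.assertTrue(hasattr(accelerate, "__version__"))
```

On a past CI image without the dependency, the test is collected but reported as skipped instead of erroring at import time.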
```python
)

@require_torch
@require_pytesseract
```
(So far) Past CI images don't install pytesseract
```python
self.assertEqual(sum(tokens_with_offsets["special_tokens_mask"]), added_tokens)

@require_torch
@require_detectron2
```
(So far) Past CI images don't install detectron2
```python
import tensorflow as tf

if is_tensorflow_text_available():
    from transformers.models.bert import TFBertTokenizer
```
This import will trigger `import tensorflow as tf` in `src/transformers/models/bert/tokenization_bert_tf.py`.
```python
if is_tf_available():
    import tensorflow as tf

if is_keras_nlp_available():
```
same here
The documentation is not available anymore as the PR was closed or merged.
src/transformers/__init__.py
Outdated
```diff
 # Tensorflow-text-specific objects
 try:
-    if not is_tensorflow_text_available():
+    if not (is_tensorflow_text_available() and is_tf_available()):
```
I don't like this change much, but importing TFBertTokenizer and TFGPT2Tokenizer requires tensorflow too.
This should be avoided, as it then screws up the dummy creation (you end up with an old dummy file that has to be removed manually). is_tensorflow_text_available() should probably return False if is_tf_available() is False.
sgugger
left a comment
Thanks for fixing those. I'm okay with most changes except for the new checks (the existing one should basically cover the others).
```python
if is_tf_available():
    import tensorflow as tf

if is_tensorflow_text_available():
```
@sgugger As is_tensorflow_text_available already contains the check for TF, should I revert the change here? Or is it fine to keep it this way?
I think it's fine either way, but why this change?
- In the current `main` branch, the `is_tensorflow_text_available` condition is not inside `is_tf_available`. In the past CI image build dockerfile, we install `.[dev]` then uninstall `tensorflow`, so `tensorflow_text` is there but `tensorflow` is removed. This causes some tensorflow_text-related tests to fail (the `TFBertTokenizer` file will import `tensorflow`).
- In the 1st version of this PR, I kept `is_tensorflow_text_available` as it is, and used the combination `is_tensorflow_text_available() and is_tf_available()`. The change at these 2 lines was necessary at that time.
- @sgugger said we should instead change the definition of `is_tensorflow_text_available` to include `is_tf_available` directly. After applying that suggestion, we no longer need to change the 2 lines here.
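The suggested fix can be sketched as follows; this is a minimal self-contained stand-in for the real availability checks in `transformers.utils`, not their actual implementation:

```python
import importlib.util

def is_tf_available() -> bool:
    """True when tensorflow itself can be imported."""
    return importlib.util.find_spec("tensorflow") is not None

def is_tensorflow_text_available() -> bool:
    # Fold the tensorflow check into the tensorflow_text check, so this
    # returns False on past CI images where tensorflow was uninstalled
    # but tensorflow_text is still present.
    return is_tf_available() and importlib.util.find_spec("tensorflow_text") is not None
```

With this definition, call sites such as `if not is_tensorflow_text_available():` in `src/transformers/__init__.py` no longer need an extra `and is_tf_available()` clause.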
You can unindent this block if you want but honestly we don't really care. The tensorflow import needs to stay as it's used in the file.
```python
return out["pooler_output"]

@require_tf
```
Same question - we don't need this anymore, and can keep it as it is in current main.
LysandreJik
left a comment
Great, thank you @ydshieh!
What does this PR do?
I tried to launch the Past CI (the 2nd round) after #20861, but some more fixes are required: Past CI images don't install certain dependencies, and we need more decorators to skip tests when those dependencies are not installed.