Skip to content

Conversation

@ydshieh
Copy link
Collaborator

@ydshieh ydshieh commented Jan 2, 2023

What does this PR do?

I tried to launch Past CI (the 2nd round) after #20861, but there are some more fixes required: Past CI images don't install other dependencies, and we need more decorators to skip some tests if they are not installed.


past_versions_testing = {
"pytorch": {
"1.12": {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For building the Past CI docker image with torch 1.12.x

self.assertAlmostEqual(b, b1, delta=1e-5)

@slow
@require_accelerate
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test requires accelerate, which is installed in daily CI image, but not past CI images.

(So far the past CI avoids installing many other dependencies, but this could be changed in the future)

)

@require_torch
@require_pytesseract
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(So far) Past CI images don't install pytesseract

self.assertEqual(sum(tokens_with_offsets["special_tokens_mask"]), added_tokens)

@require_torch
@require_detectron2
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(So far) Past CI images don't install detectron2

import tensorflow as tf

if is_tensorflow_text_available():
from transformers.models.bert import TFBertTokenizer
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This import will trigger

import tensorflow as tf

in src/transformers/models/bert/tokenization_bert_tf.py

if is_tf_available():
import tensorflow as tf

if is_keras_nlp_available():
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

@HuggingFaceDocBuilderDev
Copy link

HuggingFaceDocBuilderDev commented Jan 2, 2023

The documentation is not available anymore as the PR was closed or merged.

# Tensorflow-text-specific objects
try:
if not is_tensorflow_text_available():
if not (is_tensorflow_text_available() and is_tf_available()):
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like this change much, but the import of TFBertTokenizer and TFGPT2Tokenizer also requires tensorflow too.

cc @Rocketknight1

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be avoid as it then screws up the dummy creation (you have an old dummy file that should be removed manually). is_tensorflow_text_available() should probably return False if is_tf_available() is false.

@ydshieh ydshieh requested a review from sgugger January 3, 2023 14:34
@ydshieh ydshieh marked this pull request as ready for review January 3, 2023 14:35
@ydshieh ydshieh requested a review from LysandreJik January 3, 2023 14:35
Copy link
Collaborator

@sgugger sgugger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing those. I'm okay with most changes except for the new checks (the existing one should check on the others basically)

# Tensorflow-text-specific objects
try:
if not is_tensorflow_text_available():
if not (is_tensorflow_text_available() and is_tf_available()):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be avoid as it then screws up the dummy creation (you have an old dummy file that should be removed manually). is_tensorflow_text_available() should probably return False if is_tf_available() is false.

if is_tf_available():
import tensorflow as tf

if is_tensorflow_text_available():
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sgugger As is_tensorflow_text_available already contains the check for TF, should I revert the change here? Or it's fine to keep it this way?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine either way, but why this change?

Copy link
Collaborator Author

@ydshieh ydshieh Jan 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • In current main branch, is_tensorflow_text_available condition is not inside is_tf_available. In past CI image build dockerfile, we install .[dev] then uninstall tensorflow, so tensorflow_text is there but tensorflow is removed. And this causes some tensorflow_text-related tests failed (TFBertTokenizer file will import tensorflow)

  • In the 1st version of the PR, I keep is_tensorflow_text_available as it is, and use the combination of is_tensorflow_text_available() and is_tensorflow_available(). The change at these 2 lines was necessary at that time.

  • @sgugger said we should instead change the definition of is_tensorflow_text_available to include is_tensorflow_available directly. After applying that suggestion, we no longer need to change the 2 lines here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can unindent this block if you want but honestly we don't really care. The tensorflow import needs to stay as it's used in the file.

return out["pooler_output"]


@require_tf
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same question - we don't need this anymore, and can keep it is as in current main.

Copy link
Member

@LysandreJik LysandreJik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thank you @ydshieh!

if is_tf_available():
import tensorflow as tf

if is_tensorflow_text_available():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine either way, but why this change?

@ydshieh ydshieh merged commit b3a0aad into main Jan 12, 2023
@ydshieh ydshieh deleted the fix_past_ci branch January 12, 2023 17:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants