Conversation

@jvamvas
Contributor

@jvamvas jvamvas commented Dec 29, 2022

Add the X-MOD models released with the paper Lifting the Curse of Multilinguality by Pre-training Modular Transformers.
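For illustration, a minimal usage sketch under the API proposed in this PR (the checkpoint name jvamvas/xmod-base and the fairseq-style language code "en_XX" are assumptions for this example, not part of the PR description):

from transformers import AutoTokenizer, XmodModel

tokenizer = AutoTokenizer.from_pretrained("jvamvas/xmod-base")
model = XmodModel.from_pretrained("jvamvas/xmod-base")

# X-MOD routes every input through a language-specific adapter,
# so a language has to be selected before the forward pass.
model.set_default_language("en_XX")

inputs = tokenizer("Hello, world!", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)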

Implementation notes

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Dec 29, 2022

The documentation is not available anymore as the PR was closed or merged.

@jvamvas jvamvas marked this pull request as draft December 29, 2022 16:58
@jvamvas jvamvas changed the title from "[WIP] Add X-MOD" to "Add X-MOD" on Jan 2, 2023
@jvamvas jvamvas marked this pull request as ready for review January 2, 2023 09:07
Contributor

@younesbelkada younesbelkada left a comment

This looks super clean to me! Thanks a lot for your huge work and for adding all those models!
I left a couple of comments, mostly nits and open questions!
We should be really close to merging this!

@jvamvas
Contributor Author

jvamvas commented Jan 24, 2023

@younesbelkada Thank you for the swift code review, much appreciated!
I have now implemented your comments.

Contributor

@younesbelkada younesbelkada left a comment

This looks great to me! Thanks for your work on this!
Handing the PR over now to @ArthurZucker & @sgugger for final approvals and reviews!

Collaborator

@sgugger sgugger left a comment

Thanks a lot for adding this new model! My two main comments are around naming (see first comment below, but please switch all XMOD to Xmod in class names) and type annotations. While we welcome them in signatures, in the code itself they are usually redundant if names are aptly chosen, and we don't use them in the rest of the Transformers codebase.
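For illustration, a hypothetical snippet (not actual PR code) showing the requested conventions: Xmod rather than XMOD in class names, type hints kept in signatures, and no redundant annotations on local variables.

import torch
from torch import nn


class XmodExampleHead(nn.Module):  # "Xmod", not "XMOD", in class names
    def __init__(self, hidden_size: int, num_labels: int):  # annotations in signatures are welcome
        super().__init__()
        self.dense = nn.Linear(hidden_size, hidden_size)
        self.out_proj = nn.Linear(hidden_size, num_labels)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # the name already says what this is, so no annotation on the local variable
        hidden_states = torch.tanh(self.dense(features))
        return self.out_proj(hidden_states)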

@jvamvas
Contributor Author

jvamvas commented Jan 25, 2023

@sgugger Thanks for the review. Your suggestions have now been implemented

Collaborator

@ArthurZucker ArthurZucker left a comment

Very clean overall! Some tests are missing, but I am very impressed by the good use of copied from! 😉

Comment on lines +28 to +36
Collaborator

Since the models are from Meta, we should probably move them and update this before merging.

Comment on lines +170 to +190
Collaborator

Nice 😉

Collaborator

Since it is not really an Output but rather a projection, we could have renamed this, but I suppose it follows what was done in roberta.

Collaborator

Same here, we could remove this strange variable

Collaborator

A few tests are missing, for example for MaskedLM, XmodForCausalLM, etc. These would make sure that those classes also work. Very simple testing is enough, but it would be great if all the pipelines worked with the different variants of the model.

Contributor Author

This file already contains tests for XmodForMaskedLM, XmodForCausalLM etc.
The test coverage is the same as with other XLM-based models, e.g. xlm_roberta_xl.

Collaborator

Yes and no! If they were not in the previous code, then we are lucky that it works!
What I am suggesting here is just adding simple tests to check that model.generate() has the expected behaviour with respect to the original code (so as part of the integration tests).

You don't have to use the original codebase to compare the examples, but at least having correct generation tests will prevent us from making bad modifications to the generate() function that would not be caught by the current tests! This is valid for various models 😉

Contributor Author

Thanks for clarifying.
I have now added an integration test for FillMaskPipeline.

I hesitate to check the output of other pipelines because there are no trained models for those pipelines. Checking the output of randomly initialized models amounts to a guarantee that different versions of transformers perform identical initialization. Such a guarantee would be out of the scope of this PR and should be discussed in a separate issue.
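For reference, a rough sketch of what such a fill-mask check can look like (the checkpoint name follows this PR; the input sentence, the language code, and the printed output are illustrative rather than the values used in the actual test):

from transformers import AutoTokenizer, XmodForMaskedLM, pipeline

tokenizer = AutoTokenizer.from_pretrained("jvamvas/xmod-base")
model = XmodForMaskedLM.from_pretrained("jvamvas/xmod-base")
model.set_default_language("en_XX")  # select the English adapter before running the pipeline

fill_mask = pipeline("fill-mask", model=model, tokenizer=tokenizer)
predictions = fill_mask("Hello <mask>!")

# The real integration test would pin expected tokens/scores;
# here we only show that ranked predictions come back.
print([p["token_str"] for p in predictions[:3]])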

Collaborator

It is perfect. If there are no pretrained models, there is no need to test the pipelines!

@ArthurZucker
Collaborator

Can you also add the model to the documentation_tests.txt file and run the doctests to be sure that they are valid?

@jvamvas
Contributor Author

jvamvas commented Jan 31, 2023

@ArthurZucker Thanks for the code review. I have now implemented the changes you requested.

I agree that the models should be moved to the facebook organization but do not have the permissions to do so.

@ArthurZucker
Collaborator

About moving the weights, I think I am in the org and can help with that / ask to have you added so you can transfer them 😉
Looks very good, almost there! 🚀

Comment on lines 610 to 620
Collaborator

If it is a pipeline test, let's put it in the test_pipeline_fill_mask file.
It's cool that you added it. I was mentioning something simpler with just model.generate().

@jvamvas
Contributor Author

jvamvas commented Feb 6, 2023

Hi @ArthurZucker, thanks for pointing out that there are missing tests in this PR.
Unfortunately, I have not been able to figure out which tests are missing, exactly.

As of now, there are the following tests:

  • tests.models.xmod.test_modeling_xmod.XmodModelTest – checks that there are no errors when calling the methods of XmodFor..., including model.generate()
  • tests.models.xmod.test_modeling_xmod.XmodModelIntegrationTest – checks that the output of the pre-trained models jvamvas/xmod-base and jvamvas/xmod-large-prenorm is identical to the corresponding Fairseq models.
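For context, that integration check is essentially of this shape (a sketch only: the token ids and the reference slice below are placeholders; the real values are exported from the fairseq checkpoints):

import torch
from transformers import XmodModel

model = XmodModel.from_pretrained("jvamvas/xmod-base")
model.set_default_language("en_XX")

input_ids = torch.tensor([[0, 581, 10269, 83, 99942, 2]])  # placeholder token ids
with torch.no_grad():
    hidden_states = model(input_ids).last_hidden_state

expected_slice = torch.tensor([[[0.0, 0.0, 0.0]]])  # placeholder; the real test compares against fairseq outputs
# torch.testing.assert_close(hidden_states[:, :1, :3], expected_slice, atol=1e-4, rtol=1e-4)
print(hidden_states[:, :1, :3])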

Could you please clarify which tests need to be added still?

@ArthurZucker
Collaborator

Hey! Thanks for bearing with me.

  • What is there but should not be: a pipeline test inside the test_modeling file
  • The missing tests:
    Something like what we have in opt, which will be part of tests.models.xmod.test_modeling_xmod.XmodModelIntegrationTest. You can also have a class XmodGenerationTest(unittest.TestCase).
    A sample test is the following:
    # Assumed module-level imports for this sample (as in the transformers test suite):
    # from transformers import GPT2Tokenizer, OPTForCausalLM
    # from transformers.testing_utils import torch_device
    def test_batch_generation(self):
        model_id = "facebook/opt-350m"

        tokenizer = GPT2Tokenizer.from_pretrained(model_id)
        model = OPTForCausalLM.from_pretrained(model_id)
        model.to(torch_device)

        tokenizer.padding_side = "left"

        # use different length sentences to test batching
        sentences = [
            "Hello, my dog is a little",
            "Today, I",
        ]

        inputs = tokenizer(sentences, return_tensors="pt", padding=True)
        input_ids = inputs["input_ids"].to(torch_device)

        outputs = model.generate(
            input_ids=input_ids,
            attention_mask=inputs["attention_mask"].to(torch_device),
        )

        inputs_non_padded = tokenizer(sentences[0], return_tensors="pt").input_ids.to(torch_device)
        output_non_padded = model.generate(input_ids=inputs_non_padded)

        num_paddings = inputs_non_padded.shape[-1] - inputs["attention_mask"][-1].long().sum().cpu().item()
        inputs_padded = tokenizer(sentences[1], return_tensors="pt").input_ids.to(torch_device)
        output_padded = model.generate(input_ids=inputs_padded, max_length=model.config.max_length - num_paddings)

        batch_out_sentence = tokenizer.batch_decode(outputs, skip_special_tokens=True)
        non_padded_sentence = tokenizer.decode(output_non_padded[0], skip_special_tokens=True)
        padded_sentence = tokenizer.decode(output_padded[0], skip_special_tokens=True)

        expected_output_sentence = [
            "Hello, my dog is a little bit of a dork.\nI'm a little bit",
            "Today, I was in the middle of a conversation with a friend about the",
        ]
        self.assertListEqual(expected_output_sentence, batch_out_sentence)
        self.assertListEqual(batch_out_sentence, [non_padded_sentence, padded_sentence])

Does that make sense? 😉

@ArthurZucker
Collaborator

The CI tests are broken, but it is not your fault! We will have to wait until the basic Docker image runs properly, but the added test looks good 😉

@younesbelkada
Contributor

younesbelkada commented Feb 9, 2023

Hi @jvamvas!
For the code quality tests, you just need to rebase on main and run:

pip install --upgrade -e .["quality"]

Then run the usual make style or make fixup

@jvamvas
Contributor Author

jvamvas commented Feb 9, 2023

@younesbelkada Sorry about the bad rebase. On the plus side, the tests are now passing again 🎉

@ArthurZucker
Collaborator

ArthurZucker commented Feb 9, 2023

Yeah hahah. Do you think you can reset, then rebase instead of merge? 😉

@jvamvas
Contributor Author

jvamvas commented Feb 10, 2023

@ArthurZucker Done. The failing test is not related to this PR

@ArthurZucker ArthurZucker merged commit b0d539c into huggingface:main Feb 10, 2023
@ArthurZucker
Collaborator

Great work! Thanks for working on this model! 🥳
