
Make model_class required arg, default processor_class to AutoProcessor #1077

Merged 9 commits into dottxt-ai:main on Aug 13, 2024

Conversation

parkervg (Contributor)

Closes #1076
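
For context, a minimal usage sketch of the loader after this change: model_class now has to be passed explicitly, while processor_class falls back to AutoProcessor when omitted (keyword names follow the PR title; treat the exact signature as an assumption):

import outlines
from transformers import LlavaNextForConditionalGeneration

model = outlines.models.transformers_vision(
    "llava-hf/llava-v1.6-mistral-7b-hf",
    model_class=LlavaNextForConditionalGeneration,
    # processor_class omitted: AutoProcessor is used by default
)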

rlouf (Member) commented Jul 31, 2024

Thank you! Would you mind running pre-commit locally to fix the code style?

lapp0 (Contributor) left a comment

Could you please verify that pytest -s tests/generate/test_generate.py -k "model_transformers_vision" works on a GPU machine?

Thanks so much for taking care of this; we need people who make little fixes that have a big impact!

parkervg (Contributor, Author) commented Jul 31, 2024

@lapp0 Yes that's a good catch! I edited model_transformers_vision in test_generate.py to pass the model_class arg. Will try to verify the tests work with a GPU later.

Also, to have a CPU-based test that runs from GitHub actions, I added test_integration_transformers_vision.py which uses the trl-internal-testing/tiny-random-LlavaForConditionalGeneration model. Currently just testing outlines.generate.text, but can add more later.
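
A minimal sketch of what such a CPU test could look like, assuming the transformers_vision loader from this PR and a small PIL image fixture (names here are illustrative, not the actual test file):

import outlines
from transformers import LlavaForConditionalGeneration

def test_text_generation(img):  # `img`: a small PIL.Image fixture (assumed)
    model = outlines.models.transformers_vision(
        "trl-internal-testing/tiny-random-LlavaForConditionalGeneration",
        model_class=LlavaForConditionalGeneration,
    )
    generator = outlines.generate.text(model)
    result = generator("<image> What do you see?", [img], max_tokens=5)
    assert isinstance(result, str)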

(thinking out loud) since TransformersVision is just subclassing Transformers, we could also just add some @pytest.mark.parametrize logic to test_integration_transformers.py

lapp0 (Contributor) commented Jul 31, 2024

> @lapp0 Yes that's a good catch! I edited model_transformers_vision in test_generate.py to pass the model_class arg. Will try to verify the tests work with a GPU later.

This model's tests are disabled unless a GPU is available simply because the models are too big for CI, not due to any lack of CPU support.

You can run them on CPU if you remove "model_transformers_vision" from pytest_collection_modifyitems() within tests/generate/conftest.py. If you have a tiny LlavaNext variant to replace "llava-hf/llava-v1.6-mistral-7b-hf", that's great!
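
For reference, a sketch of the kind of collection hook being described; the actual tests/generate/conftest.py may differ:

import pytest
import torch

def pytest_collection_modifyitems(config, items):
    # Skip fixtures that are too heavy for CPU-only CI runners.
    if not torch.cuda.is_available():
        skip_gpu = pytest.mark.skip(reason="requires GPU")
        for item in items:
            if "model_transformers_vision" in getattr(item, "fixturenames", []):
                item.add_marker(skip_gpu)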

> Currently just testing outlines.generate.text, but can add more later.

If you simply add the model's fixture to ALL_MODEL_FIXTURES within test_generate.py, it will be tested against every available generate method.
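
I.e., roughly this in test_generate.py (the surrounding fixture names are illustrative):

ALL_MODEL_FIXTURES = (
    "model_transformers",        # existing text fixture (illustrative)
    "model_transformers_vision", # register the vision fixture here
)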

> (thinking out loud) since TransformersVision is just subclassing Transformers, we could also just add some @pytest.mark.parametrize logic to test_integration_transformers.py

I think we should move away from test_integration_transformers.py unless there's a specific bug or feature we're testing the integration for. test_generate.py already tests the cross product of every sampler × every model × every generation method.

parkervg (Contributor, Author) commented Aug 1, 2024

Got it!

> I think we should move away from test_integration_transformers.py unless there's a specific bug or feature we're testing the integration for.

Since ALL_MODEL_FIXTURES tests models in the pure-text setting (i.e. not passing any images), it seems useful, given the current test directory setup, to keep test_integration_transformers_vision.py intact and use it as a place to specifically test text-image settings. I've added some tests to that end, ensuring, for example, that an error is thrown when the numbers of image tags and passed media objects don't match.

In writing the test cases, I realized I was instantiating my AutoProcessor object twice: once outside the outlines call to format the prompt (processor.apply_chat_template), and once internally, where outlines uses it to prepare the tensors for modeling (processor(prompts, ...)). To give users the option to streamline that flow, I added an apply_chat_template arg that defaults to False, used like below:

conversation = [
    {
        "role": "user",
        "content": [{"type": "text", "text": "What is this?"}, {"type": "image"}],
    },
]
generator = outlines.generate.text(model)
sequence = generator(
    conversation,
    [image],
    apply_chat_template=True,  # reuse outlines' own processor to format the conversation
)

Now, given that prompts is a valid Hugging Face conversation object, outlines will use the processor it has already initialized to pre-process the conversation into a prompt string.
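
Roughly the behavior being described, as a sketch rather than the actual outlines source (the processor attribute name is assumed; Hugging Face processors expose apply_chat_template):

# Inside the generator call, before tokenization (sketch):
if apply_chat_template:
    prompts = self.processor.apply_chat_template(
        prompts, add_generation_prompt=True
    )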

lapp0 (Contributor) commented Aug 1, 2024

> Since ALL_MODEL_FIXTURES tests models in the pure-text setting (i.e. not passing any images), it seems useful, given the current test directory setup, to keep test_integration_transformers_vision.py intact and use it as a place to specifically test text-image settings.

I don't see any problem with additional testing at all. However, you can also test *_vision model fixtures in test_generate.py: their inputs are handled by get_inputs(), and images will be passed to these models.
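
A rough sketch of the dispatch get_inputs() presumably performs; the real helper in test_generate.py may differ:

def get_inputs(fixture_name, prompt, images):
    # Vision fixtures receive (prompt, images); text fixtures only a prompt.
    if fixture_name.endswith("_vision"):
        return ("<image>" + prompt, images)
    return (prompt,)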

> In writing the test cases, I realized I was instantiating my AutoProcessor object twice: once outside the outlines call to format the prompt (processor.apply_chat_template), and once internally, where outlines uses it to prepare the tensors for modeling (processor(prompts, ...)). To give users the option to streamline that flow, I added an apply_chat_template arg that defaults to False, used like below:

I think we should leave this change out, as there are already two PRs working on this.

parkervg (Contributor, Author) commented Aug 1, 2024

My bad for missing those 2 existing PRs!

Reverted that change, will stop now to keep this small PR actually small 😊

I am seeing the error below pop up a bunch when I run test_generate.py locally with a small LlavaForConditionalGeneration model, but it looks like all 13 errors are reproducible on the main branch too. I don't currently have time to debug further, but I'm dropping the error here for future reference:

images = [[<PIL.Image.Image image mode=RGB size=210x140 at 0x7FAE23D7B670>], [<PIL.Image.Image image mode=RGB size=210x140 at 0...age image mode=RGB size=210x140 at 0x7FAE23D7B670>], [<PIL.Image.Image image mode=RGB size=210x140 at 0x7FAE23D7B670>]]
expected_ndims = 3

    def make_list_of_images(images, expected_ndims: int = 3) -> List[ImageInput]:
        """
        Ensure that the input is a list of images. If the input is a single image, it is converted to a list of length 1.
        If the input is a batch of images, it is converted to a list of images.

        Args:
            images (`ImageInput`):
                Image of images to turn into a list of images.
            expected_ndims (`int`, *optional*, defaults to 3):
                Expected number of dimensions for a single input image. If the input image has a different number of
                dimensions, an error is raised.
        """
        if is_batched(images):
            return images

        # Either the input is a single image, in which case we create a list of length 1
        if isinstance(images, PIL.Image.Image):
            # PIL images are never batched
            return [images]

        if is_valid_image(images):
            if images.ndim == expected_ndims + 1:
                # Batch of images
                images = list(images)
            elif images.ndim == expected_ndims:
                # Single image
                images = [images]
            else:
                raise ValueError(
                    f"Invalid image shape. Expected either {expected_ndims + 1} or {expected_ndims} dimensions, but got"
                    f" {images.ndim} dimensions."
                )
            return images
>       raise ValueError(
            "Invalid image type. Expected either PIL.Image.Image, numpy.ndarray, torch.Tensor, tf.Tensor or "
            f"jax.ndarray, but got {type(images)}."
        )
E       ValueError: Invalid image type. Expected either PIL.Image.Image, numpy.ndarray, torch.Tensor, tf.Tensor or jax.ndarray, but got <class 'list'>.

../../../opt/miniconda3/envs/outlines-dev/lib/python3.10/site-packages/transformers/image_utils.py:205: ValueError
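
For the eventual issue, the traceback suggests a minimal reproduction along these lines: a batch passed as a nested list of PIL images reaches transformers' image utils, which reject plain lists:

from PIL import Image
from transformers.image_utils import make_list_of_images

images = [[Image.new("RGB", (210, 140))] for _ in range(4)]
make_list_of_images(images)  # ValueError: Invalid image type ... got <class 'list'>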

lapp0 (Contributor) commented Aug 1, 2024

Thanks! No need to address in this PR if the errors exist on main. Could you open an issue with reproduction steps though?

Also please let me know when this PR is ready for review.

parkervg (Contributor, Author) commented Aug 1, 2024

@lapp0 ready for review!

lapp0 (Contributor) left a comment

Sorry for the delayed review. Thanks for the changes. Looks good to me!

rlouf merged commit c2d92b7 into dottxt-ai:main on Aug 13, 2024 (7 checks passed)
rlouf (Member) commented Aug 13, 2024

Thank you for contributing!

Successfully merging this pull request may close: transformers_vision: Default to AutoProcessor, make model_class required argument (#1076)