Add Image Completion Transformer (ICT)#21990
amyeroberts
left a comment
Looking good!
Left a first pass review with some general comments - mostly nits and formatting. Super exciting to get this model added :D
Just to slightly nudge here 😅
Am I missing a step (sampling?) between the output from the transformer and the Guided Upsampler?
https://github.com/raywzy/ICT/blob/59dd12d374d47cdf0dce90923017ca3657e6aa0b/Transformer/utils/util.py#L107-L114
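For context, a sampling step of the kind the question refers to might look like the sketch below. It assumes the transformer emits per-position logits over a palette of colour clusters and that sampled indices are mapped back to RGB before upsampling. The function name `sample_appearance_prior`, the `top_k` and `temperature` parameters, and the tensor shapes are illustrative assumptions, not taken from the PR or the linked file.

```python
import torch
import torch.nn.functional as F


def sample_appearance_prior(logits, clusters, top_k=50, temperature=1.0, generator=None):
    """Sample one cluster index per position and map it to an RGB value.

    logits:   (batch, seq_len, num_clusters) raw transformer outputs (assumed shape)
    clusters: (num_clusters, 3) RGB palette (assumed shape)
    """
    logits = logits / temperature
    # Keep only the top_k logits per position and mask out the rest.
    kth_value = torch.topk(logits, top_k, dim=-1).values[..., -1:]
    logits = logits.masked_fill(logits < kth_value, float("-inf"))
    probs = F.softmax(logits, dim=-1)
    # torch.multinomial expects a 1D or 2D input, so flatten batch and sequence.
    flat = probs.reshape(-1, probs.size(-1))
    indices = torch.multinomial(flat, num_samples=1, generator=generator)
    indices = indices.reshape(probs.shape[:-1])
    return clusters[indices]  # (batch, seq_len, 3)


logits = torch.randn(2, 16, 512)
clusters = torch.rand(512, 3)
prior = sample_appearance_prior(logits, clusters)
print(prior.shape)  # torch.Size([2, 16, 3])
```

Passing an explicit `generator` keeps the sampling reproducible without touching the global random state.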
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
amyeroberts
left a comment
Thanks for the continued work adding this model!
I've done a first pass over the main things I spotted. Most files and the overall structure look good. Just make sure to remove any old comments / code that's left.
The main comment is about the structure of the modeling code. Quite a few pieces don't match the standard patterns of the library, e.g. overriding __call__ instead of forward, or don't treat tensors canonically, e.g. using for loops to iterate over a batch. Once this has been reworked I'll do another pass.
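As an illustration of the two patterns mentioned, the sketch below defines the computation in `forward` (so `nn.Module.__call__` can still run its hooks) and replaces a per-sample Python loop with one batched operation. The module `ClusterLookup` and its random palette are hypothetical and only stand in for the PR's code.

```python
import torch
from torch import nn


class ClusterLookup(nn.Module):
    """Hypothetical module: maps each pixel to its nearest palette colour."""

    def __init__(self, num_clusters=8):
        super().__init__()
        self.register_buffer("clusters", torch.rand(num_clusters, 3))

    def forward(self, pixels):
        # pixels: (batch, num_pixels, 3); broadcasting handles the whole batch at once.
        distances = (pixels.unsqueeze(-2) - self.clusters).pow(2).sum(dim=-1)
        return distances.argmin(dim=-1)  # (batch, num_pixels)


def lookup_with_loop(module, pixels):
    # The pattern to avoid: iterating over the batch in Python.
    return torch.stack([module(sample.unsqueeze(0))[0] for sample in pixels])


module = ClusterLookup()
pixels = torch.rand(4, 10, 3)
batched = module(pixels)  # dispatches to forward via nn.Module.__call__
looped = lookup_with_loop(module, pixels)
print(torch.equal(batched, looped))  # True
```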
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
    img = img.float()
    img = F.interpolate(img, size=(target_height, target_width), mode="bicubic")
@amyeroberts This code used to convert torch -> numpy -> PIL, but it was refactored to do the resizing in torch only. It now seems to fail the tests that check for determinism. My guess is that it has something to do with interpolate, but I couldn't figure it out.
Could you specify which tests are failing?
Doing torch -> numpy -> PIL is quite funny within the model, especially when torch already can upsample and doing so will remove gradient information. Is that pattern from the original codebase?
Why do you suspect that it's coming from interpolate?
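The point about gradients can be checked directly. The sketch below, with made-up tensor sizes, resizes a tensor with `F.interpolate` and confirms that gradients still reach the input, whereas a NumPy round trip needs `detach()`, which cuts the tensor out of the autograd graph. It also checks that repeated bicubic resizing on CPU gives identical results, which is an observation about this toy case rather than a guarantee for every backend.

```python
import torch
import torch.nn.functional as F

img = torch.rand(1, 3, 8, 8, requires_grad=True)

# Resizing in torch keeps the operation inside the autograd graph.
resized = F.interpolate(img, size=(16, 16), mode="bicubic", align_corners=False)
resized.sum().backward()
print(img.grad is not None)  # True

# A NumPy round trip needs detach(), so the result no longer tracks gradients.
round_trip = torch.from_numpy(img.detach().numpy())
print(round_trip.requires_grad)  # False

# On CPU, repeated calls with the same input give identical results.
again = F.interpolate(img, size=(16, 16), mode="bicubic", align_corners=False)
print(torch.equal(resized, again))  # True
```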
Yup, that was the pattern from the original codebase (from here and here).
Here are the five failing tests, grouped by cause.
IctModelTest.test_determinism
IctModelTest.test_model_outputs_equivalence
IctModelTest.test_save_load
Never mind about interpolate. While debugging, I saw that appearance_prior was changing between runs and realized that the change comes from
pred = torch.multinomial(probs, num_samples=1)
But I set torch.manual_seed(3) in test_modeling_ict.py and it doesn't seem to be working...?
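One possible explanation, offered as an assumption rather than a diagnosis: seeding fixes where the random stream starts, but every `torch.multinomial` call advances it, so two forward passes in the same test can draw different samples unless the seed (or a dedicated `torch.Generator`) is reset before each pass. A minimal sketch with a hypothetical `sample` helper:

```python
import torch

probs = torch.full((4, 512), 1.0 / 512)


def sample(probs, generator=None):
    # Stand-in for the sampling line quoted above.
    return torch.multinomial(probs, num_samples=1, generator=generator)


# Seeding once: the second call continues the random stream,
# so it is not guaranteed to reproduce the first draw.
torch.manual_seed(3)
first = sample(probs)
second = sample(probs)

# Re-seeding before the call reproduces the first draw exactly.
torch.manual_seed(3)
first_again = sample(probs)
print(torch.equal(first, first_again))  # True

# Alternatively, pass an explicitly seeded generator on each call.
a = sample(probs, torch.Generator().manual_seed(3))
b = sample(probs, torch.Generator().manual_seed(3))
print(torch.equal(a, b))  # True
```

If that is what happens here, the tests would need either a deterministic path through the model or a seed that is reset inside the model call rather than once per test file.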
IctModelTest.test_retain_grad_hidden_states_attentions
I'm getting the error below, but I still haven't been able to figure out why
self.assertIsNotNone(hidden_states.grad)
AssertionError: unexpectedly None
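A common cause of this assertion, stated here as a guess about this model, is a non-differentiable step between the hidden states and the output used for the backward pass. Sampling with `torch.multinomial`, `argmax`, `detach()`, or a NumPy/PIL round trip all break the graph. The sketch uses toy linear layers, not the PR's model:

```python
import torch
from torch import nn

layer = nn.Linear(4, 4)
head = nn.Linear(4, 1)
inputs = torch.rand(2, 4)

# Differentiable path: the loss depends on hidden, so hidden.grad is populated.
hidden = layer(inputs)
hidden.retain_grad()
head(hidden).sum().backward()
grad_on_connected_path = hidden.grad
print(grad_on_connected_path is not None)  # True

# Broken path: detach() cuts the graph, so nothing flows back to hidden.
hidden = layer(inputs)
hidden.retain_grad()
head(hidden.detach()).sum().backward()
grad_on_broken_path = hidden.grad
print(grad_on_broken_path is None)  # True
```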
IctModelIntegrationTest.test_inference_masked_image_modeling
My guess is that I need to run the original model with the same image and see if the tensors match? If the original model also uses randomization within the code (not just the weights), how should I go about comparing these?
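One way to approach this, sketched under the assumption that both implementations expose their pre-sampling logits: compare the deterministic logits with a tolerance, and check the sampled output separately under a fixed seed. The two `nn.Linear` stand-ins and the `sample` helper below are hypothetical and only illustrate the comparison.

```python
import torch
from torch import nn

torch.manual_seed(0)
original = nn.Linear(8, 16)  # stand-in for the original model
ported = nn.Linear(8, 16)  # stand-in for the ported model
ported.load_state_dict(original.state_dict())  # mimics weight conversion

inputs = torch.rand(2, 8)
with torch.no_grad():
    expected_logits = original(inputs)
    actual_logits = ported(inputs)

# Compare the deterministic part with a tolerance.
torch.testing.assert_close(actual_logits, expected_logits, atol=1e-4, rtol=1e-4)


def sample(logits, seed):
    # Seed a private generator so the draw does not depend on global state.
    generator = torch.Generator().manual_seed(seed)
    return torch.multinomial(logits.softmax(dim=-1), num_samples=1, generator=generator)


same_samples = torch.equal(sample(expected_logits, seed=3), sample(actual_logits, seed=3))
print(same_samples)  # True
```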
Another check that's failing is build / build_pr_documentation, which says:
The docstring of IctModel.forward comports the following issue(s) and needs
fixing:
- The return block is empty.
which seems to be pointing at this part:
    def forward(
        self,
        pixel_values: Optional[torch.Tensor],
        bool_masked_pos: Optional[torch.BoolTensor] = None,
        clusters: Optional[torch.Tensor] = None,
        output_attentions: Optional[bool] = None,
        output_hidden_states: Optional[bool] = None,
        return_dict: Optional[bool] = None,
    ) -> Union[Tuple, MaskedImageModelingOutput]:
        r"""
        Returns:
        Example:
        ```python
        >>> import torch
        >>> import numpy as np
        >>> from PIL import Image
        >>> import requests

But I looked at the forward methods of other models and their Returns: blocks also appear to be empty, so I left it as it is. Maybe I'm missing something.
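A likely reason the `Returns:` blocks look empty in other models: as far as I recall, the library fills them in at import time through a decorator on `forward` (`replace_return_docstrings`), so the source shows only the bare heading. The sketch below imitates that mechanism with a hypothetical `fill_returns` decorator; it is an illustration, not the library's implementation.

```python
def fill_returns(output_type):
    """Hypothetical decorator: writes a description under an empty `Returns:` heading."""

    def decorator(fn):
        lines = fn.__doc__.split("\n")
        for i, line in enumerate(lines):
            if line.strip() == "Returns:":
                # Reuse the heading's indentation for the inserted line.
                indent = line[: len(line) - len(line.lstrip())]
                lines.insert(i + 1, f"{indent}    `{output_type}` or `tuple(torch.FloatTensor)`")
                break
        fn.__doc__ = "\n".join(lines)
        return fn

    return decorator


@fill_returns("MaskedImageModelingOutput")
def forward(pixel_values):
    r"""
    Returns:

    Example:
    """
    return pixel_values


print(forward.__doc__)
```

If the docstring check complains about an empty return block, a missing decorator of this kind on `IctModel.forward` would be one thing to look for.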
Also, this tells me that there's a style error somewhere, but it doesn't tell me where exactly 😅
What does this PR do?
Fixes # (issue)
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.