This PR is ready. One thing to note is that image generation quality is very random, and even with the CFG we have, it is not the best. The original repo is not as good as the latest image generation models either, and it uses a slightly different CFG for instruct-based models. I can take a look at trying to match at least the original repo's quality, but it seems Anole is not a top model anymore. Its advantage was that interleaved generation is possible; I believe Janus also supports it now, given that it is one model handling both modalities. So now I am not sure whether it is still worth shipping Anole. @ArthurZucker WDYT?
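For readers unfamiliar with the CFG (classifier-free guidance) mentioned above, here is a minimal sketch of the commonly used logit-blending formula. It is not the exact variant used in this PR or in the original Anole repo; `apply_cfg` and the sample values are hypothetical and only illustrate the idea.

```python
def apply_cfg(cond_logits, uncond_logits, guidance_scale):
    """Blend conditional and unconditional logits with the standard CFG formula.

    guided = uncond + guidance_scale * (cond - uncond)
    A scale of 1.0 returns the conditional logits unchanged.
    """
    if len(cond_logits) != len(uncond_logits):
        raise ValueError("logit vectors must have the same length")
    return [u + guidance_scale * (c - u) for c, u in zip(cond_logits, uncond_logits)]


# Illustrative values only (not taken from any model).
cond = [2.0, 0.5, -1.0]    # logits computed with the prompt
uncond = [1.0, 0.5, 0.0]   # logits computed without the prompt
guided = apply_cfg(cond, uncond, guidance_scale=3.0)
print(guided)
```

A larger scale pushes the distribution further towards what the prompt favours, which is one reason output quality can be sensitive to how the guidance is applied.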
Up to you, but if Janus is easier to add, we should probably add it rather than this one! Unless the logic is the same!
Yeah, same feeling here: it isn't worth maintaining, but this might help @yaswanth19 with shipping Janus.
Okay, I'll review when I have time then!
        return hidden_states


class AnoleVQVAEEncoderResnetBlock(nn.Module):
@zucchini-nlp Am I missing something, or is AnoleVQVAEEncoderResnetBlock the same as AnoleVQVAEResnetBlock? I can see it is inherited from ChameleonVQVAEEncoderResnetBlock, so while unravelling, a duplicate block is created because we are not overwriting the Encoder part with AnoleVQVAEResnetBlock 🤔
Yep, completely identical modules. I just didn't want to use a module with an Encoder prefix while decoding. So in the modular file I added a general Resnet block inherited from EncoderResnet.
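A rough sketch of the pattern described in this thread, using plain Python stand-ins rather than the real `nn.Module` classes. `EncoderResnetBlock` and `ResnetBlock` are hypothetical names chosen for illustration, and the `forward` body is a placeholder.

```python
class EncoderResnetBlock:
    """Hypothetical stand-in for the encoder-prefixed block being inherited from."""

    def __init__(self, in_channels, out_channels=None):
        self.in_channels = in_channels
        # Default to a channel-preserving block when no output width is given.
        self.out_channels = in_channels if out_channels is None else out_channels

    def forward(self, hidden_states):
        # Placeholder: a real block would apply norm, conv and a residual add.
        return hidden_states


class ResnetBlock(EncoderResnetBlock):
    """Same behaviour under a neutral name, so the decoder need not use an Encoder-prefixed class."""


# The subclass adds nothing: both names resolve to the same methods.
print(ResnetBlock.forward is EncoderResnetBlock.forward)
```

Because the subclass overrides nothing, unrolling both classes into standalone definitions would produce two identical blocks, which matches the duplication noted in the question.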
        # compute in_ch_mult, block_in and curr_res at lowest res
        block_in = base_channels * config.channel_multiplier[self.num_resolutions - 1]
        curr_res = resolution // 2 ** (self.num_resolutions - 1)
        self.z_shape = (1, latent_channels, curr_res, curr_res)
Just a query: what is the use of curr_res? Also, we are not using self.z_shape anywhere else.
Nope, I might have forgotten to remove it after a small refactor.
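To make the arithmetic in the snippet under discussion concrete, here is a small worked example. The config values are illustrative assumptions, not read from the PR, and `num_resolutions = len(channel_multiplier)` is also an assumption about how that attribute is defined; `lowest_resolution_shape` is a hypothetical helper.

```python
def lowest_resolution_shape(base_channels, channel_multiplier, resolution, latent_channels):
    """Reproduce the block_in / curr_res / z_shape computation from the diff."""
    num_resolutions = len(channel_multiplier)  # assumed definition
    # Channel width at the deepest (lowest-resolution) level.
    block_in = base_channels * channel_multiplier[num_resolutions - 1]
    # ** binds tighter than //, so this is resolution // (2 ** (num_resolutions - 1)).
    curr_res = resolution // 2 ** (num_resolutions - 1)
    z_shape = (1, latent_channels, curr_res, curr_res)
    return block_in, curr_res, z_shape


# Illustrative values (assumed for this example).
block_in, curr_res, z_shape = lowest_resolution_shape(
    base_channels=128,
    channel_multiplier=[1, 1, 2, 2, 4],
    resolution=512,
    latent_channels=256,
)
print(block_in, curr_res, z_shape)  # 512 32 (1, 256, 32, 32)
```

With five resolution levels the input is halved four times, so a 512-pixel side becomes 32. As the thread notes, `curr_res` only feeds `z_shape`, so both can go if `z_shape` is unused.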
What does this PR do?
Adds Anole as a new model, a new PR based on #32013