Anole add model #36047

Closed
zucchini-nlp wants to merge 14 commits into huggingface:main from zucchini-nlp:anole-add-model
@zucchini-nlp
Member

What does this PR do?

Adds Anole as a new model, a new PR based on #32013

@zucchini-nlp
Member Author

This PR is ready. One thing to note is that image generation quality is highly variable, and even with the CFG we have, it is not great. The original repo is also not as good as the latest image generation models, and it uses a slightly different CFG for instruct-based models.

I can look into matching at least the original repo's quality, but it seems Anole is no longer a top model. Its advantage was support for interleaved generation, and I believe Janus supports that now as well, given that it is one model handling both modalities.

So now I am not sure whether it is still worth shipping Anole. @ArthurZucker WDYT?

@zucchini-nlp changed the title from "[WIP] Anole add model" to "Anole add model" on Feb 5, 2025
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@ArthurZucker
Collaborator

Up to you, but if Janus is easier to add we should probably add it rather than this one, unless the logic is the same!
Sorry that you had to spend time on it 😾

@zucchini-nlp
Member Author

Yeah, same feeling here: it isn't worth maintaining, but this might help @yaswanth19 with shipping Janus.

@ArthurZucker
Collaborator

Okay I'll review when I have time then!

return hidden_states


class AnoleVQVAEEncoderResnetBlock(nn.Module):
Contributor

@yaswanth19 Feb 16, 2025

@zucchini-nlp Am I missing something, or is AnoleVQVAEEncoderResnetBlock similar to AnoleVQVAEResnetBlock? I can see it is inherited from ChameleonVQVAEEncoderResnetBlock, so during unravelling a duplicate block is created, because we are not overwriting the encoder part with AnoleVQVAEResnetBlock 🤔

@zucchini-nlp
Member Author

Yep, completely identical modules. I just didn't want to use a module with an encoder prefix while decoding, so in the modular file I added a general Resnet block inherited from EncoderResnet.
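
As a rough illustration of the pattern described in this reply (not the PR's actual code), a neutrally named block can subclass the encoder-prefixed one without adding any behaviour. The class names below are simplified, hypothetical stand-ins, and plain Python classes are used instead of nn.Module so the sketch runs without PyTorch.

```python
class EncoderResnetBlock:
    """Hypothetical stand-in for an encoder-prefixed block."""

    def __init__(self, in_channels, out_channels=None):
        self.in_channels = in_channels
        # Default to a channel-preserving block when no output size is given.
        self.out_channels = in_channels if out_channels is None else out_channels

    def describe(self):
        return f"{type(self).__name__}({self.in_channels} -> {self.out_channels})"


class ResnetBlock(EncoderResnetBlock):
    """Neutral name for use in a decoder; inherits everything unchanged."""


# The two classes share the same implementation; only the name differs.
decoder_block = ResnetBlock(64, 128)
print(decoder_block.describe())
print(ResnetBlock.describe is EncoderResnetBlock.describe)
```

The trade-off raised in the review still applies: if the encoder keeps using the encoder-prefixed class, both names end up in the generated code even though they are functionally identical.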

# compute in_ch_mult, block_in and curr_res at lowest res
block_in = base_channels * config.channel_multiplier[self.num_resolutions - 1]
curr_res = resolution // 2 ** (self.num_resolutions - 1)
self.z_shape = (1, latent_channels, curr_res, curr_res)
Contributor

@yaswanth19 Feb 16, 2025

Just a query: what is the use of curr_res? Also, we are not using self.z_shape anywhere else.

@zucchini-nlp
Member Author

Nope, I might have forgotten to remove it after a small refactor.
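
For reference, the arithmetic in the snippet under review can be worked through in isolation. This is only a sketch: it assumes num_resolutions equals len(channel_multiplier), and the config values used are hypothetical examples rather than values confirmed in this thread.

```python
def lowest_resolution_shape(base_channels, channel_multiplier, resolution, latent_channels):
    """Reproduce the block_in / curr_res / z_shape arithmetic from the reviewed lines."""
    # Assumption: one resolution level per channel multiplier entry.
    num_resolutions = len(channel_multiplier)
    # Channel width at the lowest-resolution level.
    block_in = base_channels * channel_multiplier[num_resolutions - 1]
    # Spatial size after halving once per level below the first.
    curr_res = resolution // 2 ** (num_resolutions - 1)
    z_shape = (1, latent_channels, curr_res, curr_res)
    return block_in, curr_res, z_shape


# Hypothetical example values, chosen only to make the numbers concrete.
block_in, curr_res, z_shape = lowest_resolution_shape(
    base_channels=128,
    channel_multiplier=[1, 1, 2, 2, 4],
    resolution=512,
    latent_channels=256,
)
print(block_in, curr_res, z_shape)  # 512 32 (1, 256, 32, 32)
```

As the review points out, curr_res only feeds z_shape here, so if z_shape is never read, both can be dropped without changing behaviour.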

4 participants