Update tiny model creation script and some others files #22006
Changes from all commits
7cb42a3
04c1283
f03e585
54060f2
529d763
300c924
0ac5760
669a7c6
718ee83
63726ab
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -31,8 +31,7 @@ class GPTSanJapaneseConfig(PretrainedConfig): | |
| This is the configuration class to store the configuration of a [`GPTSanJapaneseModel`]. It is used to instantiate | ||
| a GPTSANJapanese model according to the specified arguments, defining the model architecture. Instantiating a | ||
| configuration with the defaults will yield a similar configuration to that of the GPTSANJapanese | ||
| - [tanreinama/GPTSAN-2.8B-spout_is_uniform](https://huggingface.co/tanreinama/GPTSAN-2.8B-spout_is_uniform) | ||
| - architecture. | ||
| + [Tanrei/GPTSAN-japanese](https://huggingface.co/Tanrei/GPTSAN-japanese) architecture. | ||
|
Collaborator
Author
The previous one had the wrong name/link.
||
|
|
||
| Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the | ||
| documentation from [`PretrainedConfig`] for more information. | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -30,7 +30,8 @@ class TimesformerConfig(PretrainedConfig): | |
| This is the configuration class to store the configuration of a [`TimesformerModel`]. It is used to instantiate a | ||
| TimeSformer model according to the specified arguments, defining the model architecture. Instantiating a | ||
| configuration with the defaults will yield a similar configuration to that of the TimeSformer | ||
| - [facebook/timesformer](https://huggingface.co/facebook/timesformer-base-finetuned-k600) architecture. | ||
| + [facebook/timesformer-base-finetuned-k600](https://huggingface.co/facebook/timesformer-base-finetuned-k600) | ||
| + architecture. | ||
|
Collaborator
Author
The previous one had the wrong name/link.
||
|
|
||
| Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the | ||
| documentation from [`PretrainedConfig`] for more information. | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -30,7 +30,7 @@ class TvltConfig(PretrainedConfig): | |
| This is the configuration class to store the configuration of a [`TvltModel`]. It is used to instantiate a TVLT | ||
| model according to the specified arguments, defining the model architecture. Instantiating a configuration with the | ||
| defaults will yield a similar configuration to that of the TVLT | ||
| - [TVLT/tvlt-base](https://huggingface.co/ZinengTang/tvlt-base) architecture. | ||
| + [ZinengTang/tvlt-base](https://huggingface.co/ZinengTang/tvlt-base) architecture. | ||
|
Collaborator
Author
The previous one had the wrong name/link.
||
|
|
||
| Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the | ||
| documentation from [`PretrainedConfig`] for more information. | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -41,8 +41,8 @@ class XmodConfig(PretrainedConfig): | |
| r""" | ||
| This is the configuration class to store the configuration of a [`XmodModel`]. It is used to instantiate an X-MOD | ||
| model according to the specified arguments, defining the model architecture. Instantiating a configuration with the | ||
| - defaults will yield a similar configuration to that of the [xmod-base](https://huggingface.co/facebook/xmod-base) | ||
| - architecture. | ||
| + defaults will yield a similar configuration to that of the | ||
| + [facebook/xmod-base](https://huggingface.co/facebook/xmod-base) architecture. | ||
|
Collaborator
Author
The previous one had the wrong name/link.
||
|
|
||
| Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the | ||
| documentation from [`PretrainedConfig`] for more information. | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -56,6 +56,7 @@ def __init__( | |
| parent, | ||
| batch_size=2, | ||
| is_training=True, | ||
| + vocab_size=99, | ||
| use_auxiliary_loss=False, | ||
| num_queries=10, | ||
| num_channels=3, | ||
|
|
@@ -69,6 +70,7 @@ def __init__( | |
| self.parent = parent | ||
| self.batch_size = batch_size | ||
| self.is_training = is_training | ||
| + self.vocab_size = vocab_size | ||
|
Collaborator
Author
Need to accept
Collaborator
Author
Add
||
| self.use_auxiliary_loss = use_auxiliary_loss | ||
| self.num_queries = num_queries | ||
| self.num_channels = num_channels | ||
|
|
@@ -84,12 +86,16 @@ def prepare_config_and_inputs(self): | |
| torch_device | ||
| ) | ||
|
|
||
| - task_inputs = torch.randint(high=49408, size=(self.batch_size, self.sequence_length)).to(torch_device).long() | ||
| + task_inputs = ( | ||
| + torch.randint(high=self.vocab_size, size=(self.batch_size, self.sequence_length)).to(torch_device).long() | ||
| + ) | ||
|
|
||
| pixel_mask = torch.ones([self.batch_size, self.min_size, self.max_size], device=torch_device) | ||
|
|
||
| text_inputs = ( | ||
| - torch.randint(high=49408, size=(self.batch_size, self.num_queries - self.n_ctx, self.sequence_length)) | ||
| + torch.randint( | ||
| + high=self.vocab_size, size=(self.batch_size, self.num_queries - self.n_ctx, self.sequence_length) | ||
| + ) | ||
| .to(torch_device) | ||
| .long() | ||
| ) | ||
|
|
@@ -104,6 +110,7 @@ def prepare_config_and_inputs(self): | |
|
|
||
| def get_config(self): | ||
| config = OneFormerConfig( | ||
| + text_encoder_vocab_size=self.vocab_size, | ||
| hidden_size=self.hidden_dim, | ||
| ) | ||
|
|
||
|
|
@@ -303,8 +310,10 @@ def test_model_with_labels(self): | |
| size = (self.model_tester.min_size,) * 2 | ||
| inputs = { | ||
| "pixel_values": torch.randn((2, 3, *size), device=torch_device), | ||
| - "task_inputs": torch.randint(high=49408, size=(2, 77), device=torch_device).long(), | ||
| - "text_inputs": torch.randint(high=49408, size=(2, 134, 77), device=torch_device).long(), | ||
| + "task_inputs": torch.randint(high=self.model_tester.vocab_size, size=(2, 77), device=torch_device).long(), | ||
| + "text_inputs": torch.randint( | ||
| + high=self.model_tester.vocab_size, size=(2, 134, 77), device=torch_device | ||
| + ).long(), | ||
| "mask_labels": torch.randn((2, 150, *size), device=torch_device), | ||
| "class_labels": torch.zeros(2, 150, device=torch_device).long(), | ||
| } | ||
|
|
||
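The hunks above replace the hard-coded bound 49408 with the tester's `vocab_size` and pass the same value to the config, so random token IDs always fall inside the tiny model's embedding table. The following is a minimal, torch-free sketch of that idea. `TinyTextTester` and `embed` are hypothetical names used only for illustration (they are not part of the OneFormer tests), and the list-based table is an assumed stand-in for a real embedding layer.

```python
import random


class TinyTextTester:
    """Hypothetical stand-in for a model tester (illustration only)."""

    def __init__(self, batch_size=2, sequence_length=77, vocab_size=99):
        self.batch_size = batch_size
        self.sequence_length = sequence_length
        self.vocab_size = vocab_size

    def get_config(self):
        # The same value sizes the embedding table and bounds the random IDs.
        return {"text_encoder_vocab_size": self.vocab_size}

    def prepare_task_inputs(self, rng):
        # Pure-Python analogue of torch.randint(high=self.vocab_size, size=(batch, seq)).
        return [
            [rng.randrange(self.vocab_size) for _ in range(self.sequence_length)]
            for _ in range(self.batch_size)
        ]


def embed(token_ids, table):
    """Look up each token ID in a list-based embedding table."""
    return [[table[i] for i in row] for row in token_ids]


tester = TinyTextTester()
config = tester.get_config()
# One 4-dimensional row per vocabulary entry (the width 4 is arbitrary).
table = [[0.0] * 4 for _ in range(config["text_encoder_vocab_size"])]
inputs = tester.prepare_task_inputs(random.Random(0))
embedded = embed(inputs, table)  # every ID is below vocab_size, so each lookup succeeds
```

In this sketch, an ID drawn under the old bound (for example 49407) raises `IndexError` against the 99-row table. That is the kind of mismatch the change avoids once the config's vocabulary is made small.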
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -103,6 +103,7 @@ def __init__( | |
| batch_size=13, | ||
| seq_length=7, | ||
| is_training=False, | ||
| + vocab_size=81, | ||
|
Collaborator
Author
Same as above (for
||
| hidden_size=24, | ||
| num_hidden_layers=4, | ||
| num_attention_heads=2, | ||
|
|
@@ -112,6 +113,7 @@ def __init__( | |
| self.batch_size = batch_size | ||
| self.seq_length = seq_length | ||
| self.is_training = is_training | ||
| + self.vocab_size = vocab_size | ||
| self.hidden_size = hidden_size | ||
| self.num_hidden_layers = num_hidden_layers | ||
| self.num_attention_heads = num_attention_heads | ||
|
|
@@ -140,6 +142,7 @@ def prepare_config_and_inputs_for_common(self): | |
|
|
||
| def get_config(self): | ||
| return SpeechT5Config( | ||
| + vocab_size=self.vocab_size, | ||
| hidden_size=self.hidden_size, | ||
| encoder_layers=self.num_hidden_layers, | ||
| decoder_layers=self.num_hidden_layers, | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -51,10 +51,12 @@ def get_checkpoint_from_config_class(config_class): | |
| config_source = inspect.getsource(config_class) | ||
| checkpoints = _re_checkpoint.findall(config_source) | ||
|
|
||
| - for checkpoint in checkpoints: | ||
| - # Each `checkpoint` is a tuple of a checkpoint name and a checkpoint link. | ||
| - # For example, `('bert-base-uncased', 'https://huggingface.co/bert-base-uncased')` | ||
| - ckpt_name, ckpt_link = checkpoint | ||
| + # Each `checkpoint` is a tuple of a checkpoint name and a checkpoint link. | ||
| + # For example, `('bert-base-uncased', 'https://huggingface.co/bert-base-uncased')` | ||
| + for ckpt_name, ckpt_link in checkpoints: | ||
| + # allow the link to end with `/` | ||
| + if ckpt_link.endswith("/"): | ||
| + ckpt_link = ckpt_link[:-1] | ||
|
Collaborator
Author
To allow a link that ends with `/`, such as https://huggingface.co/BridgeTower/bridgetower-base/
Collaborator
Author
Also, previously:
    if ckpt_link == ckpt_link_from_name:
        checkpoint = ckpt_name
        break
We only want to return
||
|
|
||
| # verify the checkpoint name corresponds to the checkpoint link | ||
| ckpt_link_from_name = f"https://huggingface.co/{ckpt_name}" | ||
|
|
||
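The check in this hunk can be sketched as a standalone function: collect `(name, link)` pairs from a config docstring, tolerate a trailing `/` on the link, and accept a name only when `https://huggingface.co/{name}` equals the link. This is a sketch under assumptions, not the script's actual code. The regular expression and the function name `find_checkpoint` are written for illustration, and the real `_re_checkpoint` pattern may differ.

```python
import re

# Assumed pattern (not taken from the script): a markdown link whose target is
# on huggingface.co, captured as (name, link).
_re_checkpoint = re.compile(r"\[([^\[\]]+)\]\((https://huggingface\.co/[^)\s]+)\)")


def find_checkpoint(docstring):
    """Return the first checkpoint name whose link matches it, else None."""
    for ckpt_name, ckpt_link in _re_checkpoint.findall(docstring):
        # Allow the link to end with `/`.
        if ckpt_link.endswith("/"):
            ckpt_link = ckpt_link[:-1]
        # The name is valid only if it reproduces the link exactly.
        if ckpt_link == f"https://huggingface.co/{ckpt_name}":
            return ckpt_name
    return None


good = "that of the [facebook/xmod-base](https://huggingface.co/facebook/xmod-base) architecture."
bad = "that of the TVLT [TVLT/tvlt-base](https://huggingface.co/ZinengTang/tvlt-base) architecture."
slash = "[BridgeTower/bridgetower-base](https://huggingface.co/BridgeTower/bridgetower-base/)"

print(find_checkpoint(good))
print(find_checkpoint(bad))
print(find_checkpoint(slash))
```

Under these assumptions, the old TVLT docstring yields `None` because its name and link disagree. This is the kind of mismatch the docstring fixes in this pull request correct.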
Just to add the missing entries in mappings.