[Model] Add PP-Chart2Table Model Support#43767
Conversation
|
Hi @molbap @vasqu @yonigozlan , PP-Chart2Table is a model contributed by our team. Could you please take a look and review it when you have a moment? Thanks a lot! |
|
@vasqu This is PP-Chart2Table, one of the five models we plan to merge this week. PTAL.🤗 |
vasqu
left a comment
There was a problem hiding this comment.
Super solid! I only have a few smaller comments to align with a few smaller standards but overall nice job
| model_path = "PaddlePaddle/PP-Chart2Table_safetensors" | ||
| model = AutoModelForImageTextToText.from_pretrained( | ||
| model_path, | ||
| dtype="float32", |
There was a problem hiding this comment.
| dtype="float32", |
| dtype="float32", | ||
| device_map="auto", | ||
| ) | ||
| processor = AutoProcessor.from_pretrained(model_path, use_fast=True).to(model.device) |
There was a problem hiding this comment.
| processor = AutoProcessor.from_pretrained(model_path, use_fast=True).to(model.device) | |
| processor = AutoProcessor.from_pretrained(model_path).to(model.device) |
not sure but I don't think we need this
| image = Image.open(requests.get("https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/chart_parsing_02.png", stream=True).raw) | ||
| inputs = processor(images=image) | ||
|
|
||
| generated_ids = model.generate(**inputs, use_cache=True, do_sample=False, max_new_tokens=256) |
There was a problem hiding this comment.
| generated_ids = model.generate(**inputs, use_cache=True, do_sample=False, max_new_tokens=256) | |
| generated_ids = model.generate(**inputs, do_sample=False, max_new_tokens=256) |
same here, shouldn't be needed I think
| model_path = "PaddlePaddle/PP-Chart2Table_safetensors" | ||
| model = AutoModelForImageTextToText.from_pretrained( | ||
| model_path, | ||
| dtype="float32", |
There was a problem hiding this comment.
| dtype="float32", |
| ## PPChart2TableVisionPreTrainedModel | ||
|
|
||
| [[autodoc]] PPChart2TableVisionPreTrainedModel | ||
|
|
||
| ## PPChart2TablePreTrainedModel | ||
|
|
||
| [[autodoc]] PPChart2TablePreTrainedModel |
There was a problem hiding this comment.
| ## PPChart2TableVisionPreTrainedModel | |
| [[autodoc]] PPChart2TableVisionPreTrainedModel | |
| ## PPChart2TablePreTrainedModel | |
| [[autodoc]] PPChart2TablePreTrainedModel |
Nit: don't really need those exposed in the docs
| @unittest.skip(reason="PPChart2Table does not support this test.") | ||
| def test_model_is_small(self): | ||
| pass |
There was a problem hiding this comment.
Definitely shouldn't be skipped, let's shrink the model to make this pass. It's fairly important for our CI to run fast
| @unittest.skip( | ||
| reason="PPChart2Table have reused the GotOcr2 model, which does not implement the latest logic for capturing attentions and hidden_states introduced in Transformers v5." | ||
| ) | ||
| def test_get_image_features_attentions(self): | ||
| pass | ||
|
|
||
| @unittest.skip( | ||
| reason="PPChart2Table have reused the GotOcr2 model, which does not implement the latest logic for capturing attentions and hidden_states introduced in Transformers v5." | ||
| ) | ||
| def test_get_image_features_hidden_states(self): | ||
| pass |
There was a problem hiding this comment.
They are not skipped in got ocr2 so imo we should check what goes wrong
| **inputs, | ||
| use_cache=True, | ||
| do_sample=False, | ||
| max_new_tokens=1024, |
There was a problem hiding this comment.
I see that we properly end early but we probably should reduce this nonetheless
| max_new_tokens=1024, | |
| max_new_tokens=32, |
just a wild guess
| "BambaConfig": ["attn_layer_indices"], | ||
| "Dots1Config": ["max_window_layers"], | ||
| "JambaConfig": ["attn_layer_offset", "attn_layer_period", "expert_layer_offset", "expert_layer_period"], | ||
| "PPChart2TableConfig": ["tie_word_embeddings"], |
There was a problem hiding this comment.
| "PPChart2TableConfig": ["tie_word_embeddings"], |
shouldn't be needed, we have excluded some common attributes with ATTRIBUTES_TO_ALLOW in the file
| if images is None: | ||
| raise ValueError("At least one of `images` must be provided") | ||
| image_inputs = self.image_processor(images=images, **output_kwargs["images_kwargs"]) | ||
|
|
||
| # Prepare input ids for batch | ||
| if text is None: | ||
| raise ValueError("At least one of `text` must be provided") | ||
|
|
||
| if not isinstance(text, list): | ||
| text = [text] | ||
|
|
||
| input_ids = self.tokenizer(text, **output_kwargs["text_kwargs"]).input_ids | ||
|
|
There was a problem hiding this comment.
nit: except for errors, looks same as super().__call__. We can do
if text is None or images is None:
raise ValueError("Both `images` and `text` must be provided")
return super().__call(images=images, text=text, **kwargs)
|
#43514 just got merged, we need to fixup the img processor a bit, let me know if I should step in |
yonigozlan
left a comment
There was a problem hiding this comment.
Cc @vasqu changes needed after image proc refactor ;)
|
run-slow: pp_chart2table |
|
This comment contains models: ["models/pp_chart2table"] |
CI ResultsCommit Info
Model CI Report❌ 3 new failed tests from this PR 😭
|
|
run-slow: pp_chart2table |
|
[For maintainers] Suggested jobs to run (before merge) run-slow: auto, pp_chart2table |
|
This comment contains models: ["models/pp_chart2table"] |
|
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
What does this PR do?
Fixes # (issue)
Before submitting
Pull Request section?
to it if that's the case.
documentation guidelines, and
here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.