[Model] Add PP-Chart2Table Model Support #43767
Merged
68 commits
8aa566b
init
XingweiDeng 1a5908d
fix doc
XingweiDeng c51b1c6
update
XingweiDeng 5e3f1d3
update
XingweiDeng 2c064cc
update
XingweiDeng 81514c0
update
XingweiDeng fc7c75f
update
XingweiDeng 01f2b29
update
XingweiDeng d8cc881
update
XingweiDeng 117a6cb
update
XingweiDeng 053f59e
Merge remote-tracking branch 'official/main' into feat/pp_chart2table
XingweiDeng db1e9a8
refactor image_processor_fast
XingweiDeng d8763e5
update
XingweiDeng 3d8a654
update
XingweiDeng 4abb70d
update
XingweiDeng 1efd48b
update
XingweiDeng 779cbca
update
XingweiDeng d61079a
update
XingweiDeng 618c63c
update
XingweiDeng 1079052
update
XingweiDeng 974d3b1
update
XingweiDeng 3f01494
update
XingweiDeng 3b91e2d
update
XingweiDeng 587652c
merge transformers main
XingweiDeng 65b7d01
update
XingweiDeng cc85b83
update
XingweiDeng b419732
update
XingweiDeng cc8bbca
update
XingweiDeng 9094eb5
update
XingweiDeng 55664d1
update
XingweiDeng 6bb4dbc
upddate
XingweiDeng f79d83b
update
XingweiDeng d050fe6
update
XingweiDeng bae2c96
update
XingweiDeng 8e4062b
update
XingweiDeng ac2bc66
update
XingweiDeng 636e759
Merge official main
XingweiDeng 45907f9
update
XingweiDeng d0bf04f
update
XingweiDeng 0f7ed31
update
XingweiDeng beea17f
update
XingweiDeng 6915583
update
XingweiDeng 6fe075b
update
XingweiDeng ff8f634
Merge remote-tracking branch 'official/main' into feat/pp_chart2table
XingweiDeng 86e9ec5
update
XingweiDeng 6d791e7
small fixes
vasqu 8e201c1
more explicit skip msg
vasqu 787e650
some quick fixes
vasqu da92f90
Merge branch 'main' into feat/pp_chart2table
vasqu 7280cf5
fix
vasqu 957220a
quick cleanups
vasqu 9f74fa1
update
XingweiDeng bcaef3d
merge
XingweiDeng f33cfb5
update
XingweiDeng 8394c08
update
XingweiDeng 3c0b028
update
XingweiDeng b50607c
update
XingweiDeng 44529f7
update
XingweiDeng ca92529
merge
XingweiDeng ba238c8
update
XingweiDeng e7401f0
fixup after new refactor
vasqu 28653c3
fix
vasqu d7d8ee8
update
XingweiDeng d71e07b
update
XingweiDeng c095f11
last fixups
vasqu eb5c2a5
update
XingweiDeng 081537c
Merge branch 'main' into feat/pp_chart2table
vasqu bcccd9d
remove my todos I left there
vasqu
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,204 @@ | ||
| <!--Copyright 2026 The HuggingFace Team. All rights reserved. | ||
|
|
||
| Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with | ||
| the License. You may obtain a copy of the License at | ||
|
|
||
| http://www.apache.org/licenses/LICENSE-2.0 | ||
|
|
||
| Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on | ||
| an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the | ||
| specific language governing permissions and limitations under the License. | ||
|
|
||
| ⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be | ||
| rendered properly in your Markdown viewer. | ||
|
|
||
| --> | ||
| *This model was released on 2025-05-20 and added to Hugging Face Transformers on 2026-03-18.* | ||
|
|
||
| # PP-Chart2Table | ||
|
|
||
| <div class="flex flex-wrap space-x-1"> | ||
| <img alt="PyTorch" src="https://img.shields.io/badge/PyTorch-DE3412?style=flat&logo=pytorch&logoColor=white"> | ||
| </div> | ||
|
|
||
| ## Overview | ||
|
|
||
| **PP-Chart2Table** is a state-of-the-art multimodal model developed by the PaddlePaddle team that specializes in parsing Chinese and English charts. It is trained with a novel "Shuffled Chart Data Retrieval" task which, combined with a refined token masking strategy, significantly improves its efficiency at converting charts to data tables. The training set is produced by a data synthesis pipeline that uses high-quality seed data, RAG, and LLM persona design to make the data richer and more diverse. To make use of large-scale unlabeled, out-of-distribution (OOD) data, the team applied a two-stage distillation process that improves adaptability and generalization on real-world data. | ||
|
|
||
| ## Model Architecture | ||
| PP-Chart2Table adopts a multimodal fusion architecture that combines a vision tower for chart feature extraction and a language model for table structure generation, enabling end-to-end chart-to-table conversion. | ||
|
|
||
|
|
||
| ## Usage | ||
|
|
||
| ### Single input inference | ||
|
|
||
| The example below demonstrates how to convert a chart image into a table with PP-Chart2Table using [`Pipeline`] or [`AutoModel`]. | ||
|
|
||
| <hfoptions id="usage"> | ||
| <hfoption id="Pipeline"> | ||
|
|
||
| ```py | ||
| from transformers import pipeline | ||
|
|
||
| pipe = pipeline("image-text-to-text", model="PaddlePaddle/PP-Chart2Table_safetensors") | ||
|
|
||
| # PPChart2TableProcessor uses hardcoded "Chart to table" instruction internally via chat template | ||
| conversation = [ | ||
| { | ||
| "role": "user", | ||
| "content": [ | ||
| { | ||
| "type": "image", | ||
| "url": "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/chart_parsing_02.png", | ||
| }, | ||
| ], | ||
| }, | ||
| ] | ||
| result = pipe(text=conversation) | ||
| print(result[0]["generated_text"]) | ||
|
|
||
| ``` | ||
|
|
||
| </hfoption> | ||
|
|
||
| <hfoption id="AutoModel"> | ||
|
|
||
| ```py | ||
| from transformers import AutoModelForImageTextToText, AutoProcessor | ||
|
|
||
| model_path = "PaddlePaddle/PP-Chart2Table_safetensors" | ||
| model = AutoModelForImageTextToText.from_pretrained( | ||
| model_path, | ||
| device_map="auto", | ||
| ) | ||
| processor = AutoProcessor.from_pretrained(model_path) | ||
|
|
||
| # PPChart2TableProcessor uses hardcoded "Chart to table" instruction internally via chat template | ||
| conversation = [ | ||
| { | ||
| "role": "user", | ||
| "content": [ | ||
| { | ||
| "type": "image", | ||
| "url": "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/chart_parsing_02.png", | ||
| }, | ||
| ], | ||
| }, | ||
| ] | ||
|
|
||
| inputs = processor.apply_chat_template( | ||
| conversation, | ||
| tokenize=True, | ||
| add_generation_prompt=True, | ||
| truncation=True, | ||
| return_dict=True, | ||
| return_tensors="pt", | ||
| ).to(model.device) | ||
|
|
||
| generated_ids = model.generate(**inputs, do_sample=False, max_new_tokens=256) | ||
| generated_ids_trimmed = [out_ids[len(in_ids) :] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)] | ||
| result = processor.batch_decode(generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False) | ||
| print(result) | ||
|
|
||
| ``` | ||
|
|
||
| </hfoption> | ||
| </hfoptions> | ||
|
|
||
| ### Batched inference | ||
|
|
||
| The example below demonstrates how to run batched inference on several chart images with PP-Chart2Table using [`Pipeline`] or [`AutoModel`]. | ||
|
|
||
| <hfoptions id="usage"> | ||
| <hfoption id="Pipeline"> | ||
|
|
||
| ```py | ||
| from transformers import pipeline | ||
|
|
||
| pipe = pipeline("image-text-to-text", model="PaddlePaddle/PP-Chart2Table_safetensors") | ||
|
|
||
| # PPChart2TableProcessor uses hardcoded "Chart to table" instruction internally via chat template | ||
| conversation = [ | ||
| { | ||
| "role": "user", | ||
| "content": [ | ||
| { | ||
| "type": "image", | ||
| "url": "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/chart_parsing_02.png", | ||
| }, | ||
| ], | ||
| }, | ||
| ] | ||
| result = pipe(text=[conversation, conversation]) | ||
| print(result[0][0]["generated_text"]) | ||
|
|
||
| ``` | ||
|
|
||
| </hfoption> | ||
|
|
||
| <hfoption id="AutoModel"> | ||
|
|
||
| ```py | ||
| from transformers import AutoModelForImageTextToText, AutoProcessor | ||
|
|
||
| model_path = "PaddlePaddle/PP-Chart2Table_safetensors" | ||
| model = AutoModelForImageTextToText.from_pretrained( | ||
| model_path, | ||
| device_map="auto", | ||
| ) | ||
| processor = AutoProcessor.from_pretrained(model_path) | ||
|
|
||
| # PPChart2TableProcessor uses hardcoded "Chart to table" instruction internally via chat template | ||
| conversation = [ | ||
| { | ||
| "role": "user", | ||
| "content": [ | ||
| { | ||
| "type": "image", | ||
| "url": "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/chart_parsing_02.png", | ||
| }, | ||
| ], | ||
| }, | ||
| ] | ||
|
|
||
| batch_conversation = [conversation, conversation] | ||
| inputs = processor.apply_chat_template( | ||
| batch_conversation, | ||
| tokenize=True, | ||
| add_generation_prompt=True, | ||
| truncation=True, | ||
| return_dict=True, | ||
| return_tensors="pt", | ||
| ).to(model.device) | ||
|
|
||
| generated_ids = model.generate(**inputs, do_sample=False, max_new_tokens=256) | ||
| generated_ids_trimmed = [out_ids[len(in_ids) :] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)] | ||
| result = processor.batch_decode(generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False) | ||
| print(result) | ||
|
|
||
| ``` | ||
|
|
||
| </hfoption> | ||
| </hfoptions> | ||
|
|
||
|
|
||
| ## PPChart2TableConfig | ||
|
|
||
| [[autodoc]] PPChart2TableConfig | ||
|
|
||
| ## PPChart2TableImageProcessor | ||
|
|
||
| [[autodoc]] PPChart2TableImageProcessor | ||
|
|
||
| ## PPChart2TableImageProcessorPil | ||
|
|
||
| [[autodoc]] PPChart2TableImageProcessorPil | ||
|
|
||
| ## PPChart2TableProcessor | ||
|
|
||
| [[autodoc]] PPChart2TableProcessor |
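
The `generated_ids_trimmed` line in the AutoModel examples above may deserve a short illustration. It assumes, as is usual for decoder-only generation in Transformers, that `generate` returns each prompt followed by the newly generated tokens, so slicing off `len(in_ids)` leaves only the new tokens. The sketch below reproduces that slicing on plain Python lists; the helper name `trim_prompt_tokens` and all token ids are hypothetical and chosen only for illustration.

```python
def trim_prompt_tokens(input_ids, generated_ids):
    """Drop the echoed prompt from the front of each generated sequence."""
    return [out_ids[len(in_ids):] for in_ids, out_ids in zip(input_ids, generated_ids)]


# Toy token ids (hypothetical): 0 stands for padding, and the first prompt
# is left-padded so that both prompts have the same length.
input_ids = [[0, 11, 12, 13], [21, 22, 23, 24]]
generated_ids = [[0, 11, 12, 13, 101, 102], [21, 22, 23, 24, 201, 202]]

new_tokens = trim_prompt_tokens(input_ids, generated_ids)
print(new_tokens)  # [[101, 102], [201, 202]]
```

Because the padded prompt length is the same for every row of a batch, the same slice works for both the single-input and the batched examples.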
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,29 @@ | ||
| # Copyright 2026 The HuggingFace Team. All rights reserved. | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| from typing import TYPE_CHECKING | ||
|
|
||
| from ...utils import _LazyModule | ||
| from ...utils.import_utils import define_import_structure | ||
|
|
||
|
|
||
| if TYPE_CHECKING: | ||
| from .configuration_pp_chart2table import * | ||
| from .image_processing_pp_chart2table_fast import * | ||
| from .processing_pp_chart2table import * | ||
|
|
||
| else: | ||
| import sys | ||
|
|
||
| _file = globals()["__file__"] | ||
| sys.modules[__name__] = _LazyModule(__name__, _file, define_import_structure(_file), module_spec=__spec__) | ||
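
For readers unfamiliar with the lazy-import pattern used in this `__init__.py`, here is a minimal standalone sketch of the general idea: a module object that resolves an attribute by importing its defining module only on first access. This is not the `_LazyModule` implementation from Transformers; the class `LazyModule`, the name `lazy_demo`, and the attribute map are hypothetical, and standard-library modules stand in for the model's submodules.

```python
import importlib
import types


class LazyModule(types.ModuleType):
    """Module that imports the source of an attribute on first access."""

    def __init__(self, name, attr_to_module):
        super().__init__(name)
        # Maps an exported attribute name to the module that defines it.
        self._attr_to_module = attr_to_module

    def __getattr__(self, attr):
        # Only called when normal attribute lookup fails.
        module_name = self._attr_to_module.get(attr)
        if module_name is None:
            raise AttributeError(f"module {self.__name__!r} has no attribute {attr!r}")
        value = getattr(importlib.import_module(module_name), attr)
        setattr(self, attr, value)  # cache so later lookups skip __getattr__
        return value


# Hypothetical export map: stdlib functions play the role of model classes.
lazy = LazyModule("lazy_demo", {"sqrt": "math", "dumps": "json"})
print(lazy.sqrt(16.0))  # 4.0
```

In the real file, the `TYPE_CHECKING` branch exists so that static type checkers see the eager imports, while at runtime the module is replaced in `sys.modules` by the lazy object.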