Can't Fill PDFs without /DR dictionary #2670

Closed
BrooksWatson717 opened this issue May 22, 2024 · 2 comments · Fixed by #2677
Labels
workflow-forms From a users perspective, forms is the affected feature/workflow


I am trying to fill out a PDF form, but I am running into an issue where the /Font dictionary is not populated, so nothing can be written to the PDF.

Environment

Which environment were you using when you encountered the problem?

$ python -m platform
macOS-14.4.1-arm64-arm-64bit


$ python -c "import pypdf;print(pypdf._debug_versions)"
pypdf==4.2.0, crypt_provider=('pycryptodome', '3.20.0'), PIL=none

Code + PDF

This is a minimal, complete example that shows the issue:

from pypdf import PdfReader, PdfWriter
from pypdf.constants import AnnotationDictionaryAttributes

reader = PdfReader("modified_example.pdf")
writer = PdfWriter()
writer.append(reader)
fields = []
for page in reader.pages:
    writer.reattach_fields(page)
    for annot in page.annotations:
        annot = annot.get_object()
        if annot[AnnotationDictionaryAttributes.Subtype] == "/Widget":
            fields.append(annot)
            if annot['/FT'] == "/Tx":
                fieldName = annot["/T"]
                writer.update_page_form_field_values(
                    writer.pages[page.page_number],
                    {fieldName: "Brooks"},
                    auto_regenerate=False,
                )

with open("test2.pdf", "wb") as output_stream:
    writer.write(output_stream)

Share here the PDF file(s) that cause the issue. The smaller they are, the
better. Let us know if we may add them to our tests!

f1040.pdf

Traceback

This is the complete traceback I see:

Traceback (most recent call last):
  File "/Users/brooks.watson/PdfGenerator/PyPdfTest.py", line 16, in <module>
    writer.update_page_form_field_values(
  File "/Users/brooks.watson/PdfGenerator/venv/lib/python3.9/site-packages/pypdf/_writer.py", line 977, in update_page_form_field_values
    self._update_field_annotation(writer_parent_annot, writer_annot)
  File "/Users/brooks.watson/PdfGenerator/venv/lib/python3.9/site-packages/pypdf/_writer.py", line 800, in _update_field_annotation
    dr = dr.get_object().get("/Font", DictionaryObject()).get_object()
AttributeError: 'dict' object has no attribute 'get_object'

Process finished with exit code 1
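
The last frame of the traceback shows get_object() being called on a plain Python dict. The sketch below reproduces that failure mode without pypdf. StubDictionary is a hypothetical stand-in (not a pypdf class) that, like a PDF dictionary object, subclasses dict and adds get_object(). It assumes the plain dict comes from a {} fallback used when a key is missing, which is a guess at the cause rather than something confirmed in this issue.

```python
class StubDictionary(dict):
    """Hypothetical stand-in for a PDF dictionary type; not a pypdf class."""

    def get_object(self):
        # A direct object resolves to itself.
        return self


def font_resources(acro_form, fallback):
    """Look up /DR, then /Font, resolving each step with get_object()."""
    dr = acro_form.get("/DR", fallback)
    return dr.get_object().get("/Font", StubDictionary()).get_object()


acro_form = StubDictionary()  # a form with no /DR entry

# A fallback of the stand-in type supports get_object(), so the chain works.
fonts = font_resources(acro_form, StubDictionary())

# A plain dict fallback does not, so the chain raises AttributeError.
try:
    font_resources(acro_form, {})
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
```

In this sketch, the lookup only succeeds when every default value in the chain is of the dictionary type that provides get_object().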

For reference, adding the following code to the _writer.py file fixes the issue:

        # Retrieve font information from local DR ...
        # Original code
        dr: Any = cast(
            DictionaryObject,
            cast(
                DictionaryObject,
                anno.get_inherited(
                    "/DR",
                    cast(
                        DictionaryObject, self.root_object[CatalogDictionary.ACRO_FORM]
                    ).get("/DR", DictionaryObject()),
                ),
            ).get_object(),
        )

        # New Code
        if "/Font" not in dr or not isinstance(dr["/Font"], DictionaryObject):
            dr[NameObject("/Font")] = DictionaryObject()

        font_dict = dr["/Font"]

        # Check if the specific font (e.g., Helvetica) is in /Font
        font_name = NameObject("/Helvetica")
        if font_name not in font_dict:
            font_entry = DictionaryObject({
                NameObject("/Type"): NameObject("/Font"),
                NameObject("/Subtype"): NameObject("/Type1"),
                NameObject("/BaseFont"): NameObject("/Helvetica"),
                NameObject("/Encoding"): NameObject("/WinAnsiEncoding")
            })
            font_dict[font_name] = font_entry

        # Original code
        dr = dr.get("/Font", DictionaryObject()).get_object()
BrooksWatson717 added a commit to BrooksWatson717/pypdf that referenced this issue May 22, 2024
Updates the _writer.py to create the /DR and /Font dictionaries, and add a font (Helvetica) if they don't exist. This enables filling out PDF forms.
BrooksWatson717 added a commit to BrooksWatson717/pypdf that referenced this issue May 22, 2024
- Added test
- Cleaned up code (simplified and fixed variable name)

pubpub-zz commented May 26, 2024

First, this PDF is very odd: apart from the missing /DR entries, the field names ("/T") contain ".", which is forbidden by the PDF spec.
Also, a warning (or error?) is reported: Ignoring wrong pointing object 1772 0 (offset 0)
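
To illustrate why a period inside "/T" is a problem: a field's fully qualified name is built by joining the partial names ("/T") of its ancestors and itself with periods, so a period inside a partial name makes the result ambiguous. A minimal sketch in plain Python follows. The helpers qualified_name and invalid_partial_names are hypothetical, not pypdf functions, and the field names are made up.

```python
def qualified_name(partial_names):
    """Join partial field names (root first) into a fully qualified name."""
    return ".".join(partial_names)


def invalid_partial_names(partial_names):
    """Return the partial names that contain a period."""
    return [name for name in partial_names if "." in name]


# Two different hierarchies collapse to the same qualified name
# once a partial name is allowed to contain a period.
nested = qualified_name(["form1", "page1", "name"])
flat = qualified_name(["form1.page1", "name"])
ambiguous = nested == flat

bad_names = invalid_partial_names(["form1.page1", "name"])
```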

As for the missing fonts, we can cope with this situation if the font is one of the 14 standard Type 1 fonts.

I will propose an alternative PR with this solution.
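
For reference, the 14 standard Type 1 fonts are a fixed list in the PDF specification, so a fallback font resource can be derived from the font name alone. The sketch below uses plain Python dicts to show the shape of such an entry. STANDARD_14, SYMBOLIC and fallback_font_entry are hypothetical names, the /WinAnsiEncoding choice is copied from the workaround posted earlier in this issue, and none of this is necessarily how the alternative PR handles it.

```python
# The 14 standard Type 1 fonts defined by the PDF specification.
STANDARD_14 = frozenset({
    "Times-Roman", "Times-Bold", "Times-Italic", "Times-BoldItalic",
    "Helvetica", "Helvetica-Bold", "Helvetica-Oblique", "Helvetica-BoldOblique",
    "Courier", "Courier-Bold", "Courier-Oblique", "Courier-BoldOblique",
    "Symbol", "ZapfDingbats",
})

# These two use their own built-in encodings, so no /Encoding is set for them.
SYMBOLIC = frozenset({"Symbol", "ZapfDingbats"})


def fallback_font_entry(base_font):
    """Return a dict shaped like a PDF font dictionary, or None if the
    font is not one of the 14 standard fonts (hypothetical helper)."""
    name = base_font.lstrip("/")
    if name not in STANDARD_14:
        return None
    entry = {
        "/Type": "/Font",
        "/Subtype": "/Type1",
        "/BaseFont": "/" + name,
    }
    if name not in SYMBOLIC:
        entry["/Encoding"] = "/WinAnsiEncoding"
    return entry


helvetica = fallback_font_entry("/Helvetica")
unknown = fallback_font_entry("/ArialMT")
```

A font outside the list returns None here, which corresponds to the case that cannot be handled without the font actually being present in the file.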

pubpub-zz commented

My test code and output file:

import pypdf
w=pypdf.PdfWriter("f1040.pdf")
for page in w.pages:
    for annot in page.annotations:
        annot = annot.get_object()
        if annot.get('/FT','') == "/Tx":
            fieldName = annot["/T"]
            w.update_page_form_field_values(
                page,
                {fieldName: "Brooks"},
                auto_regenerate=False,
            )
w.write("out.pdf")

out.pdf
