PyPDF some fields not showing in generated PDF #2724

Closed
thusharagokulnath opened this issue Jun 25, 2024 · 27 comments · Fixed by #2729

Comments

@thusharagokulnath

thusharagokulnath commented Jun 25, 2024

Hi,

Can someone please help with this?

I am generating a few PDFs using pypdf==4.2.0.
This is the input PDF:
FRA F 6180.55.pdf

The fields marked in the image are not being filled, but when I call get_fields() on the writer and check, those fields do have a value (/V).
[screenshot of the form with the unfilled fields marked]

A few example entries from the get_fields() output:

'YARDMILE': {'/T': 'YARDMILE', '/FT': '/Tx', '/TU': 'Yard Switching Train Miles', '/V': '28800', '/AA': {'/F': IndirectObject(57, 0, 2677080479392), '/K': IndirectObject(58, 0, 2677080479392)}},
'WORKERHR': {'/T': 'WORKERHR', '/FT': '/Tx', '/TU': 'Railroad Worker Hours', '/V': '811371', '/AA': {'/F': IndirectObject(65, 0, 2677080479392), '/K': IndirectObject(66, 0, 2677080479392)}},

Below is the generated PDF
FRA F 6180.55_2018-05-01_2018-05-31_2024-06-25_14-31-00.pdf

TIA

@thusharagokulnath
Author

One thing I found is that the data is present in the field. It is displayed when I click on the field, but when I move the cursor away it disappears and only the blue editable field is shown.

In the examples above I am generating the PDF as read-only, so there I cannot see the data at all.

@pubpub-zz
Collaborator

Your form is read-only. Can you remove this restriction so that I can analyse it?

@thusharagokulnath
Author

thusharagokulnath commented Jun 25, 2024 via email

@pubpub-zz
Collaborator

I've seen it, but I would like to see how Acrobat modifies the fields. In pdf.js the fields are visible.

@calixteman

With pdf.js the values of the fields are correct because we use their /V entry. But if you try to print the PDF, the indicated fields are empty, so it's likely an issue in the appearance stream of those annotations (Acrobat and Chrome render from the appearance stream).
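
To make the distinction between the stored value and the cached appearance concrete, here is a minimal sketch. It uses plain dicts as stand-ins for widget annotations (the dictionaries and the `describe_widget` helper are hypothetical, not part of pypdf), and it only checks whether a normal appearance (/AP /N) is present. It says nothing about whether that appearance stream is drawn correctly.

```python
from typing import Any, Dict


def describe_widget(widget: Dict[str, Any]) -> Dict[str, Any]:
    """Summarise what a viewer could draw from: the raw /V or a cached /AP /N."""
    appearance = widget.get("/AP")
    has_normal = isinstance(appearance, dict) and "/N" in appearance
    return {
        "name": widget.get("/T"),
        "value": widget.get("/V"),
        "has_normal_appearance": has_normal,
    }


# Hypothetical widget dictionaries (plain dicts, not real pypdf objects).
with_ap = {"/T": "YARDMILE", "/V": "28800", "/AP": {"/N": "<stream>"}}
without_ap = {"/T": "WORKERHR", "/V": "811371"}

for widget in (with_ap, without_ap):
    print(describe_widget(widget))
```

A viewer that reads /V can show both values, while a viewer that relies on the appearance stream has nothing to draw for the second widget.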

@thusharagokulnath
Author

Below is the editable PDF that was generated:
FRA F 6180.55_2022-09-01_2022-09-30_2024-06-25_17-35-06.pdf

Looking forward to a solution as soon as possible. It is a blocker for my project deployment.

Thanks

@pubpub-zz
Collaborator

I've understood: the boxes ("/Rect") are upside down, i.e. Rect[1] is greater than Rect[3]. I'm still checking, but a fix/patch should be available soon.
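
As an illustration of what "upside down" means here, the following sketch tests a /Rect given as four numbers and reorders its corners. The helper names and the coordinates are hypothetical and are not taken from the attached form or from pypdf.

```python
from typing import Sequence, Tuple


def is_flipped(rect: Sequence[float]) -> bool:
    """Return True when the corners are in reversed order on either axis."""
    x1, y1, x2, y2 = (float(v) for v in rect)
    return x1 > x2 or y1 > y2


def normalize_rect(rect: Sequence[float]) -> Tuple[float, float, float, float]:
    """Reorder the corners so the result is (left, bottom, right, top)."""
    x1, y1, x2, y2 = (float(v) for v in rect)
    return (min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))


# Hypothetical coordinates: Rect[1] (212) is greater than Rect[3] (200).
upside_down = (100.0, 212.0, 180.0, 200.0)
print(is_flipped(upside_down))      # True
print(normalize_rect(upside_down))  # (100.0, 200.0, 180.0, 212.0)
```

For such a rectangle, a height computed as Rect[3] - Rect[1] comes out negative, which is the case the patches in this thread guard against with abs().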

@thusharagokulnath
Author

Thank you

@pubpub-zz
Collaborator

Can you try modifying line 879 of _writer.py:

        for line_number, line in enumerate(txt.replace("\n", "\r").split("\r")):
            if line in sel:
                # may be improved but cannot find how to get fill working => replaced with lined box
                ap_stream += (
                    f"1 {y_offset - (line_number * abs(font_height) * 1.4) - 1} {abs(rct.width) - 2} {abs(font_height) + 2} re\n" #<------------
                    f"0.5 0.5 0.5 rg s\n{da}\n"
                ).encode()

@thusharagokulnath
Author

What needs to be modified? I am new to Python and just trying to understand.

@pubpub-zz
Collaborator

pubpub-zz commented Jun 26, 2024

Start Python IDLE,
press Alt+M,
open "pypdf._writer.py",
press Alt+G,
go to line 879.
It should start with:
f"1 {y_offset -
Replace it with:

                    f"1 {y_offset - (line_number * abs(font_height) * 1.4) - 1} {abs(rct.width) - 2} {abs(font_height) + 2} re\n" #<---------

Save the file, rerun your script, and check the result.
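
If IDLE is not available, the installed file can also be located from Python itself. This is a small sketch using only the standard library; `module_source_path` is a hypothetical helper name, and it returns None when pypdf is not installed in the current environment.

```python
import importlib.util
from typing import Optional


def module_source_path(module_name: str) -> Optional[str]:
    """Return the file a module would be loaded from, or None if it cannot be found."""
    try:
        spec = importlib.util.find_spec(module_name)
    except ImportError:
        # Raised when the parent package (e.g. "pypdf") is not installed.
        return None
    return spec.origin if spec is not None else None


print(module_source_path("pypdf._writer"))
```

The printed path is the file to open in any editor. Line numbers differ between pypdf versions, so searching for the text of the line is more reliable than jumping to a fixed number.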

@thusharagokulnath
Author

Thank you.
In my copy this line of code is at line 843, so I got confused.
I changed it and reran the script, but it did not work. The result is still the same.

@pubpub-zz
Collaborator

At line 839 you can find:
ap_stream = f"q\n(...)
Replace it with:

        ap_stream = f"q\n/Tx BMC \nq\n1 1 { abs(rct.width) - 1} {abs(rct.height) - 1} re\nW\nBT\n{da}\n".encode()

@thusharagokulnath
Author

Tried it. Still no luck

Thanks

@pubpub-zz
Collaborator

Can you provide some test code, please?

@thusharagokulnath
Author

thusharagokulnath commented Jun 26, 2024

from datetime import datetime

from pypdf import PdfReader, PdfWriter
from pypdf.constants import UserAccessPermissions as UAP

from ReadConfig import read_config  # local helper module from the author's project


todayDate = datetime.today().strftime("%Y-%m-%d_%H-%M-%S")

config = read_config()

input_pdf = config['FilePaths']['input_pdf'] + "FRA F 6180.55.pdf"

def fill_pdf():
    try:  
        merger = PdfWriter()  
        merger.encrypt(user_password="",owner_password="",permissions_flag=UAP.PRINT) 
        reader = PdfReader(input_pdf)
        writer = PdfWriter(clone_from=reader)
        writer.set_need_appearances_writer(True)  
        writer.encrypt(user_password="",owner_password="",permissions_flag=UAP.PRINT)        
        
        for page in writer.pages:   
            writer.update_page_form_field_values(
                page,
                {
                "RailroadName": "A" ,
                "RailroadCode": "B" ,
                "ReportMonthYear": "06/2024" ,
                "County": "MX" ,
                "ReportingOfficerName": "C" ,
                "ReportingOfficerTitle": "D" ,
                "ReportingOfficerAddress": "E" ,
                "ReportingOfficerTelephoneNumber": "F" ,
                "Date1": "" ,
                "Date2": "" ,
                "FreightTrainMiles": "2323" ,
                "PassengerMilesOperated": "34324"   ,
                "YARDMILE": "23324" ,
                "OtherTrainMiles": "45435" ,
                "WORKERHR": "5346" ,
                "PassengerTrainMiles": "65746",
                "PassengersTransported": "4653" ,
                "FatalEmployee": "3" ,
                "FatalEmployeeNotOnDuty": "1" ,
                "FatalPassengers": "4" ,
                "FatalNontrespassers": "5" ,
                "FatalTrespassers": "6" ,
                "FatalContractor": "8" ,
                "FatalContractorOher": "34" ,
                "FatalVolunteer": "64" ,
                "FatalVolunteerOther": "3" ,
                "FatalNonTrespassersOffRailroad": "34" ,
                "FatalGrandTotal": "35" ,
                "NonFatalEmployee": "3" ,
                "NonFatalEmployeeNotOnDuty": "5" ,
                "NonFatalPassenger": "5" ,
                "NonFatalNontrespassers": "34" ,
                "NonFatalTrespassers": "3" ,
                "NonFatalVolunteerOther": "43" ,
                "NonFatalContractor": "2" ,
                "NonFatalContractorOther": "64" ,
                "NonFatalVolunteer": "2" ,
                "NonFatalNonTrespassersOffRailroad": "23" ,
                "NonFatalGrandTotal": "23" ,
                "F618054": "2" ,
                "F618055A": "3",
                "F618056": "3",
                "F618057": "4" ,
                "F618081": "5" ,
                "Description1": "D" ,
                "Description2": "E" ,
                "Description3": "F" ,
                "Description4": "G" ,
                "Description5": "H",
                "State": "NJ"
                },        
                auto_regenerate = False
            )
                
            merger.append(writer)
        output_pdf= f""+config['FilePaths']['output_pdf']+"FRA F 6180.55_1.pdf"
        with open(output_pdf  , "wb") as output_stream:
            merger.write(output_stream) 
    except Exception as e:
        print(e)

fill_pdf()

Below is the input file
FRA F 6180.55.pdf

@thusharagokulnath
Author

Any update?

@pubpub-zz
Collaborator

let me have dinner!😫

@thusharagokulnath
Author

Sorry for pushing. It is a blocker for my project.

@pubpub-zz
Collaborator

pubpub-zz commented Jun 27, 2024

We are only volunteers spending our spare time for the benefit of the community...
A few more abs() calls need to be added:
line 773:

                font_height = abs(rct.height) - 2

line 776:

        y_offset = abs(rct.height) - 1 - font_height
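
To see why the extra abs() calls matter, here is a small worked sketch of the two expressions above. It assumes the unpatched lines used the signed rct.height (the thread does not show them), and `text_metrics` and the sample height are hypothetical.

```python
from typing import Tuple


def text_metrics(rect_height: float, use_abs: bool) -> Tuple[float, float]:
    """Mirror the two expressions from the comment: (font_height, y_offset)."""
    height = abs(rect_height) if use_abs else rect_height
    font_height = height - 2
    y_offset = height - 1 - font_height
    return font_height, y_offset


# Hypothetical upside-down field: top minus bottom is negative.
flipped_height = 200.0 - 212.0

print(text_metrics(flipped_height, use_abs=False))  # (-14.0, 1.0)
print(text_metrics(flipped_height, use_abs=True))   # (10.0, 1.0)
```

With the signed height the computed font height is negative, while with abs() it is the expected positive size. The offset is the same in both cases, so the font size is what changes.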

@pubpub-zz
Collaborator

Any update?

@thusharagokulnath
Author

No luck. Still the same.

@pubpub-zz
Collaborator

It's working for me 🤔
Can you send me your output file?

@thusharagokulnath
Author

FRA F 6180.55_1.pdf

@pubpub-zz
Collaborator

I think it should be good now: I had missed the /BBox entry.
For line 745, use:

        rct = RectangleObject((0, 0, abs(_rct[2] - _rct[0]), abs(_rct[3] - _rct[1])))
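
The effect of that line can be sketched without pypdf by using plain tuples in place of RectangleObject. The function name and coordinates below are hypothetical; the arithmetic mirrors the expression above.

```python
from typing import Sequence, Tuple


def appearance_bbox(rect: Sequence[float]) -> Tuple[float, float, float, float]:
    """Build a box anchored at the origin with non-negative width and height."""
    x1, y1, x2, y2 = (float(v) for v in rect)
    return (0.0, 0.0, abs(x2 - x1), abs(y2 - y1))


# Hypothetical field rectangles: the same box, stored both ways round.
normal = (100.0, 200.0, 180.0, 212.0)
upside_down = (100.0, 212.0, 180.0, 200.0)

print(appearance_bbox(normal))
print(appearance_bbox(upside_down))
```

Both orientations give the same origin-based box, so the appearance stream's bounding box no longer depends on the order in which the form stored the corners.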

@thusharagokulnath
Author

Wow! Perfect. That worked. Thank you so much

@pubpub-zz
Collaborator

This should be part of the next release.
