ValueError: invalid literal for int() with base 10: b'' #1270

@DL6ER

Description

See #1269 for further details. This reports another issue I've come across.

Environment

Which environment were you using when you encountered the problem?

$ python -m platform
Linux-5.4.0-122-generic-x86_64-with-glibc2.29

$ python -c "import PyPDF2;print(PyPDF2.__version__)"
2.10.3

Code + PDF

This is a minimal, complete example that shows the issue:

import PyPDF2
with open("Introduction to Programming Using Python ( PDFDrive ).pdf", "rb") as f:
    pdfreader = PyPDF2.PdfFileReader(f, strict=True)
    metadata = pdfreader.metadata

PDF file used above: Introduction to Programming Using Python ( PDFDrive ).pdf
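
Until the parser copes with this file, a caller could guard the metadata access. The following is only a sketch: `read_metadata_safely` is a hypothetical helper, not part of PyPDF2, and it assumes nothing beyond the reader exposing a `metadata` property as in the example above.

```python
from typing import Any, Optional


def read_metadata_safely(reader: Any) -> Optional[Any]:
    """Hypothetical helper: return reader.metadata, or None if parsing fails.

    Catches only ValueError, the exception type seen in this report,
    so unrelated errors still propagate.
    """
    try:
        return reader.metadata
    except ValueError:
        return None
```

This hides the failure instead of fixing it, so it would only suit callers that can live without the metadata.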

Traceback

This is the complete Traceback I see:

Traceback (most recent call last):
  File "test2.py", line 4, in <module>
    metadata = pdfreader.metadata
  File "/usr/local/lib/python3.8/dist-packages/PyPDF2/_reader.py", line 327, in metadata
    obj = self.trailer[TK.INFO]
  File "/usr/local/lib/python3.8/dist-packages/PyPDF2/generic/_data_structures.py", line 150, in __getitem__
    return dict.__getitem__(self, key).get_object()
  File "/usr/local/lib/python3.8/dist-packages/PyPDF2/generic/_base.py", line 163, in get_object
    obj = self.pdf.get_object(self)
  File "/usr/local/lib/python3.8/dist-packages/PyPDF2/_reader.py", line 1151, in get_object
    retval = read_object(self.stream, self)  # type: ignore
  File "/usr/local/lib/python3.8/dist-packages/PyPDF2/generic/_data_structures.py", line 822, in read_object
    return DictionaryObject.read_from_stream(stream, pdf, forced_encoding)
  File "/usr/local/lib/python3.8/dist-packages/PyPDF2/generic/_data_structures.py", line 269, in read_from_stream
    value = read_object(stream, pdf, forced_encoding)
  File "/usr/local/lib/python3.8/dist-packages/PyPDF2/generic/_data_structures.py", line 851, in read_object
    return NumberObject.read_from_stream(stream)
  File "/usr/local/lib/python3.8/dist-packages/PyPDF2/generic/_base.py", line 299, in read_from_stream
    return NumberObject(num)
  File "/usr/local/lib/python3.8/dist-packages/PyPDF2/generic/_base.py", line 274, in __new__
    val = int(value)
ValueError: invalid literal for int() with base 10: b''
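
The last frame comes down to calling `int()` on an empty byte string, which can be reproduced without PyPDF2. Below is a minimal sketch, assuming the number token read from the stream was empty. `parse_number_token` is a hypothetical helper used only to show one way of failing with a clearer message; it is not how PyPDF2 is implemented.

```python
def parse_number_token(token: bytes) -> int:
    """Hypothetical helper: convert a raw numeric token to int.

    Raises a descriptive ValueError when the token is empty or
    whitespace-only instead of passing it straight to int().
    """
    if not token.strip():
        raise ValueError("expected a number but read an empty token")
    return int(token)


# Calling int() on empty bytes raises the same error as in the traceback.
try:
    int(b"")
except ValueError as exc:
    message = str(exc)
```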

Labels

Has MCVE: A minimal, complete and verifiable example helps a lot to debug / understand feature requests
is-robustness-issue: From a users perspective, this is about robustness
nf-performance: Non-functional change: Performance
