PyPDF2 forever spinning at 100% CPU

PyPDF2 hangs indefinitely at 100% CPU while extracting text from the PDF linked below.

## Environment

Which environment were you using when you encountered the problem?

```bash
$ python -m platform
Linux-5.4.0-122-generic-x86_64-with-glibc2.29

$ python -c "import PyPDF2;print(PyPDF2.__version__)"
2.10.3
```
## Code + PDF

This is a minimal, complete example that shows the issue:

```python
import PyPDF2

with open("The lean times in the Peruvian economy.pdf", "rb") as f:
    pdfreader = PyPDF2.PdfFileReader(f, strict=True)
    # Count pages from 1 so the progress message matches the PDF's page numbers
    for npage, page in enumerate(pdfreader.pages, start=1):
        print(f"Reading page {npage} of {pdfreader.numPages}")
        # Text extraction is the call that never returns
        a = page.extractText()
```

PDF used above: [The lean times in the Peruvian economy.pdf](https://github.com/alexanderquispe/1REI05/blob/37bb0ca495584cf495dba6f1fcdcc9128a8581ec/reports/report_1/The%20lean%20times%20in%20the%20Peruvian%20economy.pdf)

## Output of the script
```
Reading page 1 of 19
Reading page 2 of 19
Reading page 3 of 19
Reading page 4 of 19
Reading page 5 of 19
Reading page 6 of 19
Reading page 7 of 19
Reading page 8 of 19
Reading page 9 of 19
Reading page 10 of 19
Reading page 11 of 19
Reading page 12 of 19
Reading page 13 of 19
Reading page 14 of 19
Reading page 15 of 19
Reading page 16 of 19
```
At this point, the script spun at 100% CPU for more than half an hour, until I terminated it manually.

## Preliminary code analysis
The code is spinning in this loop:
https://github.com/py-pdf/PyPDF2/blob/84460f54aa4721db36452fe510f8063838e358d5/PyPDF2/_cmap.py#L273-L282
with a very large value of `b = 438093348969`. After roughly one minute, `a` had grown by `9430662`, which suggests this loop would run for more than **32 days**. For every other page in this PDF, `b` never exceeds `0xFFFD`, so the loop would finish in about 0.4 s.
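
For reference, the estimates above can be reproduced with a simple linear extrapolation. This is only a sketch: it assumes the loop advances at a constant rate and has to cover roughly the whole range up to `b`, and the function name `estimated_runtime_seconds` is made up for illustration (it is not part of PyPDF2).

```python
# Figures taken from the observations above
b = 438_093_348_969              # loop upper bound seen on the hanging page
progress_per_minute = 9_430_662  # how far `a` advanced in roughly one minute


def estimated_runtime_seconds(upper_bound, steps_per_minute):
    """Linear extrapolation; assumes a constant iteration rate (hypothetical helper)."""
    return upper_bound / steps_per_minute * 60


slow = estimated_runtime_seconds(b, progress_per_minute)
fast = estimated_runtime_seconds(0xFFFD, progress_per_minute)

print(f"hanging page: about {slow / 86_400:.1f} days")
print(f"other pages:  about {fast:.2f} s")
```

The gap between the two results is what makes the value of `b` on this page look like a parsing problem and not just a slow page.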

The lack of comments and the non-descriptive variable names prevented me from debugging any further, but hopefully this gives the maintainers a hint about where to look.
