Proposed fix for openpreserve#696 - Unless there is a reason to include a close paren in the `_rawBytes` output, I think the `_rawBytes.add` should be at the end of the loop.
I think this is a bug, but it may also be badly labeled annotation links in a PDF. I'm logging it in case others find the same issue, or the underlying cause has additional implications. It is causing numerous errors like this in a particular set of PDFs:
`edu.harvard.hul.ois.jhove.module.pdf.PdfInvalidException: Invalid indirect destination - referenced object 'bm_b11' cannot be found`
In this instance the problem seems to be that there is a 'bm_b1' reference as well as 'bm_b11'.
The problem stems from the position of this line in the `Literal` class:
jhove/jhove-modules/pdf-hul/src/main/java/edu/harvard/hul/ois/jhove/module/pdf/Literal.java
Line 164 in 47f077f
A character is appended to `_rawBytes` regardless of whether it is a close parenthesis that should end the Literal. That means the `_rawBytes` value ends with ",41" (close parenthesis).
Where this causes problems is when performing `NameTreeNode.compareKey()` here:
jhove/jhove-modules/pdf-hul/src/main/java/edu/harvard/hul/ois/jhove/module/pdf/NameTreeNode.java
Line 122 in 47f077f
The compare process uses `_rawBytes`. It effectively compares the two keys only up to the length of the shorter one. If the two keys are of different lengths, the comparison never reaches the ",41" (close paren) at the end of the longer `_rawBytes`, but the close paren is the last value compared from the shorter one. That means if the value compared against the close paren is below 41, it returns -1 and exits the matching loop. This would be fine if that character were alphanumeric, but in this case it is a null, so the annotation links, which function when rendered, cause validation errors in JHOVE.
Example of the problem reference:
![image](https://user-images.githubusercontent.com/4070836/143372527-7a63f80b-e29d-4525-b88e-69da13e0b313.png)
I think this should be valid?
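To illustrate the ordering problem described above, here is a minimal sketch. It is not the JHOVE code: `rawBytes` and `compareKey` are hypothetical stand-ins written from the description in this issue, and the keys are assumed to be encoded as two bytes per character with a leading zero byte, which would explain the null that ends up being compared with the close paren.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class CompareKeySketch {

    // Hypothetical helper: raw byte values of a key, optionally followed by
    // the close paren (41) that the issue says ends up in _rawBytes.
    static List<Integer> rawBytes(String key, boolean keepCloseParen) {
        List<Integer> out = new ArrayList<>();
        // Assumption: two bytes per character, high (zero) byte first.
        for (byte b : key.getBytes(StandardCharsets.UTF_16BE)) {
            out.add(b & 0xFF);
        }
        if (keepCloseParen) {
            out.add(41);
        }
        return out;
    }

    // Assumed comparison: walk the shorter length, then fall back to length.
    static int compareKey(List<Integer> a, List<Integer> b) {
        int n = Math.min(a.size(), b.size());
        for (int i = 0; i < n; i++) {
            int c = Integer.compare(a.get(i), b.get(i));
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(a.size(), b.size());
    }

    public static void main(String[] args) {
        // With the trailing 41, the close paren of "bm_b1" meets a zero byte
        // of "bm_b11", so the shorter key sorts after the longer one.
        System.out.println(compareKey(rawBytes("bm_b1", true), rawBytes("bm_b11", true)));
        // Without it, the shorter key sorts first, as a prefix should.
        System.out.println(compareKey(rawBytes("bm_b1", false), rawBytes("bm_b11", false)));
    }
}
```

Under these assumptions the sign of the comparison flips depending on whether the close paren is kept, which is consistent with a lookup for 'bm_b11' failing when 'bm_b1' is also present.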
To confirm this was causing the problem, I did a quick hack to move this line:
jhove/jhove-modules/pdf-hul/src/main/java/edu/harvard/hul/ois/jhove/module/pdf/Literal.java
Line 164 in 47f077f
... to the last line of the for-loop since a close paren will cause a `return offset;` and the character will not be appended to the `_rawBytes`. Moving that line caused the error messages to stop.
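The effect of moving the append can be sketched as follows. This is a simplified, hypothetical loop and not the actual `Literal` method: `scan` and its `appendBeforeCheck` flag are invented for illustration, escapes and nested parentheses are ignored, the input is assumed to start just after the opening paren, and the sketch returns the collected bytes rather than an offset.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class LiteralLoopSketch {

    // Hypothetical scan of a literal's bytes up to the closing parenthesis.
    static List<Integer> scan(byte[] input, boolean appendBeforeCheck) {
        List<Integer> rawBytes = new ArrayList<>();
        for (int i = 0; i < input.length; i++) {
            int b = input[i] & 0xFF;
            if (appendBeforeCheck) {
                // Position described in the issue: also runs for ')'.
                rawBytes.add(b);
            }
            if (b == ')') {
                // The close paren ends the literal; the loop exits here.
                return rawBytes;
            }
            if (!appendBeforeCheck) {
                // Append moved to the end of the loop body: ')' is never added.
                rawBytes.add(b);
            }
        }
        return rawBytes; // unterminated literal: return what was collected
    }

    public static void main(String[] args) {
        byte[] input = "bm_b1)".getBytes(StandardCharsets.US_ASCII);
        System.out.println(scan(input, true));   // last value is the close paren
        System.out.println(scan(input, false));  // close paren is not included
    }
}
```

With the append at the top of the loop the collected list ends in 41; with the append at the bottom it ends at the last character of the key.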