I am running CERMINE on roughly 10,000 PDFs, but some of them throw this error and the program stops running. Can I fix this somehow, or tell CERMINE to skip errors and continue?
File processed: /home/moritz/Desktop/pdf_extraction/pdfs/Other/��ztun et al. 1991 - A new haloether from Laurencia possessing a lauroxacyclododecane ring. Structure and conformational studies.pdf
Exception in thread "main" java.lang.IllegalArgumentException: Illegal group reference: group index is missing
    at java.util.regex.Matcher.appendReplacement(Matcher.java:819)
    at pl.edu.icm.cermine.content.cleaning.ContentCleaner.cleanHyphenationAndBreaks(ContentCleaner.java:180)
    at pl.edu.icm.cermine.content.cleaning.ContentCleaner.cleanAllAndBreaks(ContentCleaner.java:236)
    at pl.edu.icm.cermine.metadata.model.DocumentMetadata.clean(DocumentMetadata.java:277)
    at pl.edu.icm.cermine.metadata.EnhancerMetadataExtractor.extractMetadata(EnhancerMetadataExtractor.java:106)
    at pl.edu.icm.cermine.metadata.EnhancerMetadataExtractor.extractMetadata(EnhancerMetadataExtractor.java:36)
    at pl.edu.icm.cermine.ExtractionUtils.cleanMetadata(ExtractionUtils.java:101)
    at pl.edu.icm.cermine.InternalContentExtractor.doWork(InternalContentExtractor.java:341)
    at pl.edu.icm.cermine.InternalContentExtractor.doWork(InternalContentExtractor.java:320)
    at pl.edu.icm.cermine.InternalContentExtractor.getContentAsNLM(InternalContentExtractor.java:286)
    at pl.edu.icm.cermine.ContentExtractor.getContentAsNLM(ContentExtractor.java:612)
    at pl.edu.icm.cermine.ContentExtractor.getContentAsNLM(ContentExtractor.java:628)
    at pl.edu.icm.cermine.ContentExtractor.main(ContentExtractor.java:724)
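For context on the exception itself: `Matcher.appendReplacement` treats `$` in the replacement string as a group reference, so a replacement built from document text that ends in a literal `$` can raise this error. The sketch below reproduces that behaviour in isolation. It is not CERMINE's `ContentCleaner` code; the class name, the pattern, and the sample text are assumptions made up for illustration, and wrapping the replacement in `Matcher.quoteReplacement` is one common remedy, not necessarily the one used in the project.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HyphenJoinDemo {

    // Hypothetical pattern: a token, a hyphen at a line break, and the next token.
    private static final Pattern BROKEN_WORD = Pattern.compile("(\\S+)-\\n(\\S+)");

    // Joins the two halves, passing document text straight in as the replacement.
    // Any '$' or '\' in that text is interpreted by appendReplacement.
    static String joinUnsafe(String text) {
        Matcher m = BROKEN_WORD.matcher(text);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb, m.group(1) + m.group(2));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    // Same logic, but the replacement is quoted so it is taken literally.
    static String joinSafe(String text) {
        Matcher m = BROKEN_WORD.matcher(text);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb, Matcher.quoteReplacement(m.group(1) + m.group(2)));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "price in US-\n$ today";
        System.out.println(joinSafe(text));
        try {
            joinUnsafe(text);
        } catch (IllegalArgumentException e) {
            // The replacement "US$" ends with '$', so no group index follows it.
            System.out.println("unsafe variant failed: " + e.getMessage());
        }
    }
}
```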
Hi @mluerig
Thanks for reporting this. I just committed a bug fix; it should be included in the newest snapshot available here. If you still experience any problems, please let us know.
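On the second part of the question (skipping errors and continuing), one option that does not depend on any particular CERMINE version is to drive the extraction from a small wrapper that catches exceptions per file. The following is a minimal sketch under stated assumptions: `BatchRunner`, `PdfProcessor`, `processAll`, and `listPdfs` are hypothetical names, and the actual CERMINE call is deliberately left as a placeholder inside the lambda in `main`.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class BatchRunner {

    // Hypothetical per-file step; a real one would invoke the extraction library.
    interface PdfProcessor {
        void process(Path pdf) throws Exception;
    }

    // Runs the processor on every file and records failures instead of aborting.
    static Map<Path, Exception> processAll(List<Path> pdfs, PdfProcessor processor) {
        Map<Path, Exception> failures = new LinkedHashMap<>();
        for (Path pdf : pdfs) {
            try {
                processor.process(pdf);
            } catch (Exception e) {
                // Catches unchecked exceptions such as IllegalArgumentException too.
                failures.put(pdf, e);
            }
        }
        return failures;
    }

    // Collects all *.pdf files below a directory, in sorted order.
    static List<Path> listPdfs(Path dir) throws IOException {
        try (Stream<Path> files = Files.walk(dir)) {
            return files
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().toLowerCase().endsWith(".pdf"))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        List<Path> pdfs = listPdfs(Paths.get(args[0]));
        Map<Path, Exception> failures = processAll(pdfs, pdf -> {
            // Placeholder: call the extraction library for this file here.
            System.out.println("Processing " + pdf);
        });
        failures.forEach((pdf, e) -> System.err.println("Skipped " + pdf + ": " + e));
    }
}
```

Collecting the failures in a map keeps the run going and leaves a list of files to retry once a fixed build is available.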