Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Factory for SAXParser parsing XML without DOCTYPE #632

Merged
merged 1 commit into from
Jul 10, 2023

Conversation

jukzi
Copy link
Contributor

@jukzi jukzi commented Jun 23, 2023

External entity resolution is not supported by PDE (see PDECoreMessages.XMLErrorReporter_ExternalEntityResolution) but still the SAXParser did follow external links where DefaultHandler.resolveEntity was not overwritten.
At many places PDE already overwrote DefaultHandler.resolveEntity to prevent external resolution. With the new configuration that method is not even called anymore.

This change offers and uses a configuration that

  • reports an Exception if .xml contains DOCTYPE or
  • does just ignore external references (as a fall back if the exception would show up to cause trouble).

Also the caching of used parsers in possibly other threads is removed because the SAXParser is not guaranteed to be thread-safe. Only the factory is reused, because that is effectively final after creation. Reusing SAXParser is not a big help nowadays - see XmlParserFactoryTest.main(String[]) for performance test. In my measurement successive parser creations takes only ~ 0.06 ms.

@github-actions
Copy link

github-actions bot commented Jun 23, 2023

Test Results

     252 files  +  3       252 suites  +3   57m 2s ⏱️ - 10m 52s
  3 293 tests +  4    3 266 ✔️ +  4  24 💤 ±0  1 +1  2 🔥  - 1 
10 176 runs  +12  10 101 ✔️ +12  72 💤 ±0  1 +1  2 🔥  - 1 

For more details on these failures and errors, see this check.

Results for commit 377b2a4. ± Comparison against base commit e5217a8.

♻️ This comment has been updated with latest results.

@jukzi jukzi force-pushed the xmlparser branch 3 times, most recently from 1b3a7d9 to c3eaa6e Compare July 10, 2023 09:50
@jukzi
Copy link
Contributor Author

jukzi commented Jul 10, 2023

random fails:
BundleVersionTests #553
PlainJavaProjectTest #555

@jukzi
Copy link
Contributor Author

jukzi commented Jul 10, 2023

random failing ApiDescriptionProcessorTests #647

External entity resolution is not supported by PDE (see
PDECoreMessages.XMLErrorReporter_ExternalEntityResolution) but still the
SAXParser did follow external links where DefaultHandler.resolveEntity
was not overwritten.
At many places PDE already overwrote DefaultHandler.resolveEntity to
prevent external resolution. With the new configuration that method is
not even called anymore.

This change offers and uses a configuration that
* reports an Exception if .xml contains DOCTYPE or
* does just ignore external references (as a fall back if the exception
would show up to cause trouble).

Also the caching of used parsers in possibly other threads is removed
because the SAXParser is not guaranteed to be thread-safe. Only the
factory is reused, because that is effectively final after creation.
Reusing SAXParser is not a big help nowadays - see
XmlParserFactoryTest.main(String[]) for performance test.
In my measurement successive parser creations takes only ~ 0.06 ms.
@jukzi
Copy link
Contributor Author

jukzi commented Jul 10, 2023

ignoring random failing tests

@jukzi jukzi merged commit c8754be into eclipse-pde:master Jul 10, 2023
@jukzi jukzi deleted the xmlparser branch July 10, 2023 12:14
@HannesWell
Copy link
Member

Nice job on this and the other PRs regarding XML-parser creation.

Since the XmlTransformerFactory and XmlDocumentBuilderFactory class only contain one not too big method, I think it would be good to move them to the XmlParserFactory introduced in this change. The method names are already very expressive so I don't think it is necessary to have them in a dedicated class.

}
assertTrue(firstLine, firstLine.startsWith("GET"));
assertFalse("Server received secret: " + firstLine, firstLine.contains("secret")); // var3
assertFalse("Server was contacted", true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this is always throwing an exception.
Can you tell why the test passes nevertheless?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One test expects an exception (but does not come so far because of another exception) while another test makes sure the code is not called at all.

@jukzi
Copy link
Contributor Author

jukzi commented Jul 12, 2023

Since the XmlTransformerFactory and XmlDocumentBuilderFactory class only contain one not too big method, I think it would be good to move them to the XmlParserFactory introduced in this change. The method names are already very expressive so I don't think it is necessary to have them in a dedicated class.

#667

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants