Significant performance regression in 1.0.80 #721


Closed
felixkrull-neuland opened this issue Apr 21, 2023 · 8 comments · Fixed by #731 or #790

felixkrull-neuland commented Apr 21, 2023

We're seeing a significant performance regression after updating from 1.0.79 to 1.0.80 when validating large JSON documents ("large" = up to 300 MB non-pretty-printed). This regression is so severe that we had to revert to 1.0.79 because it makes 1.0.80 unusable for us.

  • Using 1.0.79, validating our largest document (330 MB) takes ~10 seconds on my dev laptop. This matches my expectations based on the max runtime metrics we're getting from production.
  • Using 1.0.80, I can't tell you how long validating the same document takes because it hasn't finished after 30 minutes.
  • Using 1.0.80, our metrics show spikes up to several hours, and that's just the runs that weren't interrupted by something. I don't even know if those hours are the really big documents or just the moderately big ones, so the top end might be even worse.
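For reference, the measurements above are just timed calls to the library's standard validate API. A minimal sketch of that kind of harness (file names and the schema draft version are placeholder assumptions, not our actual setup):

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.networknt.schema.JsonSchema;
import com.networknt.schema.JsonSchemaFactory;
import com.networknt.schema.SpecVersion;
import com.networknt.schema.ValidationMessage;

import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Set;

public class LargeDocumentValidation {
    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();
        // schema.json and document.json stand in for the real (private) inputs.
        try (InputStream schemaStream = Files.newInputStream(Paths.get("schema.json"))) {
            JsonSchema schema = JsonSchemaFactory
                    .getInstance(SpecVersion.VersionFlag.V7) // draft version is an assumption
                    .getSchema(schemaStream);
            JsonNode document = mapper.readTree(Files.newInputStream(Paths.get("document.json")));

            long start = System.nanoTime();
            Set<ValidationMessage> errors = schema.validate(document);
            System.out.printf("validated in %d ms, %d errors%n",
                    (System.nanoTime() - start) / 1_000_000, errors.size());
        }
    }
}
```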

The regression is also clearly visible with https://github.com/networknt/json-schema-validator-perftest, albeit less extreme (both runs were on my dev laptop, using the same JVM, etc.):

  • 1.0.79:
Initial validation: 11 ms
Iteration 0 (in 4 ms)
Iteration 20 (in 58 ms)
Iteration 40 (in 92 ms)
Iteration 60 (in 127 ms)
Iteration 80 (in 164 ms)
Iteration 100 (in 199 ms)
Iteration 120 (in 221 ms)
Iteration 140 (in 242 ms)
Iteration 160 (in 262 ms)
Iteration 180 (in 282 ms)
Iteration 200 (in 302 ms)
Iteration 220 (in 321 ms)
Iteration 240 (in 340 ms)
Iteration 260 (in 360 ms)
Iteration 280 (in 379 ms)
Iteration 300 (in 398 ms)
Iteration 320 (in 417 ms)
Iteration 340 (in 436 ms)
Iteration 360 (in 455 ms)
Iteration 380 (in 474 ms)
Iteration 400 (in 493 ms)
Iteration 420 (in 511 ms)
Iteration 440 (in 530 ms)
Iteration 460 (in 549 ms)
Iteration 480 (in 568 ms)
END -- time in ms: 586
  • 1.0.80:
Initial validation: 25 ms
Iteration 0 (in 19 ms)
Iteration 20 (in 164 ms)
Iteration 40 (in 307 ms)
Iteration 60 (in 427 ms)
Iteration 80 (in 543 ms)
Iteration 100 (in 616 ms)
Iteration 120 (in 688 ms)
Iteration 140 (in 760 ms)
Iteration 160 (in 834 ms)
Iteration 180 (in 907 ms)
Iteration 200 (in 979 ms)
Iteration 220 (in 1051 ms)
Iteration 240 (in 1125 ms)
Iteration 260 (in 1197 ms)
Iteration 280 (in 1268 ms)
Iteration 300 (in 1339 ms)
Iteration 320 (in 1414 ms)
Iteration 340 (in 1485 ms)
Iteration 360 (in 1557 ms)
Iteration 380 (in 1629 ms)
Iteration 400 (in 1702 ms)
Iteration 420 (in 1773 ms)
Iteration 440 (in 1845 ms)
Iteration 460 (in 1917 ms)
Iteration 480 (in 1991 ms)
END -- time in ms: 2060

I profiled my test program and found that a lot of time is spent on HashSet operations in CollectorContext.copyEvaluatedProperties and RefValidator.validate. I suspect the problems were introduced by #714, since it touched exactly those files. I have some screenshots of IntelliJ's profiler results to illustrate this, but unfortunately not much else:
[Screenshot of IntelliJ profiler call tree: HashSet operations in RefValidator.validate take up 74.9% of the runtime.]
[Screenshot of IntelliJ profiler flame graph, again showing a lot of time spent in copyEvaluatedProperties and other HashSet operations.]
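To illustrate why this is expensive: if each $ref validation snapshots the full set of evaluated property paths, every copy is O(n), so total work grows quadratically with document size. A minimal sketch of that pattern (illustrative only, not the library's actual code):

```java
import java.util.HashSet;
import java.util.Set;

// Models the pattern the profiler points at: each iteration adds one
// evaluated property, then snapshots the whole set. Total work is
// quadratic in the number of iterations.
public class CopyCostSketch {
    public static void main(String[] args) {
        Set<String> evaluated = new HashSet<>();
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < 10_000; i++) {
            evaluated.add("$.items[" + i + "].name");        // one more evaluated property
            Set<String> snapshot = new HashSet<>(evaluated); // per-ref copy, O(n) each time
            sink += snapshot.size();                         // keep the copy from being optimized away
        }
        System.out.printf("10k iterations, %d total copied entries, %d ms%n",
                sink, (System.nanoTime() - start) / 1_000_000);
    }
}
```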

Unfortunately, I can't share the schema or the documents, and crafting a synthetic example would be a lot of work that I'd rather avoid.

(Yes, I did see #564, but to be quite honest I didn't want to lump in such a significant regression from one release to the next with a year-old issue, even if they are related.)

stevehu (Contributor) commented Apr 21, 2023

@felixkrull-neuland Thanks a lot for raising this issue. We will fix it and hope you can retest and verify the fix.

fdutton self-assigned this Apr 21, 2023
fdutton (Contributor) commented Apr 21, 2023

@stevehu @felixkrull-neuland I'm investigating.

fdutton (Contributor) commented Apr 24, 2023

@stevehu @felixkrull-neuland An update...

The HashSets were definitely a problem, so I replaced them with ArrayLists, but that only reduced the time by about 33%. The other time sink was creating an EnumSet for each keyword every time a schema is created (about 15 million times in the performance tests). I reworked this to cache the results, which reduced the time by a further 20–25%. Unfortunately, the overall time is still ~50% higher than 1.0.79. As best I can tell, this is due to the increased coverage obtained by not short-circuiting the anyOf keywords.
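A sketch of the caching idea (all names here are hypothetical, not the actual patch): compute the EnumSet for a given keyword combination once and share it, treating the cached set as read-only:

```java
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the caching approach described above.
enum Keyword { PROPERTIES, ITEMS, REF, ANY_OF, ALL_OF }

class KeywordSetCache {
    private static final Map<String, Set<Keyword>> CACHE = new ConcurrentHashMap<>();

    // Builds the EnumSet once per distinct keyword signature instead of on
    // every schema creation. Callers must treat the result as immutable,
    // since it is shared across schemas.
    static Set<Keyword> forSignature(String signature) {
        return CACHE.computeIfAbsent(signature, s -> {
            EnumSet<Keyword> set = EnumSet.noneOf(Keyword.class);
            for (String name : s.split(",")) {
                set.add(Keyword.valueOf(name));
            }
            return set;
        });
    }
}
```

With ~15 million schema creations, even a cheap EnumSet allocation adds up; a shared cache reduces that to one allocation per distinct keyword combination.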

I'm currently investigating ways to optimize PropertiesValidator and AdditionalPropertiesValidator, which are consuming the bulk of the remaining time.

fdutton (Contributor) commented Apr 30, 2023

@felixkrull-neuland Please try 1.0.81 and let us know if this is still an issue.

felixkrull-neuland (Author) commented:

1.0.81 is better than 1.0.80, but it's still much slower than 1.0.79 for us: maybe two to three times slower on average, with much worse worst-case behaviour.

My thinking right now is that we're going to stick with 1.0.79 and find a way to avoid validating hundreds of megabytes of JSON all the time (which we were planning to do anyway). I appreciate the optimization efforts, but given that the new version does more and is more correct, AFAIU, I don't know whether it's feasible to make validation as fast as we would like.

LemurP commented May 20, 2023

Thanks for providing this library as open-source software for the world to use.
And thank you @felixkrull-neuland for the thorough issue description. Finding this issue helped us locate our problem and arrive at the same solution as you: sticking with version 1.0.79.

We've seen the same significant regression from 1.0.79 to 1.0.80 and 1.0.81 in an application which parses between ten and a few hundred megabytes of JSON at a time.

This performance reduction is so significant that I think the description of this library in the README needs substantial rewording. I would be extremely surprised by the performance of version 1.0.81 if I chose this library today based on the README's description.

I appreciate the performance improvements that have been made, and 1.0.81 is much better than 1.0.80 in that regard. But it is still not possible for us to use either of these versions.

stevehu (Contributor) commented May 20, 2023

We want to resolve this issue, but it is very hard for us to replicate without an example of a big schema and a big JSON document for testing. Does anybody know where we can find a large JSON document with a matching schema? Thanks.

fdutton (Contributor) commented May 28, 2023

@stevehu @felixkrull-neuland @LemurP I'm reopening this issue so that I can submit additional changes to address it.

What makes this difficult, in addition to our not having good samples of large JSON instances, is that there is a lot of variability in json-schema-validator-perftest from one run to the next. For example, my previous pull request showed a time close to twice as fast as 1.0.79 (280 ms), and showed 1.0.80 as 50% faster. Obviously, the performance test needs improvement.
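For what it's worth, run-to-run variance can usually be tamed with a warmup phase and a median over several timed batches rather than a single cumulative total. A rough sketch (not the actual perftest code; validateOnce is a placeholder):

```java
import java.util.Arrays;

public class SteadierTiming {
    public static void main(String[] args) {
        for (int i = 0; i < 500; i++) validateOnce();       // warmup: let the JIT settle

        long[] samples = new long[21];
        for (int s = 0; s < samples.length; s++) {
            long start = System.nanoTime();
            for (int i = 0; i < 100; i++) validateOnce();   // one timed batch
            samples[s] = (System.nanoTime() - start) / 1_000_000;
        }
        Arrays.sort(samples);
        System.out.println("median batch time: " + samples[samples.length / 2] + " ms");
    }

    private static void validateOnce() {
        // placeholder for schema.validate(document) in the real harness
    }
}
```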

@felixkrull-neuland Can you provide an obfuscated version of a schema that is significantly slower? If so, I think I can generate JSON instances from random data.

We have added a lot of additional coverage from the JSON Schema Test Suite (~4000 tests now pass that did not before). The biggest change is support for unevaluatedProperties and unevaluatedItems, which requires the ability to transitively identify unevaluated properties and items no matter how deep in the JSON instance they occur. I suspect this may be the cause of your issue. We could add an option to disable these checks, with the understanding that you would be sacrificing fidelity for performance.

It would also help to know if you are enabling fail-fast or disabling schema caching.
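For reference, the fail-fast setting lives on SchemaValidatorsConfig. The unevaluated-analysis toggles mentioned above landed with the later fix, and their exact setter names in this sketch are assumptions based on the commit message quoted below, not confirmed API:

```java
import com.networknt.schema.JsonSchema;
import com.networknt.schema.JsonSchemaFactory;
import com.networknt.schema.SchemaValidatorsConfig;
import com.networknt.schema.SpecVersion;

public class ConfigSketch {
    static JsonSchema buildSchema(String schemaJson) {
        SchemaValidatorsConfig config = new SchemaValidatorsConfig();
        config.setFailFast(false); // existing option; enabling it changes timing behaviour

        // The setters below are assumptions modeled on the commit message
        // "Adds configuration parameters to disable unevaluatedItems and
        // unevaluatedProperties analysis" -- check the release for the real names.
        // config.setDisableUnevaluatedItems(true);
        // config.setDisableUnevaluatedProperties(true);

        return JsonSchemaFactory
                .getInstance(SpecVersion.VersionFlag.V202012) // draft version is an assumption
                .getSchema(schemaJson, config);
    }
}
```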

fdutton reopened this May 28, 2023
fdutton pushed a commit that referenced this issue May 29, 2023
stevehu pushed a commit that referenced this issue Jun 5, 2023
* Simplifies how evaluated properties and array items are tracked in order to improve performance.

Resolves #721

* Corrects issue with deserializing JSON Schema Test Suite tests.

Resolves #804

* Adds configuration parameters to disable unevaluatedItems and unevaluatedProperties analysis


Co-authored-by: Faron Dutton <[email protected]>