Does DSL-JSON Scale? #257

Open
emanodame opened this issue Jun 18, 2023 · 5 comments

@emanodame

emanodame commented Jun 18, 2023

I am considering using DSL-JSON for my project and am curious whether DSL-JSON works well under highly concurrent workloads.

Specifically: hundreds to thousands of TPS on a web server, each request deserialising a large (~40 MB) JSON payload.

Are there any load tests available, or is anyone using this library at that scale?

From some quick experimentation, it looks like DSL-JSON struggles once 5-10 concurrent threads are all trying to deserialise the 40 MB file at the same time. Example of the JSON object model:

class Holder(val map: Map<String, CustomObject>)

class CustomObject(
    val field: String,
    val field2: String,
    val field3: String,
    val field4: String,
    val field5: String,
    val pointers: List<CustomObject>
)
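
For reference, here is a minimal sketch of the kind of call I am benchmarking. The file name and the reflection-based runtime setup are illustrative, not necessarily my exact configuration:

import com.dslplatform.json.DslJson
import com.dslplatform.json.runtime.Settings
import java.io.File

// One shared DslJson instance; Settings.withRuntime enables reflection-based
// binding for plain Kotlin classes (the annotation-processor route differs).
val dslJson = DslJson<Any>(Settings.withRuntime<Any>().includeServiceLoader())

fun main() {
    val bytes = File("payload.json").readBytes() // ~40 MB payload, path is illustrative

    // N threads all deserialising the same buffer concurrently.
    val threads = (1..10).map {
        Thread {
            val holder = dslJson.deserialize(Holder::class.java, bytes, bytes.size)
            println("parsed ${holder?.map?.size} top-level entries")
        }
    }
    threads.forEach { it.start() }
    threads.forEach { it.join() }
}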
@zapov
Member

zapov commented Jun 19, 2023

Hi,

DSL-JSON should work fine with highly concurrent workloads. There are even a few tweaks you can make to avoid thread locals and gain some additional performance.
But if you are hitting limits, it's most likely a Java/GC-related issue.
In that case it's always best to profile what's going on and see where the bottleneck is. It's unlikely to be in DSL-JSON, but if it is... we can certainly look at how to fix or work around it.

@emanodame
Author

Thanks for the response @zapov.

Can you please write down some of the tweaks you think will improve performance?
Feel free to link another post or example code.

I have done some profiling, and I saw a hotspot/bottleneck in the readContent method of the DslJsonConverter class. At 1 TPS, deserialisation takes < 100 ms; at 30+ TPS, I see times of 1-3 seconds per deserialisation.

Note: this is deserialisation of a 40 MB byte array, on 16 GB RAM and 2 vCPUs. If you wish, I can share further benchmarking statistics.
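
For completeness, the harness producing these numbers looks roughly like this (simplified; it reuses the shared dslJson instance from the sketch above, and the 1-second pacing is what I mean by TPS per thread):

import java.util.concurrent.Executors
import kotlin.system.measureTimeMillis

// Simplified load harness: `tps` threads each deserialise the same ~40 MB
// payload once per second and report the per-call latency.
fun loadTest(tps: Int, bytes: ByteArray, iterations: Int = 30) {
    val pool = Executors.newFixedThreadPool(tps)
    repeat(tps) {
        pool.submit {
            repeat(iterations) {
                val ms = measureTimeMillis {
                    dslJson.deserialize(Holder::class.java, bytes, bytes.size)
                }
                println("deserialise took $ms ms")
                Thread.sleep(1000)
            }
        }
    }
    pool.shutdown()
}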

@zapov
Member

zapov commented Jun 20, 2023

You can find various suggestions in the readme, but I doubt they will help you.
It sounds like each individual deserialization takes too long and you have too few CPUs to parallelize it.
The usual suggestions are to:

  • use byte[] instead of some "smart" structure like ByteBuffer
  • reuse this byte[]/streams instead of allocating them
  • prefer local JsonReaders instead of thread locals
  • avoid strings in favor of more exact data types

but none of that will help if a single deserialization takes too long and produces too much garbage. A rough sketch of the first two points is below.
Anyway... I doubt this has anything to do with DSL-JSON, but you can paste a YourKit profile or something like that.
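
To make the first two bullets concrete (the worker class, buffer sizes, and names here are illustrative, not a prescribed API):

import com.dslplatform.json.DslJson
import com.dslplatform.json.runtime.Settings
import java.io.InputStream

// Build DslJson once and share it: instances are thread-safe and costly to create.
val sharedDslJson = DslJson<Any>(Settings.withRuntime<Any>().includeServiceLoader())

// Each worker owns a plain byte[] it reuses across requests, instead of
// allocating a fresh ~40 MB array (or wrapping a ByteBuffer) per call.
class DeserializeWorker(initialCapacity: Int = 48 * 1024 * 1024) {
    private var buffer = ByteArray(initialCapacity)

    fun read(stream: InputStream): Holder? {
        var size = 0
        while (true) {
            if (size == buffer.size) buffer = buffer.copyOf(buffer.size * 2) // grow rarely
            val count = stream.read(buffer, size, buffer.size - size)
            if (count == -1) break
            size += count
        }
        return sharedDslJson.deserialize(Holder::class.java, buffer, size)
    }
}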

@emanodame
Author

emanodame commented Jun 20, 2023

I have attached an example of a profiler CPU capture. As you can see, read/readContent are the hotspots.

As stated above, with 1 or 2 threads running concurrently and each executing a deserialise every second, performance is excellent (< 100 ms). When increasing this to 30 threads running in ECS Fargate, each deserialisation operation takes 1-3 seconds.

@zapov
Member

zapov commented Jun 20, 2023

If this is a 2-CPU machine, I would assume 2-4 threads is the optimal pool size; beyond that, context switching might be hurting you too much.
So I would scale it up slowly and watch for the point where performance degradation spikes.
8% in getNextToken sounds like reading bytes is taking too much time, which I would assume is mostly the cost of bouncing this 40 MB structure across threads and reading bytes off it.
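
A quick back-of-envelope check with your own numbers (estimates only) shows why latency explodes well before 30 TPS:

single parse:    40 MB in < 100 ms          -> 400+ MB/s per core
2 vCPU ceiling:  ~2 x 400 MB/s = ~800 MB/s  -> ~20 requests/s at 40 MB each
offered load:    30 requests/s x 40 MB      -> 1.2 GB/s

The offered load exceeds what 2 vCPUs can parse, so requests queue behind each other and the observed per-request latency stretches into seconds no matter how fast the parser itself is.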

Anyway, good luck with the performance profiling and with figuring out where this bottleneck is coming from. If you need further help from me, you can ask about a consulting contract via email.
