Upgrading manticore from 6.0.4_230314.1a3a4ea82-1.el8 to 6.2.12_230822.dc5144d35-1.el8 results in significant cpu usage increase #1563
Hi
Can you show the graph please?
Unfortunately I don't have graphs for it. However, internally every search shows output for admins in the console, and overall that output seems similar right now.
Regarding the above: I just rebooted ServerB to rule out kernel version differences as the cause of the CPU usage difference. Of note, both servers are now running the same BIOS version, mainboard firmware version, hardware and network specs, OS, and kernel version.
Here's what you can do for a fair comparison:
If after that you can still see that the new version consumes more CPU without improving performance, it will indeed be a problem we'd like to look into further.
Conclusion: I highly suspect 6.2.12_230822.dc5144d35-1.el8 is slower. I have been running the servers with the same setup, hardware, BIOS/firmware version, BIOS settings, OS version, and the same weighted load balancing for weeks now.
To see whether the increased server load helped with query performance, I took the "real" time of the last 1,000,000 queries from the server logs and averaged them, starting with:
tail -n 1000000 manticore_query.log > last_1000000_lines.log
ServerA was upgraded from manticore server 6.0.4_230314.1a3a4ea82-1.el8 to 6.2.12_230822.dc5144d35-1.el8 during the black lines.
ServerA result: 0.141038
Of note:
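For reference, a minimal sketch of one way such an average could be computed from the extracted lines (the exact averaging command isn't shown above; this assumes the sphinxql query-log format, where each entry carries "real <seconds>" in its comment header):

awk '
  # sum the value that follows each "real" token and count the entries
  { for (i = 1; i <= NF; i++) if ($i == "real") { sum += $(i + 1); n++ } }
  END { if (n) printf "average real time: %.6f over %d queries\n", sum / n, n }
' last_1000000_lines.log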
Thanks for providing the additional info @digirave
How do the loads differ? If you have older query logs from when both servers were running 6.0.4, can you run the same script to compare the response times for that period?
Conclusion: Before the update to 6.2.12_230822.dc5144d35-1.el8, query times didn't differ much between the servers, and in the past the non-upgraded server actually had a minuscule edge in response time. However, the upgrade to 6.2.12_230822.dc5144d35-1.el8 seems to have caused a significant increase in server load with slightly worse query performance.
They shouldn't differ long term. We load balance the same amount to both servers.
Checked that one million queries were analyzed before further testing:
Commands run for further testing:
Checked that one million queries were analyzed:
ServerA: 0.13555
Can you provide your table schema, a dozen sample documents, and your typical queries? I'd like us to benchmark that on our side.
This would take some time and resources.
Sure. [email protected]
At this moment this would take a lot of resources, so we'd like to avoid it as much as possible. I am not 100% sure, but I think I found a workaround: pseudo_sharding = 0 seems to decrease server load considerably, while query performance seems comparable to the other server (too little difference to tell right now). Additionally, in preliminary tests on prior versions, pseudo_sharding = 0 doesn't seem to have such a noticeable effect (so little I can't tell right now).
My current conclusion is that pseudo_sharding in our case has a significant negative performance impact in version 6.2.12_230822.dc5144d35-1.el8 which is not visible in 6.0.4_230314.1a3a4ea82-1.el8. The workaround seems to be pseudo_sharding = 0.
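A minimal sketch of where this workaround sits, assuming a plain-text Manticore configuration file (the rest of the searchd section is omitted here):

searchd {
    # ... existing searchd settings ...
    # workaround discussed above: disable pseudo sharding (the default is 1)
    pseudo_sharding = 0
}

After editing the config, restart searchd for the setting to apply.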
Here are screenshots. ServerB is 6.0.4_230314.1a3a4ea82-1.el8 for the whole graph, first with the default pseudo_sharding and then after changing to pseudo_sharding = 0, with no significant difference (probably a small difference, but hard to tell).
@digirave BTW regarding
could you also check the 50p, 95p and 99p metrics?
Statistics for default pseudo_sharding:
- Server A, version 6.2.12_230822.dc5144d35-1.el8, with default pseudo_sharding
- Server B, version 6.0.4_230314.1a3a4ea82-1.el8, with default pseudo_sharding
Statistics for pseudo_sharding = 0:
- Server A, version 6.2.12_230822.dc5144d35-1.el8, with pseudo_sharding = 0
- Server B, version 6.0.4_230314.1a3a4ea82-1.el8, with pseudo_sharding = 0
Commands used:
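For reference, a minimal sketch of one way such percentile figures could be pulled from the query log (these are not the commands actually used above; it assumes the sphinxql log format with "real <seconds>" per entry and reuses the last_1000000_lines.log extract from earlier):

grep -o 'real [0-9.]*' last_1000000_lines.log | awk '{print $2}' | sort -n > times.sorted
total=$(wc -l < times.sorted)
for p in 50 95 99; do
  idx=$(( (total * p + 99) / 100 ))   # ceiling of p% of the sample count
  printf "p%s: %s\n" "$p" "$(sed -n "${idx}p" times.sorted)"
done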
As discussed with @glookka, even though we say in the docs that when you do manual sharding, you can disable pseudo sharding, in this case (21 shards, presumably 25 cpu cores, |
Describe the bug
Upgrading manticore from 6.0.4_230314.1a3a4ea82-1.el8 to 6.2.12_230822.dc5144d35-1.el8 results in significant cpu usage increase
To Reproduce
Steps to reproduce the behavior:
max_threads_per_query = 20
threads = 200
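For context, a sketch of how these directives sit in a plain-text Manticore config; only the values reported above are shown, the rest of the searchd section is assumed:

searchd {
    # settings reported in the reproduction steps
    threads = 200
    max_threads_per_query = 20
}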
Expected behavior
Similar performance is expected
Describe the environment:
Linux [REDACTED] 4.18.0-477.27.1.el8_8.x86_64 #1 SMP Thu Aug 31 10:29:22 EDT 2023 x86_64 x86_64 x86_64 GNU/Linux
Messages from log files:
Additional context
We have two servers with identical hardware and the exact same Manticore setup.
They create their indexes independently of each other but have the same configuration.
They receive the same volume of queries.
ServerA was upgraded from manticore server 6.0.4_230314.1a3a4ea82-1.el8 to 6.2.12_230822.dc5144d35-1.el8 during the black lines.
ServerB is manticore-server-6.0.4_230314.1a3a4ea82-1.el8.x86_64 in the whole graph
You can easily see ServerA has increased CPU load.
Query response times seem similar overall.
One difference between the servers is that ServerA was rebooted, while ServerB has the same RPMs for everything other than Manticore but was not rebooted.
serverA:
serverB: