DHT performance degrades with more values in the ring #128
Comments
Running the same setup now for 2 days straight. Max values go up as far as 190 seconds!
Thanks for the report. Can you try the latest 5.0 release? It's still beta, but more stable than 4.4. Thanks.
You mean beta8? We are currently working on including it; I will post results as soon as we have some.

An update on the issue: after approx. 50k datasets, the DHT was "full", so we stopped the test. "Full" meaning requests took around 20 minutes (!!!) regardless of whether we tried to read or write. I figure that was mainly due to RAM limitations, as our nodes each have just 1 GB of memory. We logged all the data we pushed to the ring and had logged approx. 1 GB of data by around 50k datasets, so that might explain it. Anyhow, looking into the log files, the following line showed up a lot:
In the meantime, here is some data from the test I ran. I calculated average, minimum, maximum, and median request times for each 1000 requests. The missing max value was 227.5485981; I deleted it to keep the chart readable. After 50k requests, the data got MUCH worse...
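The per-1000-request summary above could be reproduced with a short script like this. It is a minimal sketch, assuming the logged request times are available as a plain list of seconds; the function name and bucket size parameter are mine, not from the original test setup.

```python
import statistics

def bucket_stats(times, bucket_size=1000):
    """Summarise request durations in consecutive buckets of bucket_size.

    Returns one (average, minimum, maximum, median) tuple per full bucket,
    mirroring the per-1000-request summary described above.
    """
    stats = []
    for i in range(0, len(times) - bucket_size + 1, bucket_size):
        bucket = times[i:i + bucket_size]
        stats.append((statistics.mean(bucket), min(bucket),
                      max(bucket), statistics.median(bucket)))
    return stats
```

Plotting the max column of each bucket against the bucket index is what makes the single extreme value (227.5 s) dominate the chart, which is why dropping it helps readability.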
Have you used disk-based storage?
Hi,
I stumbled upon an issue with the ring when it is flooded with values: as more values are stored in the ring, the response time increases dramatically.
Setup: I have 3 virtual nodes running. Not very performant ones, but oh well... They run Debian Wheezy, TomP2P version 4.4 (via Maven), and a Jetty server that receives REST requests with data to be written to the ring.
Now I wrote a small shell script that pushes a key-value pair to one of the nodes in a loop, i.e. I write as much and as fast as I can to the ring. The script measures and logs the time each request takes to complete.
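The measurement loop could be sketched as follows. This is not the original script (which was shell-based); the `put_value` callback, key/value naming, and function signature are all assumptions made so the timing logic is testable without a running node.

```python
import time

def run_benchmark(put_value, n_requests):
    """Issue n_requests sequential writes and record each request's duration.

    put_value performs one write (e.g. an HTTP PUT to the Jetty endpoint);
    it is injected so the loop itself can be exercised offline.
    """
    durations = []
    for i in range(n_requests):
        start = time.monotonic()
        put_value(f"key-{i}", f"value-{i}")
        durations.append(time.monotonic() - start)
    return durations
```

In the real test, each duration would be appended to a log file instead of kept in memory, so the measurement itself does not consume the nodes' scarce RAM.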
Results: At first, request times are OK: an average of about 1.0 seconds, a median of 0.99, and a maximum of 1.3. Interestingly, there is a recurring outlier: approx. every 50th request takes significantly longer to complete (around 1.5 seconds in the beginning).
Observing this for a few thousand requests, the average and median request times remain close to 1.0 to 1.2 seconds, while the duration of this recurring outlier increases linearly! After as little as 5k requests, it already takes 3.8 seconds!
Apparently, as more values are written to the ring, performance gets worse. Big time! After approx. 33k requests, the outliers take up to 75 seconds (!!!!!) to complete, while the median duration remains close to what it was in the beginning: still 1.01 seconds (!!!!!), with the average up to 1.8 seconds (mainly due to the outlier, I guess).
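To confirm that only the periodic outlier degrades while the baseline stays flat, the trace could be split by position. A minimal sketch, assuming the every-50th-request pattern observed above (the period and function name are mine):

```python
def split_outliers(times, period=50):
    """Split a request-time trace into the suspected periodic outliers
    (every period-th request) and the remaining baseline requests."""
    outliers = [t for i, t in enumerate(times, start=1) if i % period == 0]
    baseline = [t for i, t in enumerate(times, start=1) if i % period != 0]
    return outliers, baseline
```

Comparing the median of `baseline` (expected to stay near 1.0 s) with the trend of `outliers` (expected to grow linearly) would make the two behaviours visible separately instead of blended into one average.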
Is this a known issue?
Graph:
raw data: requesttimes.txt