Low performance with dask #5785
Comments
We should document the scaling in practice. Also, we should have logging options for rabit.
@MajorCarrot, we added two PRs on top of version 1.1, currently available only in master, which improve the performance of distributed mode with the 'hist' tree method:
Could you please check the latest code from master? I suppose you should see some improvement. P.S. XGB 1.1.1 is released; it solves an issue with a slower CPU version in the PIP package (#5720).
Thanks @SmirnovEgorRu, I will try it out and give you an update.
System Details:
XGBoost:
@MajorCarrot, it would be nice to try
Sorry for the extremely late reply. I have benchmarked with
1.2.0 (the nightly from AWS as of 9 July 2020). System details:
On a single node, using dask is slower than using the local interface due to the overhead of TCP sockets.
Hi,
I am trying to integrate our XGBoost-based model with Dask and am using the following function for training:
I am not getting high CPU usage (only 30-40% on each core), and training takes longer than with the implementation without Dask!
Am I doing something wrong or is there a bug which can be fixed?
Also, the logs are filled with messages like the following:
Is there a way to turn this off?
xgboost version: 1.1.0