[2.4] Add xgboost histogram_based_v2 #2382

Merged: 2 commits merged into NVIDIA:2.4 from add_xgboost_histogram_v2 on Mar 7, 2024

@YuanTingHsieh (Collaborator) commented on Mar 5, 2024

Description

  • Add the XGBoost histogram_based_v2 code.
    Examples will follow in a separate PR.

Conceptually, we use the following paradigm:

[Image: Screenshot 2024-03-05 at 5 23 49 PM]

Detailed Implementation

The open-source Federated XGBoost (C++) uses gRPC as its communication protocol.
To use FLARE as the communicator, we route XGB’s gRPC messages through FLARE.
To do so, we point the server endpoint of each XGB client at a local gRPC server (LGS) running within the FLARE client.
Similarly, a local gRPC client (LGC) on the FL server interacts with the XGB server.

As illustrated below:

[Image: Screenshot 2024-03-06 at 1 36 37 PM]
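
The endpoint redirection described above can be sketched as a small helper that builds the communicator settings for one XGB client. This is only an illustration, not code from this PR: the function name `federated_env_for_local_server` and the port number are hypothetical, and the dictionary keys are assumed to match those used by the federated communicator in XGBoost 2.0.x.

```python
def federated_env_for_local_server(rank: int, world_size: int, lgs_port: int) -> dict:
    """Build communicator settings that point an XGB client at a local gRPC server.

    Hypothetical helper: instead of the real XGB server address, the client is
    given the address of the LGS running inside its own FLARE client.
    """
    if not 0 <= rank < world_size:
        raise ValueError(f"rank {rank} is outside world size {world_size}")
    return {
        # Key names assumed from XGBoost 2.0.x federated communicator usage.
        "xgboost_communicator": "federated",
        "federated_server_address": f"localhost:{lgs_port}",  # the LGS, not the XGB server
        "federated_world_size": world_size,
        "federated_rank": rank,
    }


# Port 9090 is an arbitrary example value.
env = federated_env_for_local_server(rank=0, world_size=2, lgs_port=9090)
print(env["federated_server_address"])
```

With an XGBoost build that includes the federated plugin, a dictionary like this would typically be passed as keyword arguments to `xgb.collective.CommunicatorContext`, so the XGB client itself needs no knowledge of FLARE.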

The message path between the XGB Client and the XGB Server is as follows:

  • The XGB Client generates a gRPC message and sends it to the LGS in the FLARE Client.
  • The FLARE Client forwards the message to the FLARE Server as a reliable FLARE message.
  • The FLARE Server uses the LGC to send the message to the XGB Server.
  • The XGB Server sends the response back to the LGC in the FLARE Server.
  • The FLARE Server sends the response back to the FLARE Client.
  • The FLARE Client sends the response back to the XGB Client via the LGS.
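
The round trip above can be simulated in plain Python to make the hop order concrete. This is a toy model under stated assumptions: it uses no gRPC and no FLARE APIs, the `Hop` class and `xgb_server` function are hypothetical, and each hop simply records when it sees the request and the reply.

```python
from dataclasses import dataclass
from typing import Callable, List

Handler = Callable[[bytes], bytes]


@dataclass
class Hop:
    """A relay that logs the request, forwards it, then logs the reply."""
    name: str
    next_hop: Handler
    trace: List[str]

    def __call__(self, request: bytes) -> bytes:
        self.trace.append(f"{self.name}: request")
        reply = self.next_hop(request)
        self.trace.append(f"{self.name}: reply")
        return reply


trace: List[str] = []


def xgb_server(request: bytes) -> bytes:
    """Stand-in for the XGB Server: acknowledges whatever it receives."""
    trace.append("XGB Server: handle")
    return b"ack:" + request


# Build the chain from the server side outwards.
lgc = Hop("LGC (FLARE Server)", xgb_server, trace)
flare_link = Hop("FLARE Client -> FLARE Server", lgc, trace)
lgs = Hop("LGS (FLARE Client)", flare_link, trace)

# The XGB Client only ever talks to the LGS.
reply = lgs(b"allgather")
for line in trace:
    print(line)
```

The trace shows the request passing through the LGS, the FLARE link, and the LGC before reaching the XGB Server, and the reply unwinding through the same hops in reverse.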

Code structure explanation

We introduce the following classes:

  • AdaptorController: the workflow on the FL server side that invokes the server-side adaptor.
  • AdaptorExecutor: the executor on the FL client side that invokes the client-side adaptor.
  • XGBAdaptor: integrates FLARE with the XGBoost target (server or client) at run time.
  • XGBRunner: implements the XGB (server or client) processing logic. XGBoost-related federated API calls live here, as does the call to "xgb.train".
  • GrpcServer, GrpcClient, proto folder: these communicate with the gRPC server of the open-source Federated XGBoost (C++), so the protobuf definition is taken from the xgboost repo (https://github.com/dmlc/xgboost/blob/v2.0.3/plugin/federated/federated.proto).
  • Sender: a wrapper for sending aux_message to the FL server side; it will be enhanced later.
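
The split between adaptor and runner described above can be illustrated with a minimal sketch. None of the names or method signatures below come from this PR: `SketchAdaptor`, `SketchRunner`, `RecordingClientRunner`, `configure`, `start`, and `run` are all hypothetical and only show the division of responsibility, with the adaptor handling lifecycle and configuration and the runner holding the XGBoost-specific logic.

```python
from abc import ABC, abstractmethod
from typing import List, Tuple


class SketchRunner(ABC):
    """Hypothetical runner role: owns the XGBoost-specific processing logic."""

    @abstractmethod
    def run(self, ctx: dict) -> None:
        ...


class SketchAdaptor:
    """Hypothetical adaptor role: bridges the FL workflow and a runner."""

    def __init__(self, runner: SketchRunner):
        self.runner = runner
        self.config: dict = {}
        self.started = False

    def configure(self, config: dict) -> None:
        # Copy so later changes by the caller do not affect the run.
        self.config = dict(config)

    def start(self) -> None:
        self.runner.run(self.config)
        self.started = True


class RecordingClientRunner(SketchRunner):
    """Records what it was asked to do instead of training a model."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, int, int]] = []

    def run(self, ctx: dict) -> None:
        # A real client runner would call xgb.train(...) at this point.
        self.calls.append(("train", ctx["rank"], ctx["num_rounds"]))


runner = RecordingClientRunner()
adaptor = SketchAdaptor(runner)
adaptor.configure({"rank": 1, "num_rounds": 10})
adaptor.start()
print(runner.calls)
```

Keeping the training call inside the runner means the adaptor can be reused for both server and client targets by swapping in a different runner.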

Types of changes

  • Non-breaking change (fix or new feature that would not break existing functionality).
  • Breaking change (fix or new feature that would cause existing functionality to change).
  • New tests added to cover the changes.
  • Quick tests passed locally by running ./runtest.sh.
  • In-line docstrings updated.
  • Documentation updated.

@chesterxgchen (Collaborator)

Any unit tests?

@yanchengnv (Collaborator) left a comment

Please remove all secure-related changes. Even if secure is needed in the future (very unlikely), the way you did it may not be the best way.

@yanchengnv (Collaborator) left a comment

LGTM

@YuanTingHsieh (Collaborator, Author)

/build

@YuanTingHsieh (Collaborator, Author)

Will add unit tests, integration tests, and an example in the next PR.

@YuanTingHsieh merged commit 33360d9 into NVIDIA:2.4 on Mar 7, 2024
16 checks passed
@YuanTingHsieh deleted the add_xgboost_histogram_v2 branch on March 7, 2024 00:51