Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Controller Refactor Part 1: separate communication #2390

Merged
merged 3 commits into from
Mar 8, 2024

Conversation

SYangster
Copy link
Collaborator

@SYangster SYangster commented Mar 7, 2024

  • separate out communication from controller with composition
  • WFCommServer (closely coupled with Responder pattern)
  • WFCommClient (uses aux channel -> will require new executor in follow up PR to support converting existing server-controlled workflows into client-controlled workflows)

Update: Responder: handle_dead_job() and ControllerSpec: process_result_of_unknown_task() are implemented by some Controller subclasses, but are needed by the WFCommServer. For now I passed the Controller to the WFCommServer, but maybe need a cleaner solution.

Types of changes

  • Non-breaking change (fix or new feature that would not break existing functionality).
  • Breaking change (fix or new feature that would cause existing functionality to change).
  • New tests added to cover the changes.
  • Quick tests passed locally by running ./runtest.sh.
  • In-line docstrings updated.
  • Documentation updated.

@SYangster
Copy link
Collaborator Author

/build

@SYangster SYangster marked this pull request as ready for review March 8, 2024 01:29
Copy link
Collaborator

@YuanTingHsieh YuanTingHsieh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly LGTM, added some comments.

nvflare/apis/impl/controller.py Show resolved Hide resolved
nvflare/apis/event_type.py Outdated Show resolved Hide resolved
nvflare/apis/impl/wf_comm_server.py Outdated Show resolved Hide resolved
nvflare/apis/impl/wf_comm_server.py Outdated Show resolved Hide resolved
nvflare/private/fed/server/server_json_config.py Outdated Show resolved Hide resolved
nvflare/private/fed/server/server_runner.py Outdated Show resolved Hide resolved
nvflare/apis/impl/controller.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@yanchengnv yanchengnv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very impressive! I like it that all existing controllers don't need changes (other than change some imports).
We need to make the WFCommSpec more complete (see my comments). I understand that most of these are only specific to the server side (for now), but the concept is good and may still be applicable to other implementations. The default impl of these methods can raise exception NOT_IMPLEMENTED.

nvflare/apis/event_type.py Outdated Show resolved Hide resolved
nvflare/apis/impl/wf_comm_server.py Outdated Show resolved Hide resolved
nvflare/apis/wf_comm_spec.py Outdated Show resolved Hide resolved
nvflare/private/fed/server/server_runner.py Show resolved Hide resolved
nvflare/private/fed/server/server_runner.py Show resolved Hide resolved
nvflare/private/fed/server/server_runner.py Show resolved Hide resolved
nvflare/private/fed/server/server_runner.py Show resolved Hide resolved
tests/unit_test/apis/impl/controller_test.py Show resolved Hide resolved
tests/unit_test/apis/impl/controller_test.py Outdated Show resolved Hide resolved
@SYangster
Copy link
Collaborator Author

/build

Copy link
Collaborator

@yanchengnv yanchengnv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@SYangster SYangster merged commit 4f30f68 into NVIDIA:main Mar 8, 2024
16 checks passed
@SYangster SYangster mentioned this pull request Apr 25, 2024
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants