Network Resources #69

Merged
51 commits merged into from
Jul 2, 2022

Conversation

Contributor

@ryanolson ryanolson commented Jun 25, 2022

This PR will:

  • drop requirement on ryanolson/libcudacxx and restore use of the latest release
  • simplify the memory_resource object used to the bare minimum until rapidsai/rmm and libcudacxx have a generalized memory_resource
  • construct ucx resources needed for data_plane services
  • construct host memory resources
  • construct device memory resources

resources::Manager is the component that constructs and destructs resources in the proper sequence. All runnable resources will be completed and joined when resources::Manager is destructed. The only objects allowed to outlive the resources::Manager are data/memory objects, because they are ultimately backed by malloc/cudaMallocHost/cudaMalloc, which remain available for the life of the application. Data created by these resources within the pipeline and transferred back to the user application should be able to outlive the Executor/resources::Manager.
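To illustrate the lifetime rule for data/memory objects, here is a minimal sketch. It is not the SRF implementation: `HostBuffer`, `HostMemoryResource`, `MiniManager`, and `make_buffer_after_manager_destroyed` are hypothetical names invented for this example, and it uses plain `malloc`/`free` so it runs without CUDA. The idea is only that a buffer which owns its allocation directly stays valid after the manager that produced it is gone.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <new>

// Releases memory obtained from std::malloc.
struct FreeDeleter
{
    void operator()(void* ptr) const noexcept
    {
        std::free(ptr);
    }
};

// Hypothetical buffer type: it owns its malloc'd block directly, so its
// lifetime does not depend on whoever created it.
class HostBuffer
{
  public:
    explicit HostBuffer(std::size_t bytes) : m_data(std::malloc(bytes)), m_bytes(bytes)
    {
        if (m_data == nullptr)
        {
            throw std::bad_alloc();
        }
    }

    void* data()
    {
        return m_data.get();
    }

    std::size_t bytes() const
    {
        return m_bytes;
    }

  private:
    std::unique_ptr<void, FreeDeleter> m_data;
    std::size_t m_bytes;
};

// Hypothetical host memory resource: a thin factory for HostBuffer.
class HostMemoryResource
{
  public:
    HostBuffer allocate(std::size_t bytes)
    {
        return HostBuffer(bytes);
    }
};

// Hypothetical manager that owns the memory resource.
class MiniManager
{
  public:
    HostMemoryResource& host()
    {
        return m_host;
    }

  private:
    HostMemoryResource m_host;
};

// Allocates through the manager, destroys the manager, then returns the buffer.
HostBuffer make_buffer_after_manager_destroyed(std::size_t bytes)
{
    auto manager      = std::make_unique<MiniManager>();
    HostBuffer buffer = manager->host().allocate(bytes);
    std::memset(buffer.data(), 0xAB, buffer.bytes());

    manager.reset();  // the manager and its resources are gone from here on

    // The buffer is still usable because free() does not need the manager.
    assert(static_cast<unsigned char*>(buffer.data())[0] == 0xAB);
    return buffer;
}
```

Calling `make_buffer_after_manager_destroyed(16)` from a test or `main` returns a 16-byte buffer that remains readable and is freed when it goes out of scope. A device-side analogue would pair `cudaMalloc` with `cudaFree` in the deleter; that variant is left out here so the sketch has no CUDA dependency.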

See the [resources::Manager definition](/src/internal/resources/manager.hpp) and follow the destruction sequence, i.e. the reverse order of the member variables as declared in the header.
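For readers less familiar with the language rule this relies on: C++ constructs non-static data members in declaration order and destroys them in the reverse order. The sketch below demonstrates that rule in isolation. `Tracked`, `OrderedManager`, `lifecycle_log`, and the member names (`system`, `ucx`, `host`, `device`) are hypothetical and chosen only for illustration; the actual members and their order are whatever manager.hpp declares.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a resource: records when it is built and torn down.
class Tracked
{
  public:
    Tracked(std::string name, std::vector<std::string>& log) : m_name(std::move(name)), m_log(log)
    {
        m_log.push_back("construct " + m_name);
    }

    ~Tracked()
    {
        m_log.push_back("destroy " + m_name);
    }

    Tracked(const Tracked&)            = delete;
    Tracked& operator=(const Tracked&) = delete;

  private:
    std::string m_name;
    std::vector<std::string>& m_log;
};

// Hypothetical manager: members are declared so that later members may depend
// on earlier ones. The member order here is an assumption, not SRF's layout.
class OrderedManager
{
  public:
    explicit OrderedManager(std::vector<std::string>& log) :
      m_system("system", log),
      m_ucx("ucx", log),
      m_host("host", log),
      m_device("device", log)
    {}

  private:
    Tracked m_system;  // constructed first, destroyed last
    Tracked m_ucx;
    Tracked m_host;
    Tracked m_device;  // constructed last, destroyed first
};

// Builds and destroys one manager, returning the recorded event sequence.
std::vector<std::string> lifecycle_log()
{
    std::vector<std::string> log;
    {
        OrderedManager manager(log);
    }  // manager leaves scope: members are destroyed in reverse declaration order

    assert(log.size() == 8);
    return log;
}
```

The returned log lists the four constructions in declaration order followed by the four destructions in the opposite order, which is the sequence to trace when reading the header.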

Completion of the PR should:

@ryanolson ryanolson requested a review from a team as a code owner June 25, 2022 06:40
@ryanolson ryanolson self-assigned this Jun 25, 2022
@ryanolson ryanolson changed the title from "UCX Resources" to "Network Resources" Jun 27, 2022
@ryanolson ryanolson added the improvement (Improvement to existing functionality), non-breaking (Non-breaking change), and improvement labels and removed the improvement (Improvement to existing functionality) label Jun 27, 2022
@ryanolson ryanolson marked this pull request as ready for review June 27, 2022 17:47
@ryanolson ryanolson mentioned this pull request Jun 28, 2022
@jarmak-nv jarmak-nv added the blocker (Blocks another issue/PR) label Jun 29, 2022
@ryanolson ryanolson added this to the Multi-Node Support milestone Jun 29, 2022
Contributor

@drobison00 drobison00 left a comment

Mostly comments/questions and requests for additional unit tests.

include/srf/memory/buffer.hpp (resolved)
include/srf/memory/buffer_view.hpp (resolved)
src/internal/system/partition_provider.hpp (resolved)
src/internal/system/partitions.hpp (outdated, resolved)
src/internal/ucx/memory_block.hpp (resolved)
src/internal/ucx/registration_cache.hpp (resolved)
src/internal/ucx/registration_resource.hpp (resolved)
@ryanolson ryanolson requested a review from drobison00 July 1, 2022 04:09
@ryanolson
Contributor Author

@drobison00 - marked all the "add top-level descriptions" comments as resolved as I completed them.

Added #109 to add test coverage. It will be easier for me to see what needs more coverage when #105 is merged.

@ryanolson
Contributor Author

@gpucibot merge

@ghost ghost merged commit b576981 into nv-morpheus:branch-22.08 Jul 2, 2022
@ryanolson ryanolson deleted the network_resources branch July 2, 2022 03:26
This pull request was closed.
Labels
blocker (Blocks another issue/PR), non-breaking (Non-breaking change)
Projects
No open projects
Status: Done
Development

Successfully merging this pull request may close these issues.

[FEA] Enable UCX Context/Workers and UCX Registered Memory Resources
[FEA] Re-Enable tests/test_memory.cpp
3 participants