MEMHEAP/BASE: removed hard segments limit #10478
hoopoepg
commented
Jun 17, 2022
- enabled dynamic segment's info allocation
- removed hard limit for segments count
9b5d882 to ad59721
f0200bc to 8309ddf
    return NULL;
}

if (map->n_segments >= map->capacity) {
this conflicts with the assert above (line 109)
No: when capacity == 0, n_segments must be 0 too; in all other cases n_segments < capacity must hold. The assert is not strictly valid, but it covers cases where memory corruption has happened.
Why not check for 0 here instead of comparing n_segments and capacity?
Currently it looks quite confusing.
ok, fixed
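The check discussed in this thread can be sketched as follows. This is a minimal illustration and not the code from this PR: `seg_t`, `seg_map_t`, `seg_map_reserve`, and the starting capacity of 4 are hypothetical, and plain `realloc` stands in for whatever allocator the real code uses. The point is that a single `n_segments == capacity` test covers both the empty map (`capacity == 0`) and a full one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical types for illustration; not the OSHMEM definitions. */
typedef struct {
    void *start;
    void *end;
} seg_t;

typedef struct {
    seg_t *segs;     /* dynamically allocated array of segment records */
    int n_segments;  /* slots in use */
    int capacity;    /* slots allocated */
} seg_map_t;

/* Return a pointer to the next free slot, growing the array if needed.
 * Returns NULL if memory cannot be allocated; the map is left unchanged. */
static seg_t *seg_map_reserve(seg_map_t *map)
{
    /* Invariant: n_segments never exceeds capacity. A violation would
     * indicate memory corruption rather than a normal "full" condition. */
    assert(map->n_segments <= map->capacity);

    if (map->n_segments == map->capacity) {
        /* Covers capacity == 0 (empty map) as well as a full map. */
        int new_cap = (map->capacity == 0) ? 4 : map->capacity * 2;
        seg_t *p = realloc(map->segs, (size_t)new_cap * sizeof(*p));
        if (p == NULL) {
            return NULL;
        }
        map->segs = p;
        map->capacity = new_cap;
    }
    return &map->segs[map->n_segments++];
}
```

A caller would start from a zero-initialized map (`seg_map_t map = {NULL, 0, 0};`) and call `seg_map_reserve(&map)` once per new segment.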
    MEMHEAP_ERROR("failed to allocate segment");
    return OSHMEM_ERR_OUT_OF_RESOURCE;
Please clean up the already allocated ones.
ok, done
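The cleanup requested here follows a common pattern: if one allocation in a series fails, release everything allocated so far before returning the error. The sketch below assumes nothing about the actual OSHMEM code; `alloc_all_or_nothing`, the allocator hooks, and the test doubles are all hypothetical names introduced so the failure path can be exercised.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator hooks so the failure path can be exercised. */
typedef void *(*alloc_fn)(size_t size);
typedef void (*free_fn)(void *ptr);

/* Allocate `count` buffers of `size` bytes into `out`.
 * On failure, release every buffer allocated so far and return -1,
 * so the caller never sees a partially filled array. */
static int alloc_all_or_nothing(void **out, int count, size_t size,
                                alloc_fn do_alloc, free_fn do_free)
{
    assert(out != NULL && count >= 0);

    for (int i = 0; i < count; i++) {
        out[i] = do_alloc(size);
        if (out[i] == NULL) {
            /* Roll back in reverse order of allocation. */
            while (--i >= 0) {
                do_free(out[i]);
                out[i] = NULL;
            }
            return -1;
        }
    }
    return 0;
}

/* Test doubles: the allocator fails on its third call, the free hook counts. */
static int g_allocs = 0;
static int g_frees = 0;

static void *failing_alloc(size_t size)
{
    return (++g_allocs == 3) ? NULL : malloc(size);
}

static void counting_free(void *ptr)
{
    g_frees++;
    free(ptr);
}
```

With `failing_alloc` and `counting_free` passed in, a request for four buffers fails on the third allocation and the two earlier buffers are freed before the error is returned.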
@yosefe ok to squash?
4aa50cb to 87fd4b4
if (map->n_segments == map->capacity) {
    capacity = opal_min(mca_memheap_base_max_segments,
                        opal_max(map->capacity * 2,
                                 MCA_MEMHEAP_MAX_SEGMENTS));
I think MCA_MEMHEAP_MAX_SEGMENTS needs to be renamed to MCA_MEMHEAP_MIN_SEGMENTS,
since this is actually the initial/minimal value of the number of segments.
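The growth expression under review has the shape min(limit, max(2 * capacity, initial)), which is why the constant in the inner `max` behaves as a minimum rather than a maximum. A small standalone sketch makes this visible; `next_capacity` and its parameter names are hypothetical and do not appear in the PR.

```c
#include <assert.h>

/* Hypothetical helper mirroring min(limit, max(current * 2, initial)):
 * the result is at least `initial`, doubles on each growth step,
 * and never exceeds `limit`. Overflow of current * 2 is not handled
 * in this sketch. */
static int next_capacity(int current, int initial, int limit)
{
    int grown = current * 2;

    if (grown < initial) {
        grown = initial;  /* first allocation starts at the initial size */
    }
    if (grown > limit) {
        grown = limit;    /* clamp to the configured upper bound */
    }
    return grown;
}
```

For example, with `initial = 32` and `limit = 1024`, an empty array grows to 32, then 64, and so on until it is clamped at 1024. A return value equal to `current` means the array is already at the limit.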
opal_list_t mca_memheap_base_components_opened = {{0}};
int mca_memheap_base_already_opened = 0;
mca_memheap_map_t mca_memheap_base_map = {{{{0}}}};
int mca_memheap_base_max_segments = MCA_MEMHEAP_MAX_SEGMENTS;
Why do we need to limit it to 32 by default?
It is used as the allowed number of segments to allocate; when the limit is reached, a warning is shown.
Let's rename it (and the respective configuration) to something like mca_memheap_num_segments_warn,
since it's no longer a hard limit, just a threshold for showing a warning.
done, macro is removed, variable is renamed
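A soft threshold of the kind agreed on in this thread can be sketched as below. Everything here is an assumption made for illustration: `seg_warn_t` and `seg_warn_check` are hypothetical names, and printing the warning only once is a design choice of this sketch, not something the thread specifies.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical soft limit: crossing it produces a warning, not an error. */
typedef struct {
    int warn_threshold;  /* configured number of segments before warning */
    int warned;          /* nonzero once the warning has been printed */
} seg_warn_t;

/* Call after a segment has been added. Returns 1 if a warning was printed.
 * Assumption of this sketch: the warning is emitted only the first time
 * the count exceeds the threshold. */
static int seg_warn_check(seg_warn_t *w, int n_segments)
{
    if (n_segments > w->warn_threshold && !w->warned) {
        fprintf(stderr, "warning: %d segments exceeds threshold %d\n",
                n_segments, w->warn_threshold);
        w->warned = 1;
        return 1;
    }
    return 0;
}
```

Allocation continues regardless of the return value; the threshold only controls when the user is told that the segment count is unusually high.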
                          &mca_memheap_base_config.device_nic_mem_seg_size);

mca_base_var_register("oshmem", "memheap", "base", "max_segments",
                      "Maximum number of segments to register per process",
"Maximum number of shared data segments per process",
updated
    void* start;
    void* end;
} mem_segs[MCA_MEMHEAP_MAX_SEGMENTS];
memheap_static_segment_t *mem_segs;
Maybe just allocate mca_memheap_base_max_segments up front?
This struct is small enough anyway.
removed static segments allocation
@yosefe ok to squash?
@yosefe ???
0a8c5e3 to 6dc1101
mca_base_var_register("oshmem", "memheap", "base", "max_segments",
                      "Maximum number of shared data segments per process",
                      "Display a warning if the number of segments of "
                      "shared heap exceeds this value",
minor:
"Display a warning if the number of segments of the shared memheap exceeds this value"
re-worded
@hoopoepg pls squash
- enabled dynamic segment's info allocation
- removed hard limit for segments count
- fixed error handling on not enough memory
- refactoring for static segments initialization: removed middle segment allocation
- fixed potential issue in spml_ucx: incorrect error handling which may lead to crash
Signed-off-by: Sergey Oblomov <[email protected]>
b516a4d to 1b26bbd