Optimize K-Truss#4375
Conversation
…6_optimize-k-truss
ChuckHastings
left a comment
There was a problem hiding this comment.
So much simpler. A few cleanup comments, otherwise I think it's looking pretty good.
|
|
||
| size_t edges_to_intersect_per_iteration = | ||
| static_cast<size_t>(handle.get_device_properties().multiProcessorCount) * (1 << 17); | ||
| static_cast<size_t>(handle.get_device_properties().multiProcessorCount) * (1 << 13); |
There was a problem hiding this comment.
Is there a reason for 13? Should this be some sort of named constant so it's easier to change?
There was a problem hiding this comment.
It is a good idea to expose it in order to easily control the chunk size. I temporarily set a lower value so that I can process a larger graph when running at scale
| @@ -0,0 +1,41 @@ | |||
| /* | |||
| * Copyright (c) 2023-2024, NVIDIA CORPORATION. | |||
| @@ -0,0 +1,41 @@ | |||
| /* | |||
| * Copyright (c) 2023-2024, NVIDIA CORPORATION. | |||
cpp/src/community/k_truss_impl.cuh
Outdated
|
|
||
| std::tie(edgelist_srcs, edgelist_dsts, edgelist_wgts, num_triangles, std::ignore) = | ||
| decompress_to_edgelist( | ||
| std::chrono::seconds s (0); // 1 second |
There was a problem hiding this comment.
Two thoughts...
- Have you looked at https://github.com/rapidsai/cugraph/blob/branch-24.08/cpp/include/cugraph/utilities/high_res_timer.hpp which I think you could just include and use rather than reinventing. There are features to manage multiple timers inside that as well
- Probably should remove these before we merge the PR
There was a problem hiding this comment.
I only wanted to time the triangle count part and I got a compiler error when copying the high_res_timer timing code from our tests so I didn't spend much time understanding what was wrong. But I will look into it again.
Probably should remove these before we merge the PR
I only have them for benchmarking on Draco. I will remove them in my next commit today along with all the unused functors
| struct KTruss_Usecase { | ||
| int32_t k_{3}; | ||
| bool test_weighted_{false}; | ||
| // FIXME: test edge mask |
There was a problem hiding this comment.
Any reason we are skipping this? (edge mask)
There was a problem hiding this comment.
Oh I am not. I just forgot to remove this outdated comment
| mg_graph_view.vertex_partition_range_lasts()); | ||
|
|
||
| auto global_d_cugraph_srcs = cugraph::test::device_gatherv( | ||
| *handle_, raft::device_span<vertex_t const>(d_cugraph_srcs.data(), d_cugraph_srcs.size())); |
There was a problem hiding this comment.
Sorry for additional nitpicking, but it might be better to follow naming convention of K-core tests (https://github.com/rapidsai/cugraph/blob/branch-24.08/cpp/tests/cores/mg_k_core_test.cpp#L136). Later, if this pattern occurs frequent enough, we can create another utility funciton.
|
|
||
| if constexpr (multi_gpu) { | ||
| std::tie(srcs, dsts, std::ignore, std::ignore, std::ignore) = | ||
| std::tie(srcs, dsts, std::ignore, std::ignore, std::ignore, std::ignore) = |
There was a problem hiding this comment.
You can defer this to the next MG optimization PR, but you can use masking here instead of creating a new graph for excluding self-loops (see the example here. https://github.com/rapidsai/cugraph/blob/branch-24.08/cpp/src/community/triangle_count_impl.cuh#L351).
cpp/src/community/k_truss_impl.cuh
Outdated
| *vertex_partition_range_lasts); | ||
| } | ||
| renumber_map = std::move(tmp_renumber_map); | ||
|
|
There was a problem hiding this comment.
You can use masking instead of creating a graph except for the final DODG graph creation (this DODG graph will be used again & again so better create a new graph than using masking as masking has its own overhead).
| while (true) { | ||
|
|
||
| auto edge_triangle_counts = | ||
| edge_triangle_count<vertex_t, edge_t, multi_gpu>(handle, cur_graph_view); |
There was a problem hiding this comment.
Add a FIXME statement here.
This approach will miserably fail if we need to go through many iterations and only a small number of edges are invalidated in each iteration. No need to address this if this doesn't happen for any practical graphs but if we find one, we can work on optimization focusing on that graph.
…6_optimize-k-truss
d96081c to
3743208
Compare
|
/merge |
This PR leverages our existing edge triangle count to implement both SG and MG K-Truss as our initial version.
This PR also:
rx countin several as our shuffling functionsCloses #4500