Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite Point data structures as generic SoAs #74

Merged
merged 5 commits into from
Jan 14, 2025

Conversation

sbaldu
Copy link
Collaborator

@sbaldu sbaldu commented Jan 10, 2025

No description provided.

self.ppbin, data, self.clust_data.weight,
self.kernel, self.clust_data.n_dim, block_size,
device_id)
cluster_id_is_seed = gpu_hip.mainRun(self.dc_, self.rhoc, self.dm, self.ppbin,

Check warning

Code scanning / Prospector (reported by Codacy)

local variable 'cluster_id_is_seed' is assigned to but never used (F841) Warning

local variable 'cluster_id_is_seed' is assigned to but never used (F841)
alpaka::memcpy(
queue_,
clue::make_host_view(h_points.clusterIndexes(), 2 * nPoints),
clue::make_device_view(device, d_points.view()->cluster_index, 2 * nPoints),

Check warning

Code scanning / Flawfinder (reported by Codacy)

Does not check for buffer overflows when copying to destination (CWE-120). Make sure destination can always hold the source data. Warning

Does not check for buffer overflows when copying to destination (CWE-120). Make sure destination can always hold the source data.
PointsSoA<2> points(floatBuffer.data(), intBuffer.data(), PointShape<2>{n});

SUBCASE("Check that the content of the SoA is the same as the buffer") {
CHECK(std::equal(floatBuffer.begin(), floatBuffer.end(), points.coords()));

Check warning

Code scanning / Flawfinder (reported by Codacy)

Function does not check the second iterator for over-read conditions (CWE-126). This function is often discouraged by most C++ coding standards in favor of its safer alternatives provided since C++14. Consider using a form of this function that checks the second iterator before potentially overflowing it. Warning test

Function does not check the second iterator for over-read conditions (CWE-126). This function is often discouraged by most C++ coding standards in favor of its safer alternatives provided since C++14. Consider using a form of this function that checks the second iterator before potentially overflowing it.

SUBCASE("Check that the content of the SoA is the same as the buffer") {
CHECK(std::equal(floatBuffer.begin(), floatBuffer.end(), points.coords()));
CHECK(std::equal(intBuffer.begin(), intBuffer.end(), points.clusterIndexes()));

Check warning

Code scanning / Flawfinder (reported by Codacy)

Function does not check the second iterator for over-read conditions (CWE-126). This function is often discouraged by most C++ coding standards in favor of its safer alternatives provided since C++14. Consider using a form of this function that checks the second iterator before potentially overflowing it. Warning test

Function does not check the second iterator for over-read conditions (CWE-126). This function is often discouraged by most C++ coding standards in favor of its safer alternatives provided since C++14. Consider using a form of this function that checks the second iterator before potentially overflowing it.
});

// check content
CHECK(std::equal(floatBuffer.begin(), floatBuffer.end(), points.coords()));

Check warning

Code scanning / Flawfinder (reported by Codacy)

Function does not check the second iterator for over-read conditions (CWE-126). This function is often discouraged by most C++ coding standards in favor of its safer alternatives provided since C++14. Consider using a form of this function that checks the second iterator before potentially overflowing it. Warning test

Function does not check the second iterator for over-read conditions (CWE-126). This function is often discouraged by most C++ coding standards in favor of its safer alternatives provided since C++14. Consider using a form of this function that checks the second iterator before potentially overflowing it.

// check content
CHECK(std::equal(floatBuffer.begin(), floatBuffer.end(), points.coords()));
CHECK(std::equal(floatBuffer.begin(), floatBuffer.end(), view->coords));

Check warning

Code scanning / Flawfinder (reported by Codacy)

Function does not check the second iterator for over-read conditions (CWE-126). This function is often discouraged by most C++ coding standards in favor of its safer alternatives provided since C++14. Consider using a form of this function that checks the second iterator before potentially overflowing it. Warning test

Function does not check the second iterator for over-read conditions (CWE-126). This function is often discouraged by most C++ coding standards in favor of its safer alternatives provided since C++14. Consider using a form of this function that checks the second iterator before potentially overflowing it.
// check content
CHECK(std::equal(floatBuffer.begin(), floatBuffer.end(), points.coords()));
CHECK(std::equal(floatBuffer.begin(), floatBuffer.end(), view->coords));
CHECK(std::equal(intBuffer.begin(), intBuffer.end(), points.clusterIndexes()));

Check warning

Code scanning / Flawfinder (reported by Codacy)

Function does not check the second iterator for over-read conditions (CWE-126). This function is often discouraged by most C++ coding standards in favor of its safer alternatives provided since C++14. Consider using a form of this function that checks the second iterator before potentially overflowing it. Warning test

Function does not check the second iterator for over-read conditions (CWE-126). This function is often discouraged by most C++ coding standards in favor of its safer alternatives provided since C++14. Consider using a form of this function that checks the second iterator before potentially overflowing it.
CHECK(std::equal(floatBuffer.begin(), floatBuffer.end(), points.coords()));
CHECK(std::equal(floatBuffer.begin(), floatBuffer.end(), view->coords));
CHECK(std::equal(intBuffer.begin(), intBuffer.end(), points.clusterIndexes()));
CHECK(std::equal(intBuffer.begin(), intBuffer.end(), view->clusterIndexes));

Check warning

Code scanning / Flawfinder (reported by Codacy)

Function does not check the second iterator for over-read conditions (CWE-126). This function is often discouraged by most C++ coding standards in favor of its safer alternatives provided since C++14. Consider using a form of this function that checks the second iterator before potentially overflowing it. Warning test

Function does not check the second iterator for over-read conditions (CWE-126). This function is often discouraged by most C++ coding standards in favor of its safer alternatives provided since C++14. Consider using a form of this function that checks the second iterator before potentially overflowing it.
Copy link

@github-advanced-security github-advanced-security bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cppcheck (reported by Codacy) found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

results = np.vstack([np.zeros(npoints, dtype=np.int32), # cluster ids
np.zeros(npoints, dtype=np.int32)], # is_seed
dtype=np.int32)
coords = np.ascontiguousarray(coords, dtype=np.float32) # if already contiguous this is a no-op

Check warning

Code scanning / Pylintpython3 (reported by Codacy)

Line too long (108/100) Warning

Line too long (108/100)
results = np.vstack([np.zeros(npoints, dtype=np.int32), # cluster ids
np.zeros(npoints, dtype=np.int32)], # is_seed
dtype=np.int32)
coords = np.ascontiguousarray(coords, dtype=np.float32) # if already contiguous this is a no-op

Check warning

Code scanning / Pylintpython3 (reported by Codacy)

Line too long (104/100) Warning

Line too long (104/100)
results = np.vstack([np.zeros(npoints, dtype=np.int32), # cluster ids
np.zeros(npoints, dtype=np.int32)], # is_seed
dtype=np.int32)
coords = np.ascontiguousarray(coords, dtype=np.float32) # if already contiguous this is a no-op

Check warning

Code scanning / Pylintpython3 (reported by Codacy)

Line too long (108/100) Warning

Line too long (108/100)
ndim = len(input_data[:-1])
coords = np.vstack([input_data[:-1], # coordinates SoA
input_data[-1]], # weights
dtype=np.float32)

Check warning

Code scanning / Pylint (reported by Codacy)

Wrong continued indentation (remove 1 space). Warning

Wrong continued indentation (remove 1 space).
dtype=np.float32)
results = np.vstack([np.zeros(npoints, dtype=np.int32), # cluster ids
np.zeros(npoints, dtype=np.int32)], # is_seed
dtype=np.int32)

Check warning

Code scanning / Pylint (reported by Codacy)

Wrong continued indentation (remove 1 space). Warning

Wrong continued indentation (remove 1 space).
results = np.vstack([np.zeros(npoints, dtype=np.int32), # cluster ids
np.zeros(npoints, dtype=np.int32)], # is_seed
dtype=np.int32)
coords = np.ascontiguousarray(coords, dtype=np.float32) # if already contiguous this is a no-op

Check warning

Code scanning / Pylint (reported by Codacy)

Line too long (104/100) Warning

Line too long (104/100)
ndim = input_data[0][0]
coords = np.vstack([input_data[:-1].T, # coordinates SoA
input_data[-1]], # weights
dtype=np.float32)

Check warning

Code scanning / Pylint (reported by Codacy)

Wrong continued indentation (remove 1 space). Warning

Wrong continued indentation (remove 1 space).
dtype=np.float32)
results = np.vstack([np.zeros(npoints, dtype=np.int32), # cluster ids
np.zeros(npoints, dtype=np.int32)], # is_seed
dtype=np.int32)

Check warning

Code scanning / Pylint (reported by Codacy)

Wrong continued indentation (remove 1 space). Warning

Wrong continued indentation (remove 1 space).
assert len(coords_x0[i]) == 1
assert len(coords_x1[i]) == 1
assert len(c.clust_data.coords[0]) == 999
coordsoa_x0 = c._partial_dimension_dataset([0])

Check warning

Code scanning / Prospector (reported by Codacy)

Access to a protected member _partial_dimension_dataset of a client class (protected-access) Warning test

Access to a protected member _partial_dimension_dataset of a client class (protected-access)
assert len(coords_x1[i]) == 1
assert len(c.clust_data.coords[0]) == 999
coordsoa_x0 = c._partial_dimension_dataset([0])
coordsoa_x1 = c._partial_dimension_dataset([1])

Check warning

Code scanning / Prospector (reported by Codacy)

Access to a protected member _partial_dimension_dataset of a client class (protected-access) Warning test

Access to a protected member _partial_dimension_dataset of a client class (protected-access)
assert len(coords_x1[i]) == 1
assert len(coords_x2[i]) == 1
assert len(c.clust_data.coords[0]) == 10000
coordsoa_x0 = c._partial_dimension_dataset([0])

Check warning

Code scanning / Prospector (reported by Codacy)

Access to a protected member _partial_dimension_dataset of a client class (protected-access) Warning test

Access to a protected member _partial_dimension_dataset of a client class (protected-access)
assert len(coords_x2[i]) == 1
assert len(c.clust_data.coords[0]) == 10000
coordsoa_x0 = c._partial_dimension_dataset([0])
coordsoa_x1 = c._partial_dimension_dataset([1])

Check warning

Code scanning / Prospector (reported by Codacy)

Access to a protected member _partial_dimension_dataset of a client class (protected-access) Warning test

Access to a protected member _partial_dimension_dataset of a client class (protected-access)
assert len(c.clust_data.coords[0]) == 10000
coordsoa_x0 = c._partial_dimension_dataset([0])
coordsoa_x1 = c._partial_dimension_dataset([1])
coordsoa_x2 = c._partial_dimension_dataset([2])

Check warning

Code scanning / Prospector (reported by Codacy)

Access to a protected member _partial_dimension_dataset of a client class (protected-access) Warning test

Access to a protected member _partial_dimension_dataset of a client class (protected-access)
assert len(coords_x0x2) == c.clust_data.n_points
assert len(coords_x1x2) == c.clust_data.n_points
assert len(c.clust_data.coords[0]) == 10000
coordsoa_x0x1 = c._partial_dimension_dataset([0, 1])

Check warning

Code scanning / Prospector (reported by Codacy)

Access to a protected member _partial_dimension_dataset of a client class (protected-access) Warning test

Access to a protected member _partial_dimension_dataset of a client class (protected-access)
assert len(coords_x1x2) == c.clust_data.n_points
assert len(c.clust_data.coords[0]) == 10000
coordsoa_x0x1 = c._partial_dimension_dataset([0, 1])
coordsoa_x0x2 = c._partial_dimension_dataset([0, 2])

Check warning

Code scanning / Prospector (reported by Codacy)

Access to a protected member _partial_dimension_dataset of a client class (protected-access) Warning test

Access to a protected member _partial_dimension_dataset of a client class (protected-access)
assert len(c.clust_data.coords[0]) == 10000
coordsoa_x0x1 = c._partial_dimension_dataset([0, 1])
coordsoa_x0x2 = c._partial_dimension_dataset([0, 2])
coordsoa_x1x2 = c._partial_dimension_dataset([1, 2])

Check warning

Code scanning / Prospector (reported by Codacy)

Access to a protected member _partial_dimension_dataset of a client class (protected-access) Warning test

Access to a protected member _partial_dimension_dataset of a client class (protected-access)
c2 = clue.clusterer(1., 2., 1.6)
c2.read_data(box)
c2.run_clue(dimensions=[0, 1])

# check that the result of clustering the 3D dataset using only
# two dimensions is the same as clustering the 2D dataset
assert c1.clust_prop == c2.clust_prop

if __name__ == "__main__":

Check warning

Code scanning / Prospector (reported by Codacy)

expected 2 blank lines after class or function definition, found 1 (E305) Warning test

expected 2 blank lines after class or function definition, found 1 (E305)
Copy link

@github-advanced-security github-advanced-security bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bandit (reported by Codacy) found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

assert len(coords_x0[i]) == 1
assert len(coords_x1[i]) == 1
assert len(c.clust_data.coords[0]) == 999
coordsoa_x0 = c._partial_dimension_dataset([0])

Check notice

Code scanning / Pylintpython3 (reported by Codacy)

Access to a protected member _partial_dimension_dataset of a client class Note test

Access to a protected member _partial_dimension_dataset of a client class
assert len(coords_x1[i]) == 1
assert len(c.clust_data.coords[0]) == 999
coordsoa_x0 = c._partial_dimension_dataset([0])
coordsoa_x1 = c._partial_dimension_dataset([1])

Check notice

Code scanning / Pylintpython3 (reported by Codacy)

Access to a protected member _partial_dimension_dataset of a client class Note test

Access to a protected member _partial_dimension_dataset of a client class
assert len(coords_x1[i]) == 1
assert len(coords_x2[i]) == 1
assert len(c.clust_data.coords[0]) == 10000
coordsoa_x0 = c._partial_dimension_dataset([0])

Check notice

Code scanning / Pylintpython3 (reported by Codacy)

Access to a protected member _partial_dimension_dataset of a client class Note test

Access to a protected member _partial_dimension_dataset of a client class
assert len(coords_x2[i]) == 1
assert len(c.clust_data.coords[0]) == 10000
coordsoa_x0 = c._partial_dimension_dataset([0])
coordsoa_x1 = c._partial_dimension_dataset([1])

Check notice

Code scanning / Pylintpython3 (reported by Codacy)

Access to a protected member _partial_dimension_dataset of a client class Note test

Access to a protected member _partial_dimension_dataset of a client class
assert len(c.clust_data.coords[0]) == 10000
coordsoa_x0 = c._partial_dimension_dataset([0])
coordsoa_x1 = c._partial_dimension_dataset([1])
coordsoa_x2 = c._partial_dimension_dataset([2])

Check notice

Code scanning / Pylintpython3 (reported by Codacy)

Access to a protected member _partial_dimension_dataset of a client class Note test

Access to a protected member _partial_dimension_dataset of a client class
assert len(coords_x0x2) == c.clust_data.n_points
assert len(coords_x1x2) == c.clust_data.n_points
assert len(c.clust_data.coords[0]) == 10000
coordsoa_x0x1 = c._partial_dimension_dataset([0, 1])

Check notice

Code scanning / Pylintpython3 (reported by Codacy)

Access to a protected member _partial_dimension_dataset of a client class Note test

Access to a protected member _partial_dimension_dataset of a client class
assert len(coords_x1x2) == c.clust_data.n_points
assert len(c.clust_data.coords[0]) == 10000
coordsoa_x0x1 = c._partial_dimension_dataset([0, 1])
coordsoa_x0x2 = c._partial_dimension_dataset([0, 2])

Check notice

Code scanning / Pylintpython3 (reported by Codacy)

Access to a protected member _partial_dimension_dataset of a client class Note test

Access to a protected member _partial_dimension_dataset of a client class
assert len(c.clust_data.coords[0]) == 10000
coordsoa_x0x1 = c._partial_dimension_dataset([0, 1])
coordsoa_x0x2 = c._partial_dimension_dataset([0, 2])
coordsoa_x1x2 = c._partial_dimension_dataset([1, 2])

Check notice

Code scanning / Pylintpython3 (reported by Codacy)

Access to a protected member _partial_dimension_dataset of a client class Note test

Access to a protected member _partial_dimension_dataset of a client class
@@ -131,11 +122,17 @@
c1 = clue.clusterer(1., 2., 1.6)
c1.read_data(square)
c1.run_clue()

Check warning

Code scanning / Pylintpython3 (reported by Codacy)

Trailing whitespace Warning test

Trailing whitespace
n_dim : int
n_points : int

def __init__(self, coords = None, results = None, n_dim = None, n_points = None):

Check warning

Code scanning / Pylint (reported by Codacy)

No space allowed around keyword argument assignment Warning

No space allowed around keyword argument assignment
assert len(coords_x0[i]) == 1
assert len(coords_x1[i]) == 1
assert len(c.clust_data.coords[0]) == 999
coordsoa_x0 = c._partial_dimension_dataset([0])

Check notice

Code scanning / Pylint (reported by Codacy)

Access to a protected member _partial_dimension_dataset of a client class Note test

Access to a protected member _partial_dimension_dataset of a client class
assert len(coords_x1[i]) == 1
assert len(c.clust_data.coords[0]) == 999
coordsoa_x0 = c._partial_dimension_dataset([0])
coordsoa_x1 = c._partial_dimension_dataset([1])

Check notice

Code scanning / Pylint (reported by Codacy)

Access to a protected member _partial_dimension_dataset of a client class Note test

Access to a protected member _partial_dimension_dataset of a client class
assert len(coords_x1[i]) == 1
assert len(coords_x2[i]) == 1
assert len(c.clust_data.coords[0]) == 10000
coordsoa_x0 = c._partial_dimension_dataset([0])

Check notice

Code scanning / Pylint (reported by Codacy)

Access to a protected member _partial_dimension_dataset of a client class Note test

Access to a protected member _partial_dimension_dataset of a client class
assert len(coords_x2[i]) == 1
assert len(c.clust_data.coords[0]) == 10000
coordsoa_x0 = c._partial_dimension_dataset([0])
coordsoa_x1 = c._partial_dimension_dataset([1])

Check notice

Code scanning / Pylint (reported by Codacy)

Access to a protected member _partial_dimension_dataset of a client class Note test

Access to a protected member _partial_dimension_dataset of a client class
assert len(coords_x0x2) == c.clust_data.n_points
assert len(coords_x1x2) == c.clust_data.n_points
assert len(c.clust_data.coords[0]) == 10000
coordsoa_x0x1 = c._partial_dimension_dataset([0, 1])

Check notice

Code scanning / Pylint (reported by Codacy)

Access to a protected member _partial_dimension_dataset of a client class Note test

Access to a protected member _partial_dimension_dataset of a client class
assert len(coords_x1x2) == c.clust_data.n_points
assert len(c.clust_data.coords[0]) == 10000
coordsoa_x0x1 = c._partial_dimension_dataset([0, 1])
coordsoa_x0x2 = c._partial_dimension_dataset([0, 2])

Check notice

Code scanning / Pylint (reported by Codacy)

Access to a protected member _partial_dimension_dataset of a client class Note test

Access to a protected member _partial_dimension_dataset of a client class
assert len(c.clust_data.coords[0]) == 10000
coordsoa_x0x1 = c._partial_dimension_dataset([0, 1])
coordsoa_x0x2 = c._partial_dimension_dataset([0, 2])
coordsoa_x1x2 = c._partial_dimension_dataset([1, 2])

Check notice

Code scanning / Pylint (reported by Codacy)

Access to a protected member _partial_dimension_dataset of a client class Note test

Access to a protected member _partial_dimension_dataset of a client class
@@ -131,11 +122,17 @@
c1 = clue.clusterer(1., 2., 1.6)
c1.read_data(square)
c1.run_clue()

Check warning

Code scanning / Pylint (reported by Codacy)

Trailing whitespace Warning test

Trailing whitespace
c2 = clue.clusterer(1., 2., 1.6)
c2.read_data(box)
c2.run_clue(dimensions=[0, 1])

# check that the result of clustering the 3D dataset using only
# two dimensions is the same as clustering the 2D dataset
assert c1.clust_prop == c2.clust_prop

if __name__ == "__main__":
c = clue.clusterer(1., 2., 1.6)

Check warning

Code scanning / Pylint (reported by Codacy)

Constant name "c" doesn't conform to UPPER_CASE naming style Warning test

Constant name "c" doesn't conform to UPPER_CASE naming style
sbaldu and others added 4 commits January 14, 2025 15:37
Update backend code

Update binding modules

Update python API

Add tests for host-side point SoA

Update run_clue for CUDA and HIP

Update benchmarking scripts

Move test folder

Fix device memory access in memcpy
* Add `import_clusterer` method

* Add test for new clusterer import

* Add docstring

* Fix "if main" of test file

* Addition to gitignore
* Small fix in getGlobalBin

* Update version

* Formatting

Fix after merge
Fix partial dimensional clustering

Update package version
@sbaldu sbaldu force-pushed the feature-numpy_coords branch from 00d9725 to 1f934a3 Compare January 14, 2025 14:40
@sbaldu sbaldu merged commit c6531a5 into cms-patatrack:main Jan 14, 2025
11 of 14 checks passed
sbaldu added a commit to sbaldu/CLUEstering that referenced this pull request Jan 15, 2025
* Rework points as SoA

Update backend code

Update binding modules

Update python API

Add tests for host-side point SoA

Update run_clue for CUDA and HIP

Update benchmarking scripts

Move test folder

Fix device memory access in memcpy

* Feature clusterer import method (cms-patatrack#76)

* Add `import_clusterer` method

* Add test for new clusterer import

* Add docstring

* Fix "if main" of test file

* Addition to gitignore

* Small fix in getGlobalBin (cms-patatrack#75)

* Small fix in getGlobalBin

* Update version

* Formatting

Fix after merge

* Separate alpaka input and result buffers

Fix partial dimensional clustering

Update package version
@sbaldu sbaldu deleted the feature-numpy_coords branch February 11, 2025 16:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant