POC: neighbor search #61
Conversation
You've structured the kernel so that every thread computes only a single interaction:

```cuda
const int32_t index = blockIdx.x * blockDim.x + threadIdx.x;
if (index >= num_all_pairs) return;

int32_t row = floor((sqrtf(8 * index + 1) + 1) / 2);
if (row * (row - 1) > 2 * index) row--;
const int32_t column = index - row * (row - 1) / 2;

const scalar_t delta_x = positions[row][0] - positions[column][0];
const scalar_t delta_y = positions[row][1] - positions[column][1];
const scalar_t delta_z = positions[row][2] - positions[column][2];
```

Usually it's better to use a smaller number of thread blocks and have each thread loop over interactions. For one thing, there's overhead to each thread block. For another, it allows lots of optimization. In the above code, if you can arrange that each thread will compute multiple pairs all in the same row, then you can skip the row and column computations, and you only need to load the row's position once. A sketch of this idea is shown below.

Of course, it all depends what size you're optimizing for. With 50 atoms, the number of pairs is much too small to fill a large GPU even with only one pair per thread. For larger systems with thousands of atoms and millions of pairs, it will make more of a difference.
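A minimal sketch of what "one thread owns a row" could look like. This is not code from the PR: the flat `positions`/`deltas` layout, the kernel name, and the parameter names are all assumptions made for illustration. Each thread grid-strides over rows, keeps the row's coordinates in registers, and loops over the columns of that row, so the `sqrtf`-based index decoding disappears entirely:

```cuda
#include <cstdint>

// Hypothetical kernel: positions is a flat [num_atoms][3] array,
// deltas is a flat [num_pairs][3] array for pairs (row, column) with column < row.
template <typename scalar_t>
__global__ void pairwise_deltas_by_row(const scalar_t* __restrict__ positions,
                                       scalar_t* __restrict__ deltas,
                                       int32_t num_atoms) {
    // Grid-stride loop over rows: a small, fixed number of blocks is enough,
    // and each thread reuses its row position for every column it visits.
    for (int32_t row = blockIdx.x * blockDim.x + threadIdx.x;
         row < num_atoms;
         row += gridDim.x * blockDim.x) {
        const scalar_t row_x = positions[3 * row + 0];
        const scalar_t row_y = positions[3 * row + 1];
        const scalar_t row_z = positions[3 * row + 2];
        // Linear index of the first pair in this row of the lower triangle.
        const int32_t row_offset = row * (row - 1) / 2;
        for (int32_t column = 0; column < row; column++) {
            const int32_t pair = row_offset + column;
            deltas[3 * pair + 0] = row_x - positions[3 * column + 0];
            deltas[3 * pair + 1] = row_y - positions[3 * column + 1];
            deltas[3 * pair + 2] = row_z - positions[3 * column + 2];
        }
    }
}
```

Note the trade-off: rows have unequal work (row `r` has `r` columns), so warps are imbalanced; a production kernel would tile the triangle more evenly, but the sketch shows how looping within a row removes the per-pair index arithmetic and the repeated loads of the row position.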
This PR is discontinued. The code is being moved to NNPOps (openmm/NNPOps#58)
This is obsolete.
This is a proof-of-concept. DO NOT MERGE!