64 changes: 56 additions & 8 deletions README.md
@@ -1,11 +1,59 @@
**University of Pennsylvania, CIS 565: GPU Programming and Architecture,
Project 1 - Flocking**
![](images/top_image.gif)
# Project 1 - Flocking
**University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 1 - Flocking**

* (TODO) YOUR NAME HERE
* (TODO) [LinkedIn](), [personal website](), [twitter](), etc.
* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab)
* Zheyuan Xie
* [LinkedIn](https://www.linkedin.com/in/zheyuan-xie)
* [Github](https://github.com/ZheyuanXie)
* [Personal Website](http://errorspace.cn)
* Tested on: Windows 10 Pro, i7-7700HQ @ 2.80GHz 2.80GHz, 16GB, GTX 1050 2GB (Dell XPS 15 9560)

### (TODO: Your README)
## Screenshots
| 5,000 Boids | 100,000 Boids |
|--|--|
|![Number of boids: 5k](images/5k_compressed.gif) | ![Number of boids: 100k](images/100k_compressed.gif) |

Include screenshots, analysis, etc. (Remember, this is public, so don't put
anything here that you don't want to share with the world.)
## Performance Analysis
### 1. Number of Boids
The figure below shows the relationship between performance and boid population. We can observe:
- For all three methods, performance decreases as the number of boids increases.
- The naive method has the best performance when the number of boids is small (below 1.5k), but its performance decays rapidly as the number of boids increases.
- The coherent uniform grid has the best performance when the number of boids is large.
- Visualization generally decreases the framerate.
- When the number of boids is small, visualization is the performance bottleneck; when the number of boids is large, computation takes over.

![](images/population_fps.jpg)

In the naive method, for each boid we check all other N-1 boids, so the time complexity is `O(N^2)`. The uniform grid methods are less sensitive to the increase in boids, since for each boid we only check the neighboring boids within some distance.
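
To make the difference concrete, here is a minimal CPU-side sketch in Python (not the project's CUDA code) that counts how many candidate neighbors are examined per step. The function names `naive_checks` and `grid_checks` are hypothetical, and the grid version assumes each boid scans its own cell plus the 26 adjacent cells with no wrap-around at the scene boundary.

```python
import random
from itertools import product


def naive_checks(n):
    """Candidates examined per step when every boid checks all N-1 others."""
    return n * (n - 1)


def grid_checks(positions, cell_width):
    """Candidates examined when each boid scans only its own cell and the
    26 adjacent cells (assumption: non-negative coordinates, no wrap-around)."""
    counts = {}
    for p in positions:
        key = tuple(int(c // cell_width) for c in p)
        counts[key] = counts.get(key, 0) + 1

    total = 0
    for (x, y, z), occupants in counts.items():
        nearby = sum(
            counts.get((x + dx, y + dy, z + dz), 0)
            for dx, dy, dz in product((-1, 0, 1), repeat=3)
        )
        # Every boid in this cell examines all nearby boids except itself.
        total += occupants * (nearby - 1)
    return total


rng = random.Random(0)
boids = [tuple(rng.uniform(0.0, 100.0) for _ in range(3)) for _ in range(2000)]
print("naive:", naive_checks(len(boids)))
print("grid: ", grid_checks(boids, 10.0))
```

With boids spread uniformly, the grid count grows with the local density rather than with the total population, which is consistent with the flatter curves of the two grid methods in the figure.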

### 2. Block Size
The figure below shows the performance vs block size curve with visualization turned off. We can observe:

- For all three methods, a jump in performance is observed when the block size is increased from 16 to 32.
- Further increasing the block size from 32 to 512 has no significant impact on performance.

![](images/blocksize_fps.jpg)

This is because the warp size is always 32. Having fewer than 32 threads per block leaves some threads in each warp inactive. Increasing the block size beyond 32, however, no longer increases the degree of parallelism.
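
The warp argument can be checked with a little arithmetic. The sketch below is illustrative Python, not part of the project; `warp_lane_utilization` is a hypothetical helper that assumes a block is split into whole warps of 32 lanes, so a partially filled warp still occupies all 32 lanes.

```python
import math

WARP_SIZE = 32  # threads per warp on NVIDIA GPUs


def warp_lane_utilization(block_size, warp_size=WARP_SIZE):
    """Fraction of warp lanes doing useful work for a given block size."""
    warps_per_block = math.ceil(block_size / warp_size)
    return block_size / (warps_per_block * warp_size)


for block_size in (16, 32, 64, 128, 256, 512):
    print(block_size, warp_lane_utilization(block_size))
```

A block of 16 threads fills only half of its single warp, while every tested block size from 32 upward is a multiple of 32 and fills its warps completely, which matches the shape of the measured curve.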



### 3. Coherent Uniform Grid
Compared with the scattered uniform grid, the coherent uniform grid has better performance. The performance gap becomes wider as the number of boids increases. This outcome is expected since:
- We have one less level of indirection, and therefore one less access (`dev_particleArrayIndices`) to the device's global memory.
- Reshuffling `dev_pos` and `dev_vel1` allows boids in the same cell to occupy contiguous memory, increasing the chance of a cache hit.
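
The two access patterns can be sketched on the CPU as follows. This is a simplified Python illustration, not the CUDA kernels: the lists stand in for `dev_particleArrayIndices` and `dev_pos`, and all function names are hypothetical.

```python
def sort_by_cell(cell_of_boid):
    """Boid indices ordered by grid cell (the role of dev_particleArrayIndices)."""
    return sorted(range(len(cell_of_boid)), key=lambda i: cell_of_boid[i])


def scattered_lookup(pos, particle_array_indices, slot):
    """Scattered grid: two dependent reads, slot -> boid index -> position."""
    return pos[particle_array_indices[slot]]


def make_coherent(pos, particle_array_indices):
    """Reshuffle positions once so that sorted slot i holds its boid's data."""
    return [pos[i] for i in particle_array_indices]


def coherent_lookup(pos_sorted, slot):
    """Coherent grid: a single read; boids in the same cell sit side by side."""
    return pos_sorted[slot]


cell_of_boid = [2, 0, 1, 0]        # grid cell of each boid
pos = ["p0", "p1", "p2", "p3"]     # placeholder positions
order = sort_by_cell(cell_of_boid)  # [1, 3, 2, 0]
pos_sorted = make_coherent(pos, order)
print(order, pos_sorted)
```

The reshuffle costs one extra pass per step, but every later neighbor read skips the index lookup and touches adjacent memory.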

### 4. Cell Width & Number of Neighboring Cells
Using half the cell width (and checking 27 cells instead of 8) slightly increases performance.

||8 Cells, Full Width|27 Cells, Half Width|
|--|--|--|
|5k Boids|1256 FPS|1366 FPS|
|10k Boids|798 FPS |1001 FPS|
|50k Boids|139 FPS |151 FPS|

Though the number of cells that need to be checked is larger, the actual volume checked is smaller, and therefore the number of boids checked is smaller. Let's say the neighborhood distance is 1 unit:
- Checking 8 cells with 2-unit width is equivalent to checking a volume of 4\*4\*4=64 cubic units.
- Checking 27 cells with 1-unit width is equivalent to checking a volume of 3\*3\*3=27 cubic units.

Therefore checking 27 cells with half the cell width means checking fewer boids and faster computation.
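
The volume comparison can be written out as a short calculation. This is an illustrative Python sketch; `searched_volume` is a hypothetical helper, and treating the candidate count as proportional to volume assumes a roughly uniform boid density.

```python
def searched_volume(cells_per_axis, cell_width):
    """Volume of the cube of cells scanned around one boid."""
    return (cells_per_axis * cell_width) ** 3


full_width = searched_volume(2, 2.0)  # 8 cells, width = 2 x neighborhood distance
half_width = searched_volume(3, 1.0)  # 27 cells, width = 1 x neighborhood distance
ratio = half_width / full_width

print(full_width, half_width, ratio)  # 64.0 27.0 0.421875
```

Under that assumption the half-width grid examines about 42% as many candidates. The measured gains in the table are smaller than this ratio alone would suggest, so the neighbor scan is presumably only part of the per-frame cost.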
Binary file added images/100k.gif
Binary file added images/100k_compressed.gif
Binary file added images/5k_compressed.gif
11 changes: 11 additions & 0 deletions images/blocksize_curve.m
@@ -0,0 +1,11 @@
block_sizes = [16 32 64 128 256 512];  % threads per block (x-axis)
naive = [16.2377 31.1301 39.1084 36.5137 38.2966 38.1733];
scatter = [390.202 437.727 433.237 427.975 435.234 430.058];
coherent = [670.605 913.866 920.439 940.009 928.573 916.015];
plot(block_sizes, naive, '.-', 'LineWidth', 1); hold on
plot(block_sizes, scatter, '.-', 'LineWidth', 1);
plot(block_sizes, coherent, '.-', 'LineWidth', 1);
legend('naive method', 'scatter grid', 'coherent grid')
xlabel('Blocksize')
ylabel('FPS')
title('FPS vs. Blocksize')
Binary file added images/blocksize_fps.jpg
19 changes: 19 additions & 0 deletions images/population_curve.m
@@ -0,0 +1,19 @@
number_of_boids = [1 5 10 25 50 75 100];
naivev = [154.764 147.047 72.72 15.5 4.01 1.85 1.00];
scatterv = [133.819 152.306 152.116 150.294 103.368 60.0672 43.5686];
coherentv = [150.787 150.473 166.157 150.463 170.284 123.162 157.798];
naive = [1725 448 118 23.5 6.5 3 1.6];
scatter = [1470 1172.5 800 360 133.225 70.5 47.332];
coherent = [1441 1210 1164 760 402 275 212.1];

plot(number_of_boids, naivev, '--b', 'lineWidth', 1, 'Color','#0072BD'); hold on
plot(number_of_boids, naive, '-b', 'lineWidth', 1 ,'Color', '#0072BD');
plot(number_of_boids,scatterv, '--m', 'lineWidth', 1, 'Color','#D95319');
plot(number_of_boids,scatter, '-m', 'lineWidth', 1 , 'Color','#D95319');
plot(number_of_boids,coherentv, '--r', 'lineWidth', 1, 'Color', '#77AC30');
plot(number_of_boids,coherent, '-r', 'lineWidth', 1,'Color','#77AC30');

legend('naive (vis)', 'naive (no vis)', 'scatter (vis)', 'scatter (no vis)', 'coherent (vis)', 'coherent (no vis)')
xlabel('Number of Boids (Thousands)')
ylabel('FPS')
title('FPS vs. Boid Population')
Binary file added images/population_fps.jpg
Binary file added images/top_image.gif