83 changes: 77 additions & 6 deletions README.md
@@ -1,11 +1,82 @@
**University of Pennsylvania, CIS 565: GPU Programming and Architecture,
Project 1 - Flocking**

* Yan Dong
- [LinkedIn](https://www.linkedin.com/in/yan-dong-572b1113b/)
- [personal website](coffeier.com)
- [github](https://github.com/coffeiersama)
* Tested on: Windows 10, i7-8750 @ 2.22GHz (12CPUs) 16GB, GTX 1060 14202MB (OMEN 15-dc0xxx)

------

[Result](#boid-result) - [Rules](#rules) - [Runtime Analysis](#runtime-analysis) - [Responses](#questions)

------

## Boid Result

10,000 boids

![](images/1_10000.gif)

150,000 boids

![](images/3_150000.gif)



## Rules

#### Naive

For each boid, scan every other boid, pick out the ones that are close enough to the current boid, and apply rules 1, 2, and 3 to them. This is slow for a large number of boids because each boid has to check all `boids_count` boids. Nothing is selected or filtered beforehand.
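The brute-force scan described above can be sketched on the CPU as follows. This is only an illustration, not the project's kernel: it assumes the three rules are the usual cohesion, separation, and alignment rules, and the `Params` names and values are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

static Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
static float length(Vec3 a) { return std::sqrt(a.x * a.x + a.y * a.y + a.z * a.z); }

// Hypothetical tuning constants (assumed for this sketch, not taken from the project).
struct Params {
    float r1 = 5.0f, r2 = 3.0f, r3 = 5.0f;   // distance threshold of each rule
    float s1 = 0.01f, s2 = 0.1f, s3 = 0.1f;  // weight of each rule
};

// Velocity change for boid `self`, found by scanning every other boid.
Vec3 naiveVelocityChange(std::size_t self,
                         const std::vector<Vec3>& pos,
                         const std::vector<Vec3>& vel,
                         const Params& p) {
    assert(pos.size() == vel.size() && self < pos.size());
    Vec3 center{0, 0, 0}, away{0, 0, 0}, avgVel{0, 0, 0};
    int n1 = 0, n3 = 0;
    for (std::size_t j = 0; j < pos.size(); ++j) {
        if (j == self) continue;
        float d = length(pos[j] - pos[self]);
        if (d < p.r1) { center = center + pos[j]; ++n1; }      // rule 1: cohesion
        if (d < p.r2) { away = away - (pos[j] - pos[self]); }  // rule 2: separation
        if (d < p.r3) { avgVel = avgVel + vel[j]; ++n3; }      // rule 3: alignment
    }
    Vec3 dv{0, 0, 0};
    if (n1 > 0) dv = dv + (center * (1.0f / n1) - pos[self]) * p.s1;
    dv = dv + away * p.s2;
    if (n3 > 0) dv = dv + avgVel * (1.0f / n3) * p.s3;
    return dv;
}
```

Calling this once per boid costs one full pass over the flock each time, which is why the total work grows with the square of the boid count.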

#### Uniform Grid

Build buffers that store the particle indices for each grid cell. The cells to search are derived from the relationship between the cell width and `max_distance` (which is also our search distance), which makes the approach more flexible (see where `max_search` is defined in my code).

This avoids hard-coding the set of cells to search (see where `my_8` is defined).
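One way to derive the searched cells from the search distance instead of hard-coding them is sketched below. It is a CPU-side illustration under assumed names (`cellRange1D`, `gridMin`, `gridResolution` are hypothetical, not identifiers from the project): along each axis, the cells overlapped by the interval `[p - searchRadius, p + searchRadius]` are computed and clamped to the grid.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct CellRange { int lo, hi; };  // inclusive cell range along one axis

// Cells along one axis that a search of radius `searchRadius` around
// coordinate `p` can overlap, clamped to the grid bounds.
CellRange cellRange1D(float p, float searchRadius, float gridMin,
                      float cellWidth, int gridResolution) {
    assert(cellWidth > 0.0f && gridResolution > 0);
    int lo = static_cast<int>(std::floor((p - searchRadius - gridMin) / cellWidth));
    int hi = static_cast<int>(std::floor((p + searchRadius - gridMin) / cellWidth));
    return { std::max(lo, 0), std::min(hi, gridResolution - 1) };
}

// Flatten a 3D cell coordinate into a 1D grid index (x varies fastest).
int gridIndex3Dto1D(int x, int y, int z, int gridResolution) {
    return x + y * gridResolution + z * gridResolution * gridResolution;
}
```

With a cell width of twice the search radius this gives 2 cells per axis (8 in 3D), and with a cell width equal to the search radius it gives 3 per axis (27 in 3D), so the same loop handles both cases.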

#### Coherent Grid

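Create additional position and velocity buffers. After sorting by grid index, fill them in the order of the sorted particle indices, so that boids in the same cell sit next to each other in memory. This saves the extra lookup through the particle index array when reading position and velocity.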
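The reshuffle step can be sketched on the CPU like this. The function names are hypothetical, and `std::stable_sort` stands in for whatever key-value sort the GPU code uses (commonly `thrust::sort_by_key`, which is an assumption here).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Boid indices ordered by the grid cell each boid falls in.
std::vector<int> sortIndicesByCell(const std::vector<int>& cellOfBoid) {
    std::vector<int> order(cellOfBoid.size());
    std::iota(order.begin(), order.end(), 0);
    std::stable_sort(order.begin(), order.end(),
                     [&](int a, int b) { return cellOfBoid[a] < cellOfBoid[b]; });
    return order;
}

// Gather step: out[i] = in[order[i]], so data for the same cell becomes contiguous.
template <typename T>
std::vector<T> gather(const std::vector<T>& in, const std::vector<int>& order) {
    assert(in.size() == order.size());
    std::vector<T> out(in.size());
    for (std::size_t i = 0; i < order.size(); ++i) {
        out[i] = in[order[i]];
    }
    return out;
}
```

After the gather, the neighbor search can read position and velocity directly at the sorted slot instead of going through the index array first.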



## Runtime Analysis

When we increase the boid count, the FPS of all three methods decreases. The Naive method struggles to support more than 20k boids, but it works well when the boid count is small. By using the uniform grid and coherent grid to search only among valid neighbors, we cut down a lot of the comparison work. The coherent grid can even support more than 300k boids with a good FPS. Note that for the grid methods, the FPS first rises before it falls as the boid count grows from a small number. I think this is because these two methods use more buffers, and reading and writing those buffers takes a larger share of the time when the boid count is small.

![](images/bc_visual2.png)

As for increasing the block size, here is the line chart. The boid count is 20,000. Increasing the block size does not raise the FPS; on the contrary, the FPS drops slightly in the coherent grid case.

![](images/bs_visual.png)

Rendering the boids in the window does affect performance. I compared the two situations (with and without visualization), and here is the result. The curves in the two charts have similar shapes, which may reflect the performance of the algorithms themselves. Without visualization, Naive is even the best method when the boid count is under 0.5k.

![](images/upbc_compare.png)

For increasing block size:

![](images/upbs_compare.png)

## Questions

##### For each implementation, how does changing the number of boids affect performance? Why do you think this is?

Increasing the number of boids usually causes the FPS to decrease. This is reasonable: with more boids in the space, there are many more neighbor searches and distance comparisons to do. For Naive, the FPS drops rapidly between 1k and 10k boids; for Uniform Grid and Coherent Grid, most of the drop comes after 20k. This makes sense, since Naive suits a small boid count, while the uniform and coherent grids are better for a large one.

It is a little strange that in my test data, starting from a small count, increasing the number of boids makes the FPS rise slightly. I think one reason is that the FPS is unstable: it can fluctuate by around ±50 or ±80, and I also noticed that two tests run a long time apart give different FPS values. Different initial positions can also produce different results.

##### For each implementation, how does changing the block count and block size affect performance? Why do you think this is?

Changing the block count and block size does not have much effect in my tests (given how much the FPS fluctuates, such small changes mean almost no effect). Maybe creating more threads uses up the global memory and takes some local memory, so that other work is affected?
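For reference, block count and block size are usually tied together by a ceiling division, so changing one changes the other. The sketch below uses hypothetical helper names and shows how many threads in the last block are left idle for a given block size.

```cpp
#include <cassert>

// Number of blocks needed so that blocks * blockSize >= n (ceiling division).
int blocksFor(int n, int blockSize) {
    assert(n >= 0 && blockSize > 0);
    return (n + blockSize - 1) / blockSize;
}

// Threads launched beyond n; these exit early without doing any work.
int idleThreads(int n, int blockSize) {
    return blocksFor(n, blockSize) * blockSize - n;
}
```

For 20,000 boids, a block size of 128 needs 157 blocks with 96 idle threads, while a block size of 1024 needs 20 blocks with 480 idle threads. Both amounts of padding are small next to 20,000 threads.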

##### For the coherent uniform grid: did you experience any performance improvements with the more coherent uniform grid? Was this the outcome you expected? Why or why not?

Yes. The coherent uniform grid avoids reading `particle_array` to look up the position and velocity data every time a neighbor is accessed. This method really does improve the FPS, by about 100.

##### Did changing cell width and checking 27 vs 8 neighboring cells affect performance? Why or why not? Be careful: it is insufficient (and possibly incorrect) to say that 27-cell is slower simply because there are more cells to check!

Yes. Of course, 27-cell means checking more cells than 8-cell, so it can be slower. However, the search distance does not change in our algorithm; when the cells become smaller, there are simply more cells in the space. Besides, we do not search a spherical region directly: every cell that intersects the search sphere has to be checked. Smaller cells therefore fit the search region more tightly and save some time. In my tests, 27-cell is better when the boid count is 250k, and 8-cell is better at 150k or fewer.
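The space argument can be made concrete with a little arithmetic. The sketch below assumes the 8-cell search uses a cell width of twice the search distance and the 27-cell search uses a cell width equal to the search distance; `searchedVolume` is a hypothetical helper.

```cpp
#include <cassert>

// Volume of the cube of cells that gets scanned: (cellsPerAxis * cellWidth)^3.
double searchedVolume(int cellsPerAxis, double cellWidth) {
    assert(cellsPerAxis > 0 && cellWidth > 0.0);
    double side = cellsPerAxis * cellWidth;
    return side * side * side;
}
```

With a search distance of 1, the 8-cell search scans `searchedVolume(2, 2.0)`, which is 64 units of volume, while the 27-cell search scans `searchedVolume(3, 1.0)`, which is 27. At a uniform boid density, the 27-cell search therefore examines fewer candidate boids even though it visits more cells, which is consistent with it winning at the highest boid count.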
Binary file added images/1_10000.gif
Binary file added images/2_50000.gif
Binary file added images/3_150000.gif
Binary file added images/bc_novisual.png
Binary file added images/bc_visual.png
Binary file added images/bc_visual2.png
Binary file added images/bs_visual.png
Binary file added images/cor_vnov.png
Binary file added images/upbc_compare.png
Binary file added images/upbs_compare.png