Commit 138e04f

update README and pom
1 parent 96aeddf commit 138e04f

2 files changed: +13 −16 lines


README.md (+10 −13)

@@ -1,22 +1,19 @@
-# Pointer-jumping on a GPU using Aparapi
+# GPU samples
 
-Classic [pointer jumping](https://en.wikipedia.org/wiki/Pointer_jumping) algorithm summarizing values from an array, adapted to run on a GPU instead of a PRAM using [Aparapi](https://aparapi.com/).
+Parallel reduction and [pointer jumping](https://en.wikipedia.org/wiki/Pointer_jumping) algorithms summarizing values from an array, adapted to run on a GPU using [Aparapi](https://aparapi.com/) and [JOCL](http://www.jocl.org/) (frontends to [OpenCL](https://www.khronos.org/opencl/)).
 
-On my integrated Intel GPU (maxWorkGroupSize=256, maxComputeUnits=48), this runs 4 times slower than sequential adding on the CPU.
 
-This is probably because the algorithm is memory bound and spends most of its time fetching values from memory. See [this SO answer](https://stackoverflow.com/questions/22866901/using-java-with-nvidia-gpus-cuda#22868938) for more info.
+## building and running a comparison of various sync methods in OpenCL parallel reduction
 
-The other reason may be that [Intel has only 16 barrier registers (and only 64kB of local memory _shared_ among running work-groups)](https://software.intel.com/content/www/us/en/develop/documentation/iocl-opg/top/optimizing-opencl-usage-with-intel-processor-graphics/work-group-size-recommendations-summary.html), so only up to 16 work-groups can run in parallel.
-
-To test this theory I need to run this on an Nvidia or AMD GPU, but I don't have any at hand. If someone who has one could run this code and send me back the results, I'd be very grateful :)
-
-## building and running
-
-First, make sure that you have an OpenCL driver for your GPU installed: [Nvidia](https://developer.nvidia.com/cuda-downloads), [AMD Linux](https://www.amd.com/en/support/kb/release-notes/rn-amdgpu-unified-linux-21-30) (AMD on Windows should be available by default).
+First, make sure that you have an OpenCL driver for your GPU installed: [Nvidia](https://developer.nvidia.com/cuda-downloads), [AMD Linux](https://www.amd.com/en/support/kb/release-notes/rn-amdgpu-unified-linux-21-30) (AMD on Windows should be available by default, hopefully).
 
 ```bash
 mvn clean package
-java -jar target/pointer-jumping-gpu-1.0-SNAPSHOT-jar-with-dependencies.jar
+java -jar target/pointer-jumping-gpu-1.0-SNAPSHOT-jar-with-dependencies.jar $[32*1024*1024] 50
 ```
 
-Thanks!
+On my integrated Intel GPU I get times similar to these:<pre>
+BARRIER average: 101806901
+SIMD average: 102234318
+HYBRID average: 95539077
+CPU average: 41322452</pre>
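For intuition about the BARRIER sync method compared above, the round-by-round tree reduction it performs can be mimicked in plain Java. This is a hypothetical sketch only, with threads and a `CyclicBarrier` standing in for OpenCL work-items and `barrier(CLK_LOCAL_MEM_FENCE)`; it is not code from this repo:

```java
import java.util.concurrent.CyclicBarrier;

// Barrier-synced tree reduction: in each round every "work-item" thread whose
// index is below the current stride adds in its partner's value, then all
// threads wait on a barrier before the stride is halved, mirroring how an
// OpenCL work-group syncs between rounds. Assumes length is a power of 2, >= 2.
public class BarrierReductionSketch {

	static long reduce(long[] data) throws InterruptedException {
		final long[] a = data.clone();
		final int n = a.length;
		final CyclicBarrier barrier = new CyclicBarrier(n / 2);
		Thread[] workers = new Thread[n / 2];
		for (int id = 0; id < n / 2; id++) {
			final int i = id;
			workers[i] = new Thread(() -> {
				try {
					for (int stride = n / 2; stride >= 1; stride /= 2) {
						if (i < stride) a[i] += a[i + stride];
						barrier.await();  // all threads sync before the next round
					}
				} catch (Exception e) { throw new RuntimeException(e); }
			});
			workers[i].start();
		}
		for (Thread t : workers) t.join();
		return a[0];  // after log2(n) rounds the total accumulates at index 0
	}

	public static void main(String[] args) throws InterruptedException {
		long[] data = new long[16];
		for (int j = 0; j < 16; j++) data[j] = j + 1;
		System.out.println(reduce(data));  // prints 136 (sum of 1..16)
	}
}
```

On a real GPU the barrier is far cheaper than a Java `CyclicBarrier`, but the synchronization pattern per round is the same.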

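The pointer-jumping algorithm the README mentions sums an array in O(log n) rounds: each element adds in the value `step` positions ahead, with `step` doubling every round. A minimal sequential Java sketch of the idea (plain loops simulating what the work-items do in parallel; the class and method names are illustrative, not from this repo):

```java
// Pointer-jumping summation: after round r, a[i] holds the sum of the original
// values[i .. i + 2^r - 1] (clipped at the array end), so after ceil(log2(n))
// rounds a[0] holds the total. On a PRAM/GPU the inner loop runs in parallel;
// ascending i is safe sequentially because a[i + step] is never modified
// before a[i] within a round.
public class PointerJumpingSketch {

	static long sum(long[] values) {
		long[] a = values.clone();
		for (int step = 1; step < a.length; step *= 2) {
			for (int i = 0; i + step < a.length; i++) {
				a[i] += a[i + step];
			}
		}
		return a[0];
	}

	public static void main(String[] args) {
		long[] data = new long[100];
		for (int i = 0; i < data.length; i++) data[i] = i + 1;
		System.out.println(sum(data));  // prints 5050 (sum of 1..100)
	}
}
```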
pom.xml (+3 −3)

@@ -4,10 +4,10 @@
 	<modelVersion>4.0.0</modelVersion>
 
 	<groupId>pl.morgwai.samples</groupId>
-	<artifactId>pointer-jumping-gpu</artifactId>
+	<artifactId>gpu-samples</artifactId>
 	<version>1.0-SNAPSHOT</version>
 
-	<name>Java Sample</name>
+	<name>GPU samples with jocl and aparapi</name>
 
 	<properties>
 		<maven.compiler.source>11</maven.compiler.source>
@@ -42,7 +42,7 @@
 			<configuration>
 				<archive>
 					<manifest>
-						<mainClass>pl.morgwai.samples.aparapi.PointerJumpingKernel</mainClass>
+						<mainClass>pl.morgwai.samples.jocl.ParallelReductionKernel</mainClass>
 					</manifest>
 				</archive>
 				<descriptorRefs>
