Results Overview

See also the samples overview.

A performance analysis is available for four of the samples. The remaining samples are not meant for performance comparisons; they show how to use Hybrid Fortran.

| Name | Performance Results | Speedup HF on 6 Core vs. 1 Core [1] | Speedup HF on GPU vs. 6 Core [1] | Speedup HF on GPU vs. 1 Core [1] |
|---|---|---|---|---|
| 3D Diffusion | Link | 1.06x (Compare Performance) | 10.94x (Compare Performance, Compare Speedup) | 11.66x |
| Particle Push | Link | 9.08x (Compare Performance) | 21.72x (Compare Performance, Compare Speedup) | 152.79x |
| Poisson on FEM Solver with Jacobi Approximation | Link | 1.41x | 5.13x | 7.28x |
| MIDACO Ant Colony Solver with MINLP Example | Link | 5.26x | 10.07x | 52.99x |

[1]: Speedups are relative to the reference C version where available, otherwise to the Hybrid Fortran CPU implementation. Results were measured on TSUBAME 2.5 with a Kepler K20x GPU and a Westmere Xeon X5670 CPU, all in double precision. CPU runs were limited to one socket using thread affinity 'compact' with 12 logical threads. CPU code was compiled with the Intel compilers ifort / icc using the '-fast' setting. GPU code was compiled with the PGI compiler using the '-fast' setting and CUDA compute capability 3.x. All GPU results include the memory copy time from host to device.
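
The speedup columns above are ratios of wall-clock times. As a minimal sketch of how such ratios could be computed, the Python snippet below uses made-up timings (the variable names and values are hypothetical and are not measurements from this project). It counts the host-to-device copy as part of the GPU time, as footnote [1] describes.

```python
def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """Return how many times faster the candidate is than the baseline."""
    if baseline_seconds <= 0 or candidate_seconds <= 0:
        raise ValueError("timings must be positive")
    return baseline_seconds / candidate_seconds


# Hypothetical wall-clock timings in seconds (illustrative only, not measured).
t_cpu_1_core = 120.0
t_cpu_6_core = 24.0
t_gpu_kernel = 3.0
t_gpu_copy = 1.0  # host-to-device copy, counted as part of the GPU time

t_gpu_total = t_gpu_kernel + t_gpu_copy

six_core_vs_one = speedup(t_cpu_1_core, t_cpu_6_core)
gpu_vs_six_core = speedup(t_cpu_6_core, t_gpu_total)
gpu_vs_one_core = speedup(t_cpu_1_core, t_gpu_total)

print(f"6 core vs 1 core: {six_core_vs_one:.2f}x")
print(f"GPU vs 6 core:    {gpu_vs_six_core:.2f}x")
print(f"GPU vs 1 core:    {gpu_vs_one_core:.2f}x")
```

When all three ratios are derived from the same set of timings, the GPU vs. 1 core value equals the product of the other two. If a column uses a different baseline, which footnote [1] allows (reference C version versus Hybrid Fortran CPU implementation), the columns need not multiply out exactly.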