Convert basic parallel_for kernel launches to nd_range #46

Open
18 tasks
kris-rowe opened this issue May 1, 2023 · 0 comments

kris-rowe commented May 1, 2023

Details

The work-group size chosen by the SYCL runtime for basic parallel_for kernel launches is suboptimal in some cases. These launches should be converted to the nd_range form, and an appropriate default work-group size should be determined for each kernel.

Note: basic parallel_for usage is limited to the sycl-ref backend.
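
For illustration, a minimal sketch of what such a conversion could look like for an AXPY-style kernel. It is not taken from the backend: `padded_global_size`, `axpy_nd_range`, the USM pointer arguments, and the default work-group size of 128 are all assumptions made for the example. The one detail the sketch relies on is that an nd_range launch requires the global size to be a multiple of the work-group size, so the global size is rounded up and the kernel body guards against the padded indices.

```cpp
#include <cstddef>

// Hypothetical helper: round n up to the next multiple of wg_size so the
// global range is evenly divisible by the work-group size.
// Assumes wg_size > 0.
inline std::size_t padded_global_size(std::size_t n, std::size_t wg_size) {
  return ((n + wg_size - 1) / wg_size) * wg_size;
}

// The launch itself is only compiled by a SYCL 2020 compiler.
#ifdef SYCL_LANGUAGE_VERSION
#include <sycl/sycl.hpp>

// Hypothetical AXPY (y[i] += a * x[i]) using an explicit work-group size.
// x and y are assumed to be USM pointers accessible on the device.
// The default of 128 is a placeholder, not a tuned value.
inline sycl::event axpy_nd_range(sycl::queue &q, std::size_t n, double a,
                                 const double *x, double *y,
                                 std::size_t wg_size = 128) {
  const std::size_t global = padded_global_size(n, wg_size);
  return q.parallel_for(
      sycl::nd_range<1>{sycl::range<1>{global}, sycl::range<1>{wg_size}},
      [=](sycl::nd_item<1> item) {
        const std::size_t i = item.get_global_id(0);
        // Work-items in the padded tail have no element to process.
        if (i < n) y[i] += a * x[i];
      });
}
#endif
```

With this shape, the work-group size becomes a parameter that can be varied per kernel while collecting the performance data listed under Todo.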

Todo

  • Get baseline performance data for relevant kernels
  • Evaluate performance using nd_range kernel launch

Vector

  • Norm (x3)
  • Reciprocal
  • Scale
  • AXPY
  • PointwiseMult

ElementRestriction

Basis

QFunction

Operator

kris-rowe added the sycl label May 1, 2023
kris-rowe added this to the Sycl Backend Performance milestone May 1, 2023