gpu: sycl: sum: implemented #1940
Is the CI timing out on Win Server 2022 expected? I have the same happening on another PR after rebasing on main, just like on this one.
@t4c1, these timeouts are expected. Looks like MSVC update from 19.39.33523.0 to 19.40.33811.0 in Azure environment breaks
src/xpu/sycl/types.hpp
#define DNNL_ARG_SRC_14 15
#define DNNL_ARG_SRC_15 16

#define MAX_NUM_TENSORS 16
Is this specific to the SYCL sum implementation? If so, I would advise naming it accordingly (e.g. DNNL_REF_SUM_MAX_NUM_TENSORS).
done
Add a SYCL implementation for the sum primitive. There is one implementation that handles up to 16 inputs and a second one that handles more than 16 inputs by invoking the first one repeatedly.
co-authored with @kala855
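For readers unfamiliar with the two-level scheme described above, here is a minimal sketch of the idea in plain C++. It is not the code from this PR: it uses ordinary CPU loops rather than SYCL kernels, and the names `kMaxInputsPerPass`, `sum_up_to_max`, and `sum_many` are hypothetical. It also assumes that later passes accumulate into the destination buffer; the actual implementation may chain passes differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-pass cap, mirroring the 16-input limit mentioned in the PR.
constexpr std::size_t kMaxInputsPerPass = 16;

// Base case (hypothetical): handles at most kMaxInputsPerPass inputs.
// Computes dst[i] = sum_k scales[k] * srcs[k][i], optionally adding to the
// value already stored in dst[i] when `accumulate` is true.
void sum_up_to_max(const std::vector<const float *> &srcs,
        const std::vector<float> &scales, float *dst, std::size_t n,
        bool accumulate) {
    assert(srcs.size() == scales.size());
    assert(srcs.size() <= kMaxInputsPerPass);
    for (std::size_t i = 0; i < n; ++i) {
        float acc = accumulate ? dst[i] : 0.0f;
        for (std::size_t k = 0; k < srcs.size(); ++k)
            acc += scales[k] * srcs[k][i];
        dst[i] = acc;
    }
}

// General case (hypothetical): splits the inputs into chunks of at most
// kMaxInputsPerPass and reuses the base case for each chunk. The first chunk
// overwrites dst; every later chunk accumulates into it.
void sum_many(const std::vector<const float *> &srcs,
        const std::vector<float> &scales, float *dst, std::size_t n) {
    assert(srcs.size() == scales.size());
    if (srcs.empty()) {
        std::fill(dst, dst + n, 0.0f);
        return;
    }
    for (std::size_t begin = 0; begin < srcs.size();
            begin += kMaxInputsPerPass) {
        const std::size_t end
                = std::min(begin + kMaxInputsPerPass, srcs.size());
        std::vector<const float *> chunk_srcs(
                srcs.begin() + begin, srcs.begin() + end);
        std::vector<float> chunk_scales(
                scales.begin() + begin, scales.begin() + end);
        sum_up_to_max(chunk_srcs, chunk_scales, dst, n, begin > 0);
    }
}
```

With 40 inputs, for example, `sum_many` would make three passes over the base case (16, 16, and 8 inputs). Keeping the cap in a single named constant is what makes the reviewer's renaming suggestion cheap to apply.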