NEON kernels for NCHWc Convolution and Pooling #25580
Merged
28 commits
cc30a14 Rewire ORT to support a NEON version of NCHWc Conv (Rohanjames1997)
f190c0d Remove reference to assembly file (Rohanjames1997)
632870b Add a NEON kernel for Pointwise Convolution (Rohanjames1997)
159570a Add a NEON kernel for Depthwise (Rohanjames1997)
52f09bf Remove placeholder implementations (Rohanjames1997)
b505bd6 Add placeholder kernel for MlasConvNchwcFloatKernelNeon (Rohanjames1997)
790cc7e Fix MlasConvNchwcFloatKernelNeon (Rohanjames1997)
906393a Use MLAS intrinsics for MlasConvNchwcFloatKernelNeon (Rohanjames1997)
4d322e6 Add MlasConvNchwFloatKernelNeon (Rohanjames1997)
cb06a1a Add placeholder NCHWc Pool (Rohanjames1997)
00caa4c Vanilla C++ implementation (Rohanjames1997)
4cead5e Intrinsics for Pooling (Rohanjames1997)
abd5491 Refactored to share code (Rohanjames1997)
74e0e3b Format file & delete unused header (Rohanjames1997)
16be947 Minor modifications to pass more tests (Rohanjames1997)
f7d971d Remove unnecessary code & formatting changes (Rohanjames1997)
0ff394c Refactor to share some code (Rohanjames1997)
bd2b6c4 Change block size to 16 (Rohanjames1997)
2b78377 Update pooling algorithm for block size 16 (Rohanjames1997)
ee9b943 Remove comment (Rohanjames1997)
23425e8 Add correct header and refactor kernels to share code. (Rohanjames1997)
7000e9f Address Copilot comments (Rohanjames1997)
c5c3f05 Extend kernels to Windows & Apple (Rohanjames1997)
619d87c Merge remote-tracking branch 'upstream/main' into nchwc_conv_pool (Rohanjames1997)
506bf05 Hardcode BlockSize to 16 and add it to the header. (Rohanjames1997)
fb5fb50 Increase android build size to 10% higher than the CI-reported size o… (Rohanjames1997)
fb99f7d Centralize MLAS_NEON_NCHWC_BLOCK_SIZE (Rohanjames1997)
aa21aca Merge branch 'microsoft:main' into nchwc_conv_pool (Rohanjames1997)
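Several of the commits above concern the NCHWc channel block size, which they set to 16. As a rough illustration of what that block size means for addressing, the sketch below computes the flat offset of one element in an NCHWc-blocked tensor, where channels are grouped into blocks and the position within a block is the innermost dimension. `kBlockSize` and `NchwcOffset` are hypothetical names for this example only; the value 16 is taken from the commit messages and is assumed to match `MLAS_NEON_NCHWC_BLOCK_SIZE`.

```cpp
#include <cstddef>

// Assumption: mirrors the block size of 16 named in the commit messages.
constexpr std::size_t kBlockSize = 16;

// Hypothetical helper (not an MLAS function): flat offset of element
// (n, c, h, w) in a tensor stored as [N][ceil(C / block)][H][W][block].
inline std::size_t NchwcOffset(std::size_t n, std::size_t c,
                               std::size_t h, std::size_t w,
                               std::size_t channels,
                               std::size_t height, std::size_t width)
{
    // Number of channel blocks, rounding up so a partial block still gets storage.
    const std::size_t blocks = (channels + kBlockSize - 1) / kBlockSize;
    const std::size_t block_index = c / kBlockSize;   // which channel block
    const std::size_t lane = c % kBlockSize;          // position inside the block
    return (((n * blocks + block_index) * height + h) * width + w) * kBlockSize + lane;
}
```

With this layout, the 16 channels of one block at a given pixel are contiguous in memory, which corresponds to four 128-bit NEON registers of four floats each.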
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -0,0 +1,25 @@ |
| | | /*++ |
| | | |
| | | Copyright (c) Microsoft Corporation. All rights reserved. |
| | | |
| | | Licensed under the MIT License. |
| | | |
| | | Module Name: |
| | | |
| | | sconv.h |
| | | |
| | | Abstract: |
| | | |
| | | This module defines convolution kernel flags for configuring convolution |
| | | operations including output accumulation, bias addition, and activations. |
| | | |
| | | --*/ |
| | | |
| | | // |
| | | // Define the convolution kernel flags. |
| | | // |
| | | |
| | | #define MLAS_CONV_KERNEL_FLAG_ACCUMULATE_OUTPUT 0x00000001 |
| | | #define MLAS_CONV_KERNEL_FLAG_BIAS_ADDITION 0x00000002 |
| | | #define MLAS_CONV_KERNEL_FLAG_RELU_ACTIVATION 0x00000004 |
| | | #define MLAS_CONV_KERNEL_FLAG_OTHER_ACTIVATION 0x00000008 |
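
The four flags in `sconv.h` are distinct bits, so a caller can OR them together and a kernel can test each one independently. The scalar sketch below shows one way a kernel's store step might interpret them. `StoreConvResult` is a hypothetical helper written for this example, not the NEON code from this PR, and the treatment of `MLAS_CONV_KERNEL_FLAG_OTHER_ACTIVATION` (left to the caller after the kernel returns) is an assumption.

```cpp
#include <algorithm>
#include <cstddef>

// Flag values copied from sconv.h in this diff.
#define MLAS_CONV_KERNEL_FLAG_ACCUMULATE_OUTPUT 0x00000001
#define MLAS_CONV_KERNEL_FLAG_BIAS_ADDITION 0x00000002
#define MLAS_CONV_KERNEL_FLAG_RELU_ACTIVATION 0x00000004
#define MLAS_CONV_KERNEL_FLAG_OTHER_ACTIVATION 0x00000008

// Hypothetical scalar helper (not part of MLAS): writes freshly computed
// convolution sums to the output buffer according to the kernel flags.
inline void StoreConvResult(const float* acc, float* output, const float* bias,
                            std::size_t count, unsigned flags)
{
    for (std::size_t i = 0; i < count; ++i) {
        float value = acc[i];
        if (flags & MLAS_CONV_KERNEL_FLAG_ACCUMULATE_OUTPUT) {
            value += output[i];  // add to the partial sum already in the output
        }
        if (flags & MLAS_CONV_KERNEL_FLAG_BIAS_ADDITION) {
            value += bias[i];    // bias is only read when this flag is set
        }
        if (flags & MLAS_CONV_KERNEL_FLAG_RELU_ACTIVATION) {
            value = std::max(value, 0.0f);
        }
        // Assumption: MLAS_CONV_KERNEL_FLAG_OTHER_ACTIVATION is applied by the
        // caller afterwards, so this helper does nothing extra for it.
        output[i] = value;
    }
}
```

For example, passing `MLAS_CONV_KERNEL_FLAG_ACCUMULATE_OUTPUT | MLAS_CONV_KERNEL_FLAG_BIAS_ADDITION | MLAS_CONV_KERNEL_FLAG_RELU_ACTIVATION` would add the new sums to the existing output, add the bias, and clamp negatives to zero in a single pass.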