
Commit

Merge pull request aws-neuron#20 for release of Neuron SDK 2.20.
Adding many of the innovative kernels used within the neuron-compiler, such as mamba and flash attention, as open-source samples. Adding an updated Readme, GitHub issue and pull request templates, and contribution guidelines. For more information, see the latest documentation at "https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/nki", which provides an in-depth getting started, architecture, profiling, and performance guide, along with multiple tutorials, API reference documents, documented known issues, and frequently asked questions.
neuron-code-sharing-robot authored and JonathanHenson committed Sep 13, 2024
1 parent ca00722 commit 5d97201
Showing 64 changed files with 6,423 additions and 21 deletions.
75 changes: 75 additions & 0 deletions .github/ISSUE_TEMPLATE/bug-report.yml
@@ -0,0 +1,75 @@
---
name: "🐛 Bug Report"
description: Report a bug
title: "(short issue description)"
labels: [bug, needs-triage]
assignees: []
body:
  - type: textarea
    id: description
    attributes:
      label: Describe the bug
      description: What is the problem? A clear and concise description of the bug.
    validations:
      required: true
  - type: textarea
    id: expected
    attributes:
      label: Expected Behavior
      description: |
        What did you expect to happen?
    validations:
      required: true
  - type: textarea
    id: current
    attributes:
      label: Current Behavior
      description: |
        What actually happened?
        Please include full errors, uncaught exceptions, stack traces, and relevant logs.
        If service responses are relevant, please include wire logs.
    validations:
      required: true
  - type: textarea
    id: reproduction
    attributes:
      label: Reproduction Steps
      description: |
        Provide a self-contained, concise snippet of code that can be used to reproduce the issue.
        For more complex issues provide a repo with the smallest sample that reproduces the bug.
        Avoid including business logic or unrelated code, it makes diagnosis more difficult.
        The code sample should be an SSCCE. See http://sscce.org/ for details. In short, please provide a code sample that we can copy/paste, run and reproduce.
    validations:
      required: true
  - type: textarea
    id: solution
    attributes:
      label: Possible Solution
      description: |
        Suggest a fix/reason for the bug
    validations:
      required: false
  - type: textarea
    id: context
    attributes:
      label: Additional Information/Context
      description: |
        Anything else that might be relevant for troubleshooting this bug. Providing context helps us come up with a solution that is most useful in the real world.
    validations:
      required: false

  - type: input
    id: neuronx-cc-version
    attributes:
      label: neuronx-cc version used
    validations:
      required: true

  - type: input
    id: frameworks-used
    attributes:
      label: Framework(s) and their versions used (JAX, PyTorch, etc.)
    validations:
      required: false
5 changes: 5 additions & 0 deletions .github/ISSUE_TEMPLATE/config.yml
@@ -0,0 +1,5 @@
blank_issues_enabled: false
contact_links:
  - name: 💬 General Question
    url: https://github.com/aws-neuron/nki-samples/discussions/categories/q-a
    about: Please ask and answer questions as a discussion thread
23 changes: 23 additions & 0 deletions .github/ISSUE_TEMPLATE/documentation.yml
@@ -0,0 +1,23 @@
---
name: "📕 Documentation Issue"
description: Report an issue in the API Reference documentation, API Stubs, or Developer Guide
title: "(short issue description)"
labels: [documentation, needs-triage]
assignees: []
body:
  - type: textarea
    id: description
    attributes:
      label: Describe the issue
      description: A clear and concise description of the issue.
    validations:
      required: true

  - type: textarea
    id: links
    attributes:
      label: Links
      description: |
        Include links to affected documentation page(s).
    validations:
      required: true
47 changes: 47 additions & 0 deletions .github/ISSUE_TEMPLATE/feature-request.yml
@@ -0,0 +1,47 @@
---
name: 🚀 Feature Request
description: Suggest an idea for this project
title: "(short issue description)"
labels: [feature-request, needs-triage]
assignees: []
body:
  - type: textarea
    id: description
    attributes:
      label: Describe the feature
      description: A clear and concise description of the feature you are proposing.
    validations:
      required: true
  - type: textarea
    id: use-case
    attributes:
      label: Use Case
      description: |
        Why do you need this feature? For example: "I'm always frustrated when..."
    validations:
      required: true
  - type: textarea
    id: solution
    attributes:
      label: Proposed Solution
      description: |
        Suggest how to implement the addition or change. Please include prototype/workaround/sketch/reference implementation.
    validations:
      required: false
  - type: textarea
    id: other
    attributes:
      label: Other Information
      description: |
        Any alternative solutions or features you considered, a more detailed explanation, stack traces, related issues, links for context, etc.
    validations:
      required: false
  - type: checkboxes
    id: ack
    attributes:
      label: Acknowledgements
      options:
        - label: I may be able to implement this feature request
          required: false
        - label: This feature might incur a breaking change
          required: false
21 changes: 21 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -0,0 +1,21 @@
### Issue #, if available:

\<Fill ME\>

### Description of changes:
\<Fill ME\>

### Testing:

Please see detailed unit test requirements in the [CONTRIBUTING.md](https://github.com/aws-neuron/nki-samples/blob/main/CONTRIBUTING.md)

- [ ] The change is covered by numeric check using `nki.baremetal`
- [ ] The change is covered by performance benchmark test using `nki.benchmark`
- [ ] The change is covered by end-to-end integration test

### Pull Request Checklist

- [ ] I have filled in all the required fields in the template
- [ ] I have tested locally that all the tests pass
- [ ] By submitting this pull request, I confirm that my contribution is made under the terms of the MIT-0 license.

1 change: 1 addition & 0 deletions .gitignore
@@ -0,0 +1 @@
build
6 changes: 3 additions & 3 deletions CODE_OF_CONDUCT.md
@@ -1,4 +1,4 @@
## Code of Conduct
This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct).
For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact
[email protected] with any additional questions or comments.
This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct).
For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact
[email protected] with any additional questions or comments.
73 changes: 66 additions & 7 deletions CONTRIBUTING.md
@@ -11,7 +11,7 @@ information to effectively respond to your bug report or contribution.

We welcome you to use the GitHub issue tracker to report bugs or suggest features.

When filing an issue, please check existing open, or recently closed issues to make sure somebody else hasn't already
When filing an issue, please check [existing open](https://github.com/aws-neuron/nki-samples/issues), or [recently closed](https://github.com/aws-neuron/nki-samples/issues?utf8=%E2%9C%93&q=is%3Aissue%20is%3Aclosed%20), issues to make sure somebody else hasn't already
reported the issue. Please try to include as much information as you can. Details like these are incredibly useful:

* A reproducible test case or series of steps
@@ -30,17 +30,74 @@ To send us a pull request, please:

1. Fork the repository.
2. Modify the source; please focus on the specific changes you are contributing. If you also reformat all the code, it will be hard for us to focus on your change.
3. Ensure local tests pass.
3. Please ensure your change satisfies the requirements listed in [Testing Requirements](#testing-requirements) and [Coding Guidelines](#coding-guidelines).
4. Commit to your fork using clear commit messages.
5. Send us a pull request, answering any default questions in the pull request interface.
6. Pay attention to any automated CI failures reported in the pull request, and stay involved in the conversation.
6. Wait for a repository collaborator to look at your pull request, run the automated tests, and review. If additional changes or discussion are needed, a collaborator will get back to you, so please stay involved in the conversation.
* Note: pull requests will be tested, staged, and released in a process internal to the Neuron team. Changes will be reflected in a subsequent release.

GitHub provides additional documentation on [forking a repository](https://help.github.com/articles/fork-a-repo/) and
[creating a pull request](https://help.github.com/articles/creating-a-pull-request/).

### Testing Requirements
Running the binaries for an NKI kernel requires Neuron devices on an AWS EC2 instance from the trn1, trn1n, or inf2 instance families.
Details on setting up an instance can be found [here](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/frameworks/torch/torch-setup.html).

## Finding contributions to work on
Looking at the existing issues is a great way to find something to contribute on. As our projects, by default, use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any 'help wanted' issues is a great place to start.
If you would like to test your kernel without requiring a Neuron device, you can use `nki.simulate()` to run your kernel using `NumPy` input/output tensors and types.
An example can be found in the [layernorm tutorial test](test/unit/test_tutorials_layernorm.py). However, kernels with _only_ simulation tests will not be accepted.

#### Requirements for Kernels Targeting `src/reference/`

All kernels located in this folder need to have the following tests.

1. Numeric accuracy tests with `nki.baremetal`. The output from the kernel
must be validated against a CPU reference implementation. See `test_flash_attn_fwd_numerical` in [test_flash_attn_fwd.py](test/unit/test_flash_attn_fwd.py) as an example. Documentation for `nki.baremetal` is available [here](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/nki/api/generated/nki.baremetal.html).

2. Performance benchmark tests with `nki.benchmark`. The unit test must have performance checks. At a minimum, add an assertion to verify that p99 latency meets a certain threshold. See `test_flash_attn_fwd_perf` in [test_flash_attn_fwd.py](test/unit/test_flash_attn_fwd.py) as an example. Documentation for `nki.benchmark` is available [here](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/nki/api/generated/nki.benchmark.html).

3. End-to-End integration tests that use your kernel in a model.

a. Each test should be in its own separate folder.

b. Each test must have a `run.sh` script that accepts an argument \<path to test_result.json\>. See [run.sh of FlashAttention](test/integration/flash_attention/run.sh) as an example.

c. The test scripts must produce benchmark results with the `benchmark` function, located in [LatencyCollector.py](test/integration/perf_utils/LatencyCollector.py). The `benchmark` function will write the latency of your E2E model to the `test_result.json`.

d. Register your test target in [run_integration.sh](test/integration/run_integration.sh).
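
The first two requirements can be sketched without any Neuron-specific API. The following is a minimal, hypothetical illustration in plain Python: `allclose` and `p99` are helper names invented here, the sample values are made up, and a real test would obtain its outputs and latencies via `nki.baremetal` and `nki.benchmark` as in the linked examples.

```python
import math


def allclose(actual, expected, rel_tol=1e-2, abs_tol=1e-5):
    """Return True if every element of `actual` is within tolerance of `expected`."""
    if len(actual) != len(expected):
        return False
    return all(
        math.isclose(a, e, rel_tol=rel_tol, abs_tol=abs_tol)
        for a, e in zip(actual, expected)
    )


def p99(latencies):
    """Nearest-rank 99th percentile of a non-empty list of latencies."""
    ordered = sorted(latencies)
    # Integer ceiling of 0.99 * n, giving a 1-based rank.
    rank = -(-99 * len(ordered) // 100)
    return ordered[rank - 1]


# Hypothetical stand-ins: in a real test these would be the kernel output
# from a Neuron device and the output of a CPU reference implementation.
kernel_output = [0.1001, 0.2002, 0.2999]
cpu_reference = [0.1, 0.2, 0.3]
assert allclose(kernel_output, cpu_reference)

# Hypothetical latency samples (microseconds) and an example threshold.
latencies_us = [100.0 + i for i in range(100)]
P99_THRESHOLD_US = 250.0
assert p99(latencies_us) <= P99_THRESHOLD_US
```

The tolerances and the threshold are placeholders; appropriate values depend on the kernel's data types and target hardware.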


### Coding Guidelines
Most guidelines are covered by a **PEP-8** check on all newly submitted code, which covers aspects such as code layout and basic Python naming conventions.
In addition to PEP-8, we use the following NKI specific style guidelines:

1. **Abbreviations**
* Importing NKI modules should use consistent names. For example,
```
import neuronxcc.nki as nki
import neuronxcc.nki.isa as nisa
import neuronxcc.nki.language as nl
import neuronxcc.nki.typing as nt
import numpy as np
```
2. Variable Names
* Indexing should specify partition and free dimensions along with the variable they are used for. For example:
The index for the partition dimension for tile `a` would be
```
i_p_a = nl.arange(128)[:, None]
```
while the index for the free dimension for tile `b` would be
```
i_f_b = nl.arange(512)[None, :]
```
* Name loop variables, indices, and buffers consistently, and specify their intended use in the name.
3. Documentation
* New kernels should contain inline docstrings that describe the semantics of the kernel and provide information on the IO layout.
Upon release, we generate the documentation for our kernels and merge it into the NKI API documentation, which appears in the official AWS NKI documentation.
## Finding Contributions to Work on
Looking at the existing issues is a great way to find something to contribute on. As our projects, by default, use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any ['help wanted'](https://github.com/aws-neuron/nki-samples/labels/help%20wanted) issues is a great place to start.
## Code of Conduct
@@ -49,10 +106,12 @@ For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of
[email protected] with any additional questions or comments.
## Security issue notifications
## Security Issue Notifications
If you discover a potential security issue in this project we ask that you notify AWS/Amazon Security via our [vulnerability reporting page](http://aws.amazon.com/security/vulnerability-reporting/). Please do **not** create a public github issue.
## Licensing
See the [LICENSE](LICENSE) file for our project's licensing. We will ask you to confirm the licensing of your contribution.
See the [LICENSE](https://github.com/aws-neuron/nki-samples/blob/main/LICENSE.txt) file for our project's licensing. We will ask you to confirm the licensing of your contribution.
We may ask you to sign a [Contributor License Agreement (CLA)](http://en.wikipedia.org/wiki/Contributor_License_Agreement) for larger changes.
3 changes: 1 addition & 2 deletions LICENSE
@@ -13,5 +13,4 @@ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
68 changes: 59 additions & 9 deletions README.md
@@ -1,17 +1,67 @@
## My Project
# Neuron Kernel Interface (NKI) Samples

TODO: Fill this README out!
[AWS Neuron](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/) is the software development kit (SDK) for AWS Trainium and Inferentia,
ML chips purpose-built for AI workloads.
At the core of the Neuron SDK is the Neuron Compiler, which takes computation graphs from frameworks like PyTorch and JAX and converts
them into highly optimized machine code.

Be sure to:
[NKI](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/nki) is a Python-based programming environment designed for the compiler which
adopts commonly used NumPy and Triton-like syntax along with tile-level semantics.
NKI also interoperates with the Neuron Profiler, providing insights into performance bottlenecks and instruction latencies.
It offers tensor printing support, standard error messaging, and built-in kernel simulation capabilities for efficient debugging purposes.
NKI offers two types of programming interfaces:
NKI Language ([nki.language](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/nki/api/nki.language.html)) and
NKI Instruction Set Architecture ([nki.isa](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/nki/api/nki.isa.html)),
enabling bare-metal access to the chip for full control.

* Change the title in this README
* Edit your repository description on GitHub
![alt "High-level flow of NKI in the Neuron Compiler. NKI emits IR immediately before the backend-IR compilation stage"](doc_assets/high-level-nki-flow.png#center "High-Level NKI Flow")

## Security
## Documentation
The latest NKI documentation can be found on the AWS Documentation site, [here](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/nki/).
Documentation for NKI kernels is available both inline (docstrings) and on the documentation site's
[kernel API reference page](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/nki/api/nki.kernels.html).

See [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) for more information.
## Repository Structure

## License
### src

This library is licensed under the MIT-0 License. See the LICENSE file.
#### reference
This folder contains the source code of `neuronxcc.nki.kernels`: optimized kernels from the Neuron Team that serve as samples.

All kernels located in this folder have numeric accuracy tests
and performance benchmarks defined in the [test](test/) directory. We also demonstrate using these kernels end-to-end in our [integration tests](test/integration/).

Note that these kernels are already being deployed as part of the Neuron stack. With flash attention as an example,
[compiling Llama models with transformers-neuronx](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/libraries/transformers-neuronx/transformers-neuronx-developer-guide.html)
will automatically invoke the `flash_fwd` kernel in [attention.py](src/reference/attention.py). Therefore, replacing the framework operators with these NKI kernels likely won't result in extra performance benefit.


#### tutorials
The [tutorial kernels](src/tutorials/) are for educational purposes and include the kernels that are used in NKI guides.
You can clone these sample kernels and run them directly while reading through the
[NKI documentation](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/nki/tutorials.html). These kernels are not necessarily high-performance,
but contain detailed inline comments and have accompanying documentation.

### test

#### unit
The [unit tests](test/unit) directory contains unit tests and micro-benchmarks for standalone kernels. They run across multiple possible configurations,
verify the numeric accuracy of the operation, and publish performance results to the [micro-benchmark](docs/benchmarks/micro-benchmark/) results.
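
One way to picture a sweep over configurations with a numeric check is sketched below. It is plain Python under stated assumptions: `candidate_softmax` is only a stand-in for a kernel under test, `reference_softmax` plays the role of a CPU reference, and the lengths, scales, and tolerances are illustrative values, not the ones used in this repository.

```python
import itertools
import math


def reference_softmax(row):
    """Straightforward softmax used as the reference result."""
    exps = [math.exp(x) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def candidate_softmax(row):
    """Hypothetical stand-in for a kernel: max-subtracted softmax."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def run_sweep(lengths, scales, rel_tol=1e-6, abs_tol=1e-12):
    """Check the candidate against the reference for every (length, scale) pair."""
    results = {}
    for n, scale in itertools.product(lengths, scales):
        # Deterministic input spanning roughly [-scale/2, scale/2).
        row = [scale * (i - n / 2) / n for i in range(n)]
        got = candidate_softmax(row)
        want = reference_softmax(row)
        results[(n, scale)] = all(
            math.isclose(g, w, rel_tol=rel_tol, abs_tol=abs_tol)
            for g, w in zip(got, want)
        )
    return results


results = run_sweep(lengths=[4, 16, 64], scales=[1.0, 8.0])
assert all(results.values())
```

Keeping the per-configuration results in a dictionary makes it easy to report which configuration failed rather than only that the sweep failed.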

#### integration
The [integration tests](test/integration) folder contains integration tests of (selected) kernels. They verify the numeric accuracy of the model’s output,
and publish end-to-end performance results into the [integration benchmarks](docs/benchmarks/integration) folder.

## Maintenance Policy
NKI is currently released as **beta** while we gather feedback from our users and integrate it into the API. The NKI API follows the [Neuron SDK Maintenance Policy](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/sdk-policy.html).

## Getting Help
Have a look at the GitHub issues for this repository where you will find past issues customers have encountered with workarounds and clarifications.
If you cannot find a suitable issue for your use-case, feel free to [file an issue](https://github.com/aws-neuron/nki-samples/issues/new) to ask for assistance or to suggest improvements. Please read [CONTRIBUTING.md](CONTRIBUTING.md) for detailed information on submitting issues.

## Contributing
We invite you to join the NKI community! If you'd like to share kernels you create with the community, we welcome your contributions to this repository via
GitHub pull-requests as well as through filed issues discussing features, bug fixes, new use-cases, and API improvements. Please see [CONTRIBUTING.md](CONTRIBUTING.md) for more information.

## Licensing
This repository is licensed under the terms of the [MIT-0 License](LICENSE.txt).
Binary file added doc_assets/high-level-nki-flow.png
Binary file added doc_assets/pm-nc.png
12 changes: 12 additions & 0 deletions src/reference/__init__.py
@@ -0,0 +1,12 @@
# Copyright (c) 2023, Amazon.com. All Rights Reserved

"""
Package containing public kernels for Neuron Kernel Interface (NKI).
Kernels here are the same as the ones available in the
NKI GitHub Sample Repo.
TODO: Insert link to GitHub Repo when available
"""
from neuronxcc.nki.kernels.attention import fused_self_attn_for_SD_small_head_size, flash_attn_bwd, flash_fwd
from neuronxcc.nki.kernels.vision import resize_nearest_fixed_dma_kernel, select_and_scatter_kernel