Releases: ARM-software/ethos-n-driver-stack
24.07
New features
- Support for Arm NN 24.05
Public API changes
- Add the ability to get the total vertical / horizontal padding from a padding structure.
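As a rough illustration of what a total padding accessor computes, the sketch below models a padding structure in Python. The class and property names are hypothetical and are not the Support Library's actual C++ API; the only assumption is a structure holding top, bottom, left and right padding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Padding:
    """Hypothetical stand-in for a padding structure (not the real API)."""
    top: int = 0
    bottom: int = 0
    left: int = 0
    right: int = 0

    @property
    def vertical(self) -> int:
        # Total padding added along the height axis.
        return self.top + self.bottom

    @property
    def horizontal(self) -> int:
        # Total padding added along the width axis.
        return self.left + self.right


pad = Padding(top=1, bottom=2, left=3, right=3)
print(pad.vertical, pad.horizontal)
```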
Other changes
- Fix a cyclic dependency issue when using standalone padding for certain dimensions.
- Fix a precision issue when using standalone padding for certain dimensions.
- Move debug formatting data usage in estimation utilities to improve compilation time.
24.05
New features
- Support for Arm NN 24.02
Public API changes
- None
Other changes
- Fix a potential crash when the device power is cut prematurely by the kernel module.
- Disable dynamic read allocate mode in the NCU MCU.
- Fix a crash where the sleeping function is called from an invalid context in the kernel module.
- Fix int8 bilinear resizing by introducing a support library workaround.
- Fix multiplication tensor addressing for 2d and 1d tensors in the Ethos-N Arm NN backend.
- Fix a memory leak in the Ethos-N Arm NN backend.
Known issues
- Standalone padding and convolution layers of certain dimensions with padding might trigger a cyclic dependency during graph compilation.
23.11
New features
- Add the ability to use the LLVM embedded toolchain:
- Please specify the following SCons parameters or place them in your options.py:
- use_llvm_embedded=1
- llvm_embedded_toolchain_path=<path_to_llvm_binaries>
- Add support for elementwise multiplication.
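A minimal sketch of an options.py carrying the LLVM embedded toolchain parameters listed above. It assumes options.py takes plain Python-style assignments, as SCons variable files generally do. The path is the placeholder from the release note and must be replaced with the real toolchain location.

```python
# options.py (sketch): SCons parameters for building with the LLVM embedded toolchain.
use_llvm_embedded = 1
# Placeholder from the release note; replace with the directory holding the LLVM binaries.
llvm_embedded_toolchain_path = "<path_to_llvm_binaries>"
```

The same two values can instead be given as key=value arguments on the SCons command line.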
Public API changes
- Remove the 'cascading' prefix from the driver stack components.
- Make firmware, model interface, and PLE source code public.
- Command stream version increased to 7.
- Driver Library version increased to 8.
- Support Library version increased to 5.
Other changes
- Remove the '-block-inferences-debug' option from system tests.
- Improve the ordering of commands in the control unit controller.
- Improve performance under higher system latencies.
- Fix bugs related to:
- Handling relative paths when installing the library.
- Caching networks with multiple subgraphs.
- Some IRQ flags are now fetched from the device tree instead of being hardcoded.
- Ensure the weight stream payload size fits in the Ethos-N hardware struct.
- Add some useful developer tools to the repository.
Known issues
- Standalone padding and convolution layers of certain dimensions with padding might trigger a cyclic dependency during graph compilation.
23.08
New features
- Off device compilation is now supported through Arm NN.
- Set OFFLINE = 1, together with PERFORMANCE_VARIANT and PERFORMANCE_SRAM_SIZE_BYTES_OVERRIDE, in the Ethos-N Arm NN config file when compiling. The backend will then only compile the network. Use this in conjunction with the caching feature to generate a cached compiled network.
- Copy this cached compiled network to the target device and use it as normal.
- Improved multithreaded performance in the support library.
- Configurable with the ETHOSN_SUPPORT_LIBRARY_NUM_THREADS environment variable. If unset, the number of threads is chosen automatically.
- A different allocator, e.g. mimalloc, can be used for even better performance.
- Runtime performance improvements.
- Cascading can now be performed over branches.
- Preloading weights from later layers.
- Improved allocation to minimise the overlap of buffers in SRAM.
- Sigmoid PLE kernel sped up.
- Support padding of up to 7 in convolution-type operators.
- Support max pooling with stride 1x1.
- Reduce compilation memory requirements in the weight encoder.
- Reduce cached compiled network memory requirements.
- Better reporting of hardware error messages in the kernel log.
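The off-device compilation settings and the thread-count variable mentioned above can be sketched as two small Python helpers. The key names and the environment variable come from the release notes; the helper names and the placeholder values are hypothetical, and the exact location and full syntax of the Ethos-N Arm NN config file are not covered here.

```python
import os


def offline_config_lines(variant: str, sram_size_bytes: str) -> list:
    """Return the config entries named in the release note as KEY = VALUE lines."""
    return [
        "OFFLINE = 1",
        f"PERFORMANCE_VARIANT = {variant}",
        f"PERFORMANCE_SRAM_SIZE_BYTES_OVERRIDE = {sram_size_bytes}",
    ]


def env_with_thread_count(num_threads: int, base_env=None) -> dict:
    """Copy an environment and pin the support library thread count."""
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    env = dict(os.environ if base_env is None else base_env)
    env["ETHOSN_SUPPORT_LIBRARY_NUM_THREADS"] = str(num_threads)
    return env


# Placeholder values: substitute the variant and SRAM size of the target device.
lines = offline_config_lines("<target_variant>", "<sram_size_bytes>")
print("\n".join(lines))

# An empty base environment keeps the example deterministic.
env = env_with_thread_count(4, base_env={})
print(env)
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when launching the process that performs the compilation.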
Public API changes
- Command stream version bumped to 6.
- Driver Library version bumped to 7.
- Firmware version bumped to 15.
Other changes
- Fixed a crash in the kernel module when mapping smaller regions of virtual memory.
- Fixed other bugs
Known issues
23.05
Changelog for Arm® Ethos™-N Driver Stack
New features
- Compiler flag to disable winograd for 7x7 kernels and larger
- Set the following in the scons command line: disable_large_winograd=true
- The cascading compiler is now the default and only compiler
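A small sketch of how the flag above might be passed when a build is driven from a script. The helper name is hypothetical. Only the `disable_large_winograd=true` parameter comes from the release note, and any other build parameters a real build needs are omitted.

```python
def scons_command(**params: str) -> list:
    """Build a SCons argument list from key=value build parameters."""
    return ["scons"] + [f"{key}={value}" for key, value in params.items()]


cmd = scons_command(disable_large_winograd="true")
print(" ".join(cmd))
```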
Public API changes
- Command stream major version 4 -> 5
- Support library major version 3 -> 4
- Driver library major version 5 -> 6
Other changes
- No longer using deprecated Arm NN functions in the Arm NN backend
- Network compilation time performance improvements in cascading
- Inference performance improvements in cascading
- Compiled network caching with multiple subgraphs in the Arm NN backend has been fixed
Known issues
- Temporary performance regression on some networks with heavy branching. Performance improvements currently in progress
23.02
New features
- TZMP1 Support
- Per-Process Memory Isolation (SMMU only)
Public API changes
- Kernel supports creating process memory allocator in protected context
- Kernel UAPI changed to use __kernel_size_t to ensure consistent type size
- ProcMemAllocator std::string constructor changed to const char *
- Version number updates:
- Driver library version 4.0.0 → 5.0.0
- Kernel module version 5.0.0 → 6.0.0
- Firmware version 6.0.0 → 11.0.0
Other changes
- Updated list of supported models to a higher performance MobileNet variant
- Suggested development platform changed from Ubuntu 18.04 to Ubuntu 20.04
- Support for Arm NN backend option ProtectedContentAllocation
- Kernel module
- Fix a crash in the kernel module caused by the shared interrupt handler getting triggered during NPU reset
- Buffers are now zeroed out before being freed so that no data is left in memory
- Kernel module only supports a single binary in the firmware
- Kernel module will only accept an NPU whose security level matches the one it was built for
- Kernel module now sets mailbox size to be the nearest power of 2
- Improved error handling in network creation
Known issues
- Dual core with carveout is not supported
Notes
- A workaround for erratum 2838783 is available in Trusted Firmware-A: https://review.trustedfirmware.org/plugins/gitiles/TF-A/trusted-firmware-a/+/00af8f4a7dd75cbbbb597996439233614badd04e
22.11 Release
New features
- None
Public API changes
- Driver Library supports importing an intermediate buffer
- Kernel supports importing intermediate buffers
- Inferences based on networks with imported intermediate buffers cannot run simultaneously on multiple cores, so they will be queued until the previous inference is completed
- Driver library and kernel have new process memory allocator APIs to create buffers and register networks. Support for the old APIs is removed. The new APIs are not backwards compatible
- Version number updates:
- Driver Library version 3.0.0 → 4.0.0
- Kernel Module version 4.0.0 → 5.0.0
- Firmware version 5.0.0 → 6.0.0
Other changes
- Public architecture header files make use of assert instead of truncation in set_XXX functions
- Improvements to SmallVector type
- Kernel module
- Fix a crash in the kernel module when the firmware binary changes
- Fix kernel module not picking up a new firmware binary after failing to load a previous one
- Kernel module will now only load firmware binaries that contain an identifying magic number
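The difference between truncating and asserting setters, as mentioned for the set_XXX functions above, can be sketched as follows. This is an illustrative Python model of a register bitfield, not the architecture headers themselves; the field width and function names are hypothetical.

```python
FIELD_BITS = 4  # hypothetical width of a register field
FIELD_MASK = (1 << FIELD_BITS) - 1


def set_field_truncating(value: int) -> int:
    # Truncating behaviour: bits that do not fit are silently dropped.
    return value & FIELD_MASK


def set_field_asserting(value: int) -> int:
    # Asserting behaviour: an out-of-range value is reported instead of truncated.
    assert 0 <= value <= FIELD_MASK, f"value {value} does not fit in {FIELD_BITS} bits"
    return value


print(set_field_truncating(0x13))  # 0x13 & 0xF keeps only the low four bits
print(set_field_asserting(0x3))
```

With the asserting variant, a caller that passes an oversized value fails loudly at the call site rather than programming a different value than intended.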
Known issues
- None
22.08.1 Release
New features
- Estimation mode for split now supports multiple outputs
- Support has been added to use separate SMMU streams for different memory assets e.g. firmware, input/output, command stream
- Device tree layout has been changed to support having multiple SMMU streams
- Multiple asset allocators may be defined in the device tree, however only the first one is currently used
- The Ethos-N NPU kernel module and Trusted Firmware-A driver have been updated to support the new device tree and the use of separate SMMU streams
Public API changes
- The Support Library's CompiledNetwork class now has a function to get how much intermediate buffer memory a network requires
- Version number updates:
- Support Library version 3.1.0 → 3.2.0
- Kernel Module version 3.0.0 → 4.0.0
Other changes
- Fixed the Support Library's compiler allowing an output buffer to be used as an intermediate buffer
- Improvements to SmallVector constructor and operator support
Known issues
- Refer to 22.08 changelog for more details
22.08 Release
New features
- Split operation now supported (see SUPPORTED.md for more information)
Public API changes
- Version number updates:
- Driver Library version 2.0.0 → 3.0.0
- Command Stream version 3.0.0 → 3.1.0
- Support Library version 3.0.1 → 3.1.0
- Kernel Module version 2.0.0 → 3.0.0
- Firmware version 4.0.0 → 5.0.0
Other changes
- Bias quantization fixes
- Input quantization documentation fixes
- Fixed incorrect data formats being used during cascading for the following:
- Fully connected
- Branching
- Concatenation
- PLE kernels are now mapped as read-only when an SMMU is available
- Fixed power surge issue when clearing SRAM at the beginning of each inference
Known issues
- Some Resize Bilinear configurations (align_corners=True, half_pixel_centres=True when heights and widths are not both even or both odd) produce inaccurate results
- Warnings from Arm NN for using a deprecated 'ConstTensorsAsInputs' API call
22.05 Release
New features
- Zero copy support:
- Support added for using the Import API with Arm NN using dma_buf.
Public API changes
- Added new API for importing a dma_buf file descriptor.
Other changes
- Extended the range of OFM multiplier.
- Added support for deallocation of SRAM buffers in the Cascading Support Library.
- Limited the maximum size of a Section generated by the Cascading Support Library to the corresponding size supported by the Firmware.
- Reduced inference latency.
- Reduced estimation time for networks with Parts without weights.
Known issues
- None