Releases: ARM-software/ethos-n-driver-stack

24.07

09 Aug 19:37

New features

  • Support for Arm NN 24.05

Public API changes

  • Add the ability to get the total vertical / horizontal padding from a padding structure.
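
As a rough illustration of what such an accessor computes, the sketch below models a padding structure in Python. The real API belongs to the C++ Support Library and is not reproduced here; the `Padding` class, its field names, and both helper methods are hypothetical and only show the arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Padding:
    """Hypothetical stand-in for a padding structure (not the real API)."""
    top: int = 0
    bottom: int = 0
    left: int = 0
    right: int = 0

    def vertical_total(self) -> int:
        # Rows added above the tensor plus rows added below it.
        return self.top + self.bottom

    def horizontal_total(self) -> int:
        # Columns added on the left plus columns added on the right.
        return self.left + self.right


pad = Padding(top=1, bottom=2, left=3, right=3)
print(pad.vertical_total(), pad.horizontal_total())
```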

Other changes

  • Fix a cyclic dependency issue when using standalone padding for certain dimensions.
  • Fix a precision issue when using standalone padding for certain dimensions.
  • Move debug formatting data usage in estimation utilities to improve compilation time.

24.05

10 May 15:28

New features

  • Support for Arm NN 24.02

Public API changes

  • None

Other changes

  • Fix a potential crash when the device power is cut prematurely by the kernel module.
  • Disable dynamic read allocate mode in the NCU MCU.
  • Fix a crash where the sleeping function is called from an invalid context in the kernel module.
  • Fix int8 bilinear resizing by introducing a support library workaround.
  • Fix multiplication tensor addressing for 2d and 1d tensors in the Ethos-N Arm NN backend.
  • Fix a memory leak in the Ethos-N Arm NN backend.

Known issues

  • Standalone padding and convolution layers of certain dimensions with padding might trigger a cyclic dependency during graph compilation.

23.11

19 Jan 15:09

New features

  • Add the ability to use llvm-embedded toolchain:
    • Please specify the following SCons parameters or place them in your options.py:
      • use_llvm_embedded=1
      • llvm_embedded_toolchain_path=<path_to_llvm_binaries>
  • Add support for elementwise multiplication.
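
A minimal sketch of how the two toolchain parameters might look in options.py, assuming the file uses ordinary Python assignments as SCons variable files generally do. The toolchain path is a made-up placeholder, not a location given in these notes.

```python
# options.py (sketch): SCons reads this file as Python.
use_llvm_embedded = 1
# Hypothetical install location; point this at your own LLVM binaries.
llvm_embedded_toolchain_path = "/opt/llvm-embedded-toolchain/bin"
```

The same two settings can instead be passed as `name=value` arguments on the SCons command line, as the bullet above describes.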

Public API changes

  • Remove the 'cascading' prefix from the driver stack components.
  • Make firmware, model interface, and PLE source code public.
  • Command stream version increased to 7.
  • Driver Library version increased to 8.
  • Support Library version increased to 5.

Other changes

  • Remove the '-block-inferences-debug' option from system tests.
  • Improve the ordering of commands in the control unit controller.
  • Improve performance under higher system latencies.
  • Fix bugs related to:
    • Handling relative paths when installing the library.
    • Caching networks with multiple subgraphs.
  • Some IRQ flags are now fetched from the device tree instead of being hardcoded.
  • Make sure the weight stream payload size fits in the Ethos-N hardware struct.
  • Add some useful developer tools to the repository.

Known issues

  • Standalone padding and convolution layers of certain dimensions with padding might trigger a cyclic dependency during graph compilation.

23.08

14 Sep 16:01

New features

  • Off-device compilation is now supported through Arm NN.
    • Set OFFLINE = 1, PERFORMANCE_VARIANT, and PERFORMANCE_SRAM_SIZE_BYTES_OVERRIDE in the Ethos-N Arm NN config file when compiling. The backend will then only compile the network. Use this together with the caching feature to generate a cached compiled network.
    • Copy this cached compiled network to the target device and use it as normal.
  • Improved multithreaded performance in the support library.
    • Configurable with the ETHOSN_SUPPORT_LIBRARY_NUM_THREADS environment variable. If unset the number of threads is automatically chosen.
    • A different allocator, e.g. mimalloc, can be used for even better performance.
  • Runtime performance improvements.
    • Cascading can now be performed over branches.
    • Preloading weights from later layers.
    • Improved allocation to minimise the overlap of buffers in SRAM.
    • Sigmoid PLE kernel sped up.
  • Support padding of up to 7 in convolution-type operators.
  • Support Max pooling Stride 1x1.
  • Reduce compilation memory requirements in the weight encoder.
  • Reduce cached compiled network memory requirements.
  • Better reporting of hardware error messages in the kernel log.
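
To illustrate the thread-count setting listed above, the sketch below builds an environment for a child process with ETHOSN_SUPPORT_LIBRARY_NUM_THREADS set. Only the variable name comes from these notes; the helper `env_with_threads` and the value 4 are assumptions made for the example.

```python
import os


def env_with_threads(num_threads, base_env=None):
    """Return a copy of the environment with the thread count applied.

    Passing None removes the variable, so the support library picks the
    number of threads automatically.
    """
    env = dict(os.environ if base_env is None else base_env)
    if num_threads is None:
        env.pop("ETHOSN_SUPPORT_LIBRARY_NUM_THREADS", None)
    else:
        env["ETHOSN_SUPPORT_LIBRARY_NUM_THREADS"] = str(num_threads)
    return env


env = env_with_threads(4)
# Hand `env` to subprocess.run(..., env=env) when launching the program
# that links against the support library.
```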

Public API changes

  • Command stream version bumped to 6.
  • Driver Library version bumped to 7.
  • Firmware version bumped to 15.

Other changes

  • Fixed a crash in the kernel module when mapping smaller regions of virtual memory.
  • Fixed other bugs

Known issues

23.05

23 May 16:51

Changelog for Arm® Ethos™-N Driver Stack

New features

  • Compiler flag to disable winograd for 7x7 kernels and larger
    • Set the following in the scons command line: disable_large_winograd=true
  • The cascading compiler is now the default and only compiler

Public API changes

  • Command stream major version 4 -> 5
  • Support library major version 3 -> 4
  • Driver library major version 5 -> 6

Other changes

  • No longer using deprecated Arm NN functions in the Arm NN backend
  • Network compilation time performance improvements in cascading
  • Inference performance improvements in cascading
  • Compiled network caching with multiple subgraphs in the Arm NN backend has been fixed

Known issues

  • Temporary performance regression on some networks with heavy branching. Performance improvements are currently in progress

23.02

09 Mar 18:01

New features

  • TZMP1 Support
  • Per-Process Memory Isolation (SMMU only)

Public API changes

  • Kernel supports creating process memory allocator in protected context
  • Kernel UAPI changed to use __kernel_size_t to ensure consistent type size
  • ProcMemAllocator std::string constructor changed to const char *
  • Version number updates:
    • Driver library version 4.0.0 → 5.0.0
    • Kernel module version 5.0.0 → 6.0.0
    • Firmware version 6.0.0 → 11.0.0

Other changes

  • Updated list of supported models to a higher performance MobileNet variant
  • Suggested development platform changed from Ubuntu 18.04 to Ubuntu 20.04
  • Support for Arm NN backend option ProtectedContentAllocation
  • Kernel module
    • Fix a crash in the kernel module caused by the shared interrupt handler getting triggered during NPU reset
    • Buffers are now zeroed out before being freed to not leave any data in the memory
    • Kernel module only supports a single binary in the firmware
    • Kernel module will only accept an NPU whose security level matches the one it was built for
    • Kernel module now sets mailbox size to be the nearest power of 2
    • Improved error handling in network creation
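
The notes do not say whether "nearest power of 2" for the mailbox size rounds up or down. The sketch below assumes rounding up, a common choice for buffer sizes because it never shrinks the requested size. The function name is hypothetical and this is not the kernel module's code.

```python
def round_up_to_power_of_two(size: int) -> int:
    """Smallest power of two that is >= size (assumed rounding mode)."""
    if size < 1:
        raise ValueError("size must be positive")
    # (size - 1).bit_length() is the number of bits needed to represent
    # size - 1, so shifting 1 by that amount yields the next power of two.
    return 1 << (size - 1).bit_length()


for requested in (1, 3, 4096, 5000):
    print(requested, "->", round_up_to_power_of_two(requested))
```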

Known issues

  • Dual core with carveout is not supported

Notes

22.11 Release

28 Nov 17:00

New features

  • None

Public API changes

  • Driver Library supports importing an intermediate buffer
  • Kernel supports importing intermediate buffers
    • Inferences based on networks with imported intermediate buffers cannot run simultaneously on multiple cores, so they will be queued until the previous inference is completed
  • Driver library and kernel have new process memory allocator APIs to create buffers and register networks. Support for old APIs is removed. The new APIs are not backwards compatible
  • Version number updates:
    • Driver Library version 3.0.0 → 4.0.0
    • Kernel Module version 4.0.0 → 5.0.0
    • Firmware version 5.0.0 → 6.0.0

Other changes

  • Public architecture header files make use of assert instead of truncation in set_XXX functions
  • Improvements to SmallVector type
  • Kernel module
    • Fix a crash in the kernel module when the firmware binary changes
    • Fix kernel module not picking up a new firmware binary after failing to load a previous one
    • Kernel module will now only load firmware binaries that contain an identifying magic number
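
The difference between truncating and asserting in a field setter, as mentioned for the set_XXX functions above, can be sketched as follows. This is a Python model with hypothetical names (`set_field_truncating`, `set_field_checked`); the actual architecture headers are C/C++ and are not reproduced here.

```python
def set_field_truncating(word: int, value: int, offset: int, width: int) -> int:
    """Truncating behaviour: silently drop the bits that do not fit."""
    mask = (1 << width) - 1
    return (word & ~(mask << offset)) | ((value & mask) << offset)


def set_field_checked(word: int, value: int, offset: int, width: int) -> int:
    """Asserting behaviour: fail loudly if the value does not fit."""
    mask = (1 << width) - 1
    assert 0 <= value <= mask, f"value {value} does not fit in {width} bits"
    return (word & ~(mask << offset)) | (value << offset)
```

With a 2-bit field, the truncating setter quietly stores 1 when asked to store 5, whereas the checked setter raises an AssertionError, which surfaces the caller's mistake at the point where it happens.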

Known issues

  • None

22.08.1 Release

13 Oct 08:20

New features

  • Estimation mode for split now supports multiple outputs
  • Support has been added to use separate SMMU streams for different memory assets e.g. firmware, input/output, command stream
    • Device tree layout has been changed to support having multiple SMMU streams
      • Multiple asset allocators may be defined in the device tree, however only the first one is currently used
    • The Ethos-N NPU kernel module and Trusted Firmware-A driver have been updated to support the new device tree and the use of separate SMMU streams

Public API changes

  • The Support Library's CompiledNetwork class now has a function to get how much intermediate buffer memory a network requires
  • Version number updates:
    • Support Library version 3.1.0 → 3.2.0
    • Kernel Module version 3.0.0 → 4.0.0

Other changes

  • Fixed the Support Library's compiler allowing an output buffer to be used as an intermediate buffer
  • Improvements to SmallVector constructor and operator support

Known issues

  • Refer to 22.08 changelog for more details

22.08 Release

25 Aug 16:13

New features

  • Split operation now supported (see SUPPORTED.md for more information)

Public API changes

  • Version number updates:
    • Driver Library version 2.0.0 → 3.0.0
    • Command Stream version 3.0.0 → 3.1.0
    • Support Library version 3.0.1 → 3.1.0
    • Kernel Module version 2.0.0 → 3.0.0
    • Firmware version 4.0.0 → 5.0.0

Other changes

  • Bias quantization fixes
  • Input quantization documentation fixes
  • Fixed issues where incorrect data formats were used during cascading for the following:
    • Fully connected
    • Branching
    • Concatenation
  • PLE kernels are now mapped as read-only when an SMMU is available
  • Fixed power surge issue when clearing SRAM at the beginning of each inference

Known issues

  • Some Resize Bilinear configurations (align_corners=True, half_pixel_centres=True when heights and widths are not both even or both odd) produce inaccurate results
  • Warnings from Arm NN for using a deprecated 'ConstTensorsAsInputs' API call

22.05 Release

26 May 17:39

New features

  • Zero copy support:
    • Support added for using the Import API with Arm NN using dma_buf.

Public API changes

  • Added new API for importing a dma_buf file descriptor.

Other changes

  • Extended the range of OFM multiplier.
  • Added support for deallocation of SRAM buffers in the Cascading Support Library.
  • Limited the maximum size of a Section generated by the Cascading Support Library to the corresponding size supported by the Firmware.
  • Reduced inference latency.
  • Reduced estimation time for networks with Parts without weights.

Known issues

  • None