Skip to content

Conversation

@adstraw
Copy link
Contributor

@adstraw adstraw commented Feb 23, 2022

Adds support to lower cache_read and cache_write schedule primitives to Hexagon User DMA using tensorize and new mem_copy intrinsic.

@adstraw
Copy link
Contributor Author

adstraw commented Feb 23, 2022

HUGE thanks to @masahi who also contributed to this PR.

@masahi masahi self-assigned this Feb 24, 2022
Copy link
Contributor

@tmoreau89 tmoreau89 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks @adstraw and @masahi for getting the pattern matching from cache_read/write to DMA working!

* Same semantics as memcpy(destination, source, size)
* Allows for device specific implementations e.g. direct memory access (DMA)
*/
TVM_DLL const Op& mem_copy();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @tqchen @junrushao1994 if they want a better name here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to make it more specific? We have a few examples above like texture2d_load/store. What about dma_mem_copy() or any hexagon-specific name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about something with "DMA" in the name but I realized that DMA is more of an implementation detail. I ended up referencing DMA in the comments instead. Happy to take suggestions including dma_mem_copy.

Copy link
Member

@masahi masahi Feb 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about:

  • mem_copy_local (to emphasize the local nature of the intended usage, unlike TVMArrayCopyFromTo @junrushao1994 mentioned which might involve RPC)
  • mem_copy_1d (if we intend to add 2d rectangular memcpy later)

But both of them don't particularly sound better.

@masahi
Copy link
Member

masahi commented Feb 24, 2022

cc @kparzysz-quic

@masahi masahi merged commit dcbdedd into apache:main Feb 25, 2022
@adstraw adstraw deleted the hexagon-tir-to-dma-lower branch February 25, 2022 19:16
pfk-beta pushed a commit to pfk-beta/tvm that referenced this pull request Apr 11, 2022
…#10365)

* Lower cache_read and cache_write to Hexagon DMA via tensorize

* rework test to be compatible with launcher

* remove cpu device api mem_copy implementation and test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants