
Decouple free ab #678

Merged
bensander merged 12 commits into ROCm:develop from bensander:decouple_free_ab
Sep 30, 2019

@bensander
Contributor

No description provided.

- allow strides < minStrides to be specified
- compute m_totalAllocatedElements to gracefully handle alternate strides
- pass setConstStride overloads to tensile-client in non-GEMM mode too
- set default ld* to -1, not 0
- add UseDefaultStride to class
- dimensionPadding returns int64_t since it can be < 0
- whitespace
- use proper totalAllocatedElements formula for strides
- add a few tests for zero strides, simple padding, and strides < size
- allow A and B to have different numbers of free indices
- modify Python and C++ data structures to track FreeIndex with an 'isA' field (rather than a combo that required the free index to be in both A and B)
- replace freeSizes() with freeSizesA() and freeSizesB()
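
For readers unfamiliar with strided tensors, the stride items in the list above can be illustrated with a minimal sketch. This is not Tensile's code: the function names `totalAllocatedElementsSketch` and `dimensionPaddingSketch` are hypothetical, and the formulas are the usual ones for non-negative strides (highest reachable offset plus one, and stride minus the packed stride), assumed here rather than taken from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not Tensile's API): number of elements that must be
// allocated so that every index of a strided tensor is addressable.
// Assumes non-negative strides: highest offset is sum((size-1)*stride).
std::size_t totalAllocatedElementsSketch(const std::vector<std::size_t>& sizes,
                                         const std::vector<std::size_t>& strides)
{
    assert(sizes.size() == strides.size());
    std::size_t maxOffset = 0;
    for (std::size_t i = 0; i < sizes.size(); ++i)
    {
        if (sizes[i] == 0)
            return 0; // an empty dimension means an empty tensor
        maxOffset += (sizes[i] - 1) * strides[i];
    }
    return maxOffset + 1;
}

// Hypothetical helper: padding of a dimension relative to a packed layout.
// Signed, because a stride smaller than the packed stride gives a negative value.
std::int64_t dimensionPaddingSketch(const std::vector<std::size_t>& sizes,
                                    const std::vector<std::size_t>& strides,
                                    std::size_t dim)
{
    assert(sizes.size() == strides.size() && dim < sizes.size());
    const std::int64_t packed
        = (dim == 0) ? 1
                     : static_cast<std::int64_t>(strides[dim - 1] * sizes[dim - 1]);
    return static_cast<std::int64_t>(strides[dim]) - packed;
}
```

Under these assumptions a 4x3 tensor with strides {1, 4} needs 12 elements, the same tensor with strides {0, 0} needs only 1, and strides {1, 2} (stride < size) give a padding of -2 in dimension 1, which is why a signed return type is needed.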
@bensander bensander requested a review from sdquiring September 30, 2019 01:07
m_freeSizeB[i] = std::max({m_b.sizes()[m_freeIndices[i].b],
m_c.empty() ? 0 : m_c.sizes()[m_freeIndices[i].cb],
m_d.sizes()[m_freeIndices[i].db]});
size_t maxSize=0; // TODO - aren't these all the same?
Contributor

What do you mean by this?

Contributor Author

I believe each dimension has the same size across the A/B/C/D tensors, so it is not clear why we need the max. Can we just read the size from D and, if desired, double-check that the size in A or B is the same?
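
The alternative suggested in this reply could look roughly like the sketch below: read the free size from D and verify that the input tensor agrees, instead of taking a max. The struct `FreeIndexSketch` and the function `freeSizeSketch` are hypothetical names for illustration only, and the field layout is an assumption, not the actual Tensile `FreeIndex` definition.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical record (assumed layout): where one free index lives.
struct FreeIndexSketch
{
    bool        isA; // true: the index belongs to A; false: it belongs to B
    std::size_t i;   // dimension within A or B
    std::size_t d;   // dimension within D
};

// Read the size from D and check that the input tensor agrees,
// rather than computing the max over all tensors.
std::size_t freeSizeSketch(const FreeIndexSketch&          fi,
                           const std::vector<std::size_t>& aSizes,
                           const std::vector<std::size_t>& bSizes,
                           const std::vector<std::size_t>& dSizes)
{
    const std::size_t fromD     = dSizes.at(fi.d);
    const std::size_t fromInput = fi.isA ? aSizes.at(fi.i) : bSizes.at(fi.i);
    if (fromInput != fromD)
        throw std::invalid_argument("free index size differs between input and D");
    return fromD;
}
```

Throwing on a mismatch surfaces an inconsistent problem description early, whereas a max would silently pick the larger value.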

@bensander bensander merged commit 9469230 into ROCm:develop Sep 30, 2019
saadrahim pushed a commit to saadrahim/Tensile that referenced this pull request Oct 3, 2019
* Use a similar method as in rocBLAS to determine whether to use C++ and HIP features in tensile_bfloat16.h

* Make bfloat16 changes similar to rocBLAS PR ROCm#678