So my question is: why do we change the thread layout (i.e. how tid maps to indices) based on the vectorization dimension?

I know (I think) how to answer that one. It's based on coalescing. If your physical layout is, say, MxK, then K is the contiguous dimension and you want thread 0 on element (0,0) and thread 1 on element (0,1). If you look at https://github.com/ROCmSoftwarePlatform/rocMLIR/pull/996/files, we were originally always distributing the thread ids along the K dimension. But if the matrix was KxM (or KxN), this would create non-coalesced accesses, i.e., the threads would access the matrix in a strided fashion. So I think that doing:

splitId.merge({"k_thread", dThreadName}, {4, 5}, "tid…
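To make the coalescing argument above concrete, here is a minimal standalone sketch. It is not rocMLIR code: `Coord`, `tidKFastest`, `tidDFastest`, `offsetDxK` and `offsetKxD` are hypothetical names made up for illustration, and it assumes plain row-major buffers where the last dimension is the contiguous one. It only shows how the order in which tid is split decides whether neighbouring threads hit neighbouring addresses.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration only, not taken from rocMLIR.
// d is the non-K dimension (M or N), k is the K dimension.
struct Coord {
  int64_t d;
  int64_t k;
};

// Split tid so that k moves fastest: tid = d_thread * kThreads + k_thread.
Coord tidKFastest(int64_t tid, int64_t kThreads) {
  assert(kThreads > 0);
  return Coord{tid / kThreads, tid % kThreads};
}

// Split tid so that d moves fastest: tid = k_thread * dThreads + d_thread.
Coord tidDFastest(int64_t tid, int64_t dThreads) {
  assert(dThreads > 0);
  return Coord{tid % dThreads, tid / dThreads};
}

// Linear offset into a row-major D x K buffer (K is contiguous).
int64_t offsetDxK(Coord c, int64_t K) { return c.d * K + c.k; }

// Linear offset into a row-major K x D buffer (D is contiguous).
int64_t offsetKxD(Coord c, int64_t D) { return c.k * D + c.d; }
```

With, for example, K = 16 and D = 64, threads 0 and 1 under `tidKFastest` are 1 element apart in a DxK buffer but 64 elements apart in a KxD buffer. Switching to `tidDFastest` for the KxD case brings them back to 1 element apart, which is the reason to pick the fastest-moving thread dimension to match the contiguous dimension of the data.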

Answer selected by manupak