[AMD] Refactor kWidth Assignment #8792
base: main
Conversation
// CHECK-LABEL: mfma_chain_dot_kWidth_bf16
// CHECK-GFX950: tt.dot {{.*}} : {{.*}} -> tensor<128x128xf32, #mma>
// CHECK-GFX950: tt.dot {{.*}} : tensor<128x128xbf16, #ttg.dot_op<{opIdx = 0, parent = #mma, kWidth = 4}>> * tensor<128x128xbf16, #ttg.dot_op<{opIdx = 1, parent = #mma, kWidth = 4}>> -> {{.*}}
// CHECK-GFX950: tt.dot {{.*}} : tensor<128x128xbf16, #ttg.dot_op<{opIdx = 0, parent = #mma, kWidth = 8}>> * tensor<128x128xbf16, #ttg.dot_op<{opIdx = 1, parent = #mma, kWidth = 8}>> -> {{.*}}
Was this change intentional? The refactoring should not change the test case.
It is related to a customer request for kWidth=8 for chained dots.
newBType = RankedTensorType::get(newBType.getShape(), newBType.getElementType(), newBEnc);

Value newA = ttg::ConvertLayoutOp::create(rewriter, loc, newAType, dotOp.getA());
Value newB = ttg::ConvertLayoutOp::create(rewriter, loc, newBType, dotOp.getB());
This works when the original ConvertLayoutOp is right before the dotOp. A more general way is to go backward and find the corresponding ConvertLayoutOp (skipping other ops in between), then either replace it with a new one or add a new one after it, as sketched below. If you add a new ConvertLayoutOp, I think removeConvertLayout can combine two adjacent ConvertLayoutOps into one. If there are other ops in between, I doubt removeConvertLayout can combine them; then you'll still have a ConvertLayoutOp with an undesired kWidth.
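A minimal sketch of that backward walk, assuming the usual MLIR/Triton headers are in scope; the helper name findConvertLayoutDef is hypothetical, not part of the PR:

```cpp
// Hypothetical helper (not from the PR): walk the def chain of a dot
// operand, looking through single-operand ops, to find the
// ttg::ConvertLayoutOp that produced it. Returns null if the chain
// branches or no convert is found.
static ttg::ConvertLayoutOp findConvertLayoutDef(Value operand) {
  Operation *def = operand.getDefiningOp();
  while (def && !isa<ttg::ConvertLayoutOp>(def)) {
    // Only look through ops with exactly one operand (e.g. elementwise
    // casts); anything that merges several values makes the walk ambiguous.
    if (def->getNumOperands() != 1)
      return nullptr;
    def = def->getOperand(0).getDefiningOp();
  }
  return dyn_cast_or_null<ttg::ConvertLayoutOp>(def);
}
```

Once found, the convert can either be replaced with one targeting the desired kWidth, or a new convert can be inserted right after it and left for removeConvertLayout to fold when the two end up adjacent.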
  return 4;
else if (!tail)
  return kBase * kPack;
return kBase;
It's better to return the original kWidth here.
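A hedged sketch of that suggestion, assuming dotOp is in scope as in the surrounding code (the variable names here are illustrative, not the PR's):

```cpp
// Sketch only: read the kWidth already present on the A operand's
// encoding instead of recomputing a fallback value.
auto aTy = cast<RankedTensorType>(dotOp.getA().getType());
auto aEnc = cast<ttg::DotOperandEncodingAttr>(aTy.getEncoding());
return aEnc.getKWidth();
```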
Refactored kWidth assignment for DotOp out of the primary rewrite rules into a separate rewrite rule.
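A rough sketch of the shape such a standalone rewrite rule could take; the pattern name, the target-kWidth choice, and the A-operand-only handling are all hypothetical simplifications, not the PR's actual code:

```cpp
// Hypothetical pattern, for illustration only: if a tt.DotOp operand's
// encoding carries a different kWidth than desired, route the operand
// through a new ConvertLayoutOp with the chosen kWidth.
struct AdjustDotOperandKWidth : public OpRewritePattern<tt::DotOp> {
  using OpRewritePattern<tt::DotOp>::OpRewritePattern;

  LogicalResult matchAndRewrite(tt::DotOp dotOp,
                                PatternRewriter &rewriter) const override {
    auto aTy = cast<RankedTensorType>(dotOp.getA().getType());
    auto aEnc = dyn_cast<ttg::DotOperandEncodingAttr>(aTy.getEncoding());
    if (!aEnc)
      return failure();
    unsigned desired = 8; // Placeholder; would be derived from the mma parent.
    if (aEnc.getKWidth() == desired)
      return failure(); // Already correct; also prevents pattern looping.
    auto newEnc = ttg::DotOperandEncodingAttr::get(
        dotOp.getContext(), aEnc.getOpIdx(), aEnc.getParent(), desired);
    auto newTy = RankedTensorType::get(aTy.getShape(), aTy.getElementType(),
                                       newEnc);
    Value newA = ttg::ConvertLayoutOp::create(rewriter, dotOp.getLoc(), newTy,
                                              dotOp.getA());
    rewriter.modifyOpInPlace(dotOp,
                             [&] { dotOp.getAMutable().assign(newA); });
    return success(); // The B operand would be handled analogously.
  }
};
```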