-
Notifications
You must be signed in to change notification settings - Fork 251
[CK TILE GEMM] Refactor block_scale_gemm examples #3181
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR refactors the block scale GEMM quantization example code to improve maintainability and modularity. The main change is reorganizing the code from a monolithic implementation into a factory pattern with separate compilation units for different quantization configurations.
Key changes:
- Refactored code to use a lookup table (LUT) with factory functions for different quantization type combinations
- Moved the
gemm_calc_quantfunction fromgemm_quant_basic.cpptorun_gemm_quant_example.incfor reuse - Changed
run_gemm_example_with_layoutsto acceptArgParserdirectly instead ofargc/argv
Reviewed Changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| run_gemm_quant_example.inc | Added gemm_calc_quant function and run_gemm_example_prec_type helper; modified run_gemm_example_with_layouts to accept ArgParser |
| gemm_utils.hpp | Added hash_multiple_strings function; removed create_args function |
| gemm_quant.cpp | New main entry point with factory pattern using LUT to dispatch to specific implementations |
| gemm_quant_basic.cpp | Deleted - functionality moved to other files |
| gemm_bquant_quantgourped_*.cpp | New factory implementation files for different BQuant data type combinations |
| gemm_aquant_quantgrouped.cpp | New factory implementation for AQuant configurations |
| CMakeLists.txt | Updated to compile new modular structure instead of single file |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
example/ck_tile/38_block_scale_gemm/gemm_bquant_quantgrouped_preshuffleb_prefill.cpp
Show resolved
Hide resolved
f952017 to
9debcc1
Compare
Review Addressed
- Split cpp file to reduce building time - Support multiple GemmConfig
- Update Readme
- Add support for rowcol and tensor GEMM operations
- Update README
9debcc1 to
d0c9a38
Compare
- Set quant group size to (1, 1, 64) for targets excluding gfx950, where warp tile size (16, 16, 128) is incompatible.
example/ck_tile/38_block_scale_gemm/gemm_aquant_quantgrouped.cpp
Outdated
Show resolved
Hide resolved
* [CK TILE GEMM] Refactor block_scale_gemm examples - Split cpp file to reduce building time - Support multiple GemmConfig * [CK TILE GEMM] Refactor block_scale_gemm examples - Update Readme * [CK TILE GEMM] Refactor block_scale_gemm examples - Add support for rowcol and tensor GEMM operations * [CK TILE GEMM] Refactor block_scale_gemm examples - Update README * [CK TILE GEMM] Refactor block_scale_gemm examples - Set quant group size to (1, 1, 64) for targets excluding gfx950, where warp tile size (16, 16, 128) is incompatible.
Proposed changes
The Gemm quant example faces two problems:
In this PR, we'll split different instances into separate .cpp files to reduce compile time, and support multiple GemmConfig options selectable via command-line arguments.
data_type, quant_mode, preshuffleb, group_sizeto select gemm instanceChecklist
Please put an
xinto the boxes that apply. You can also fill these out after creating the PR. If you're not sure, please don't hesitate to ask.clang-formaton all changed filesDiscussion
If this is a relatively large or complex change, feel free to start a discussion by explaining why you chose the solution you did and what alternatives you considered