-
Notifications
You must be signed in to change notification settings - Fork 111
AIMIGRAPHX-414 Use HSA runtime to query number of chiplets (Linux only) #4496
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
| bool found; | ||
| }; | ||
|
|
||
| hsa_status_t status = hsa_init(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't do that here: https://github.com/ROCm/rocMLIR/blob/1f672e426c688289265eb94f2ab092d48b0690e4/mlir/lib/Dialect/Rock/IR/AmdArchDb.cpp#L214
is it needed?
| // Callback function for hsa_iterate_agents | ||
| // GPUs are enumerated in the same order as HIP device IDs | ||
| auto agent_callback = [](hsa_agent_t agent, void* data) -> hsa_status_t { | ||
| auto* info = static_cast<agent_info*>(data); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find it confusing that we have info variable in line 238 as well
| info.found = false; | ||
|
|
||
| // Callback function for hsa_iterate_agents | ||
| // GPUs are enumerated in the same order as HIP device IDs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this always the case? any link to the docs?
| }; | ||
|
|
||
| // Iterate through all HSA agents to find matching GPU | ||
| status = hsa_iterate_agents(agent_callback, &info); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
check status
| }; | ||
|
|
||
| hsa_status_t status = hsa_init(); | ||
| if(status != HSA_STATUS_SUCCESS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would refactor this into a macro called RET_IF_HSA_ERR and reuse it
| std::size_t target_device_id; | ||
| std::size_t gpu_count; | ||
| uint32_t num_chiplets; | ||
| bool found; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like found is set but not used. Do we want to check if !found at the end of the agent enumeration and throw an error in that case?
| // HSA is only available on non-Windows platforms | ||
| #ifndef _WIN32 | ||
| #include "hsa/hsa.h" | ||
| #include "hsa/hsa_ext_amd.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Includes should use angle brackets <..>.
| status = hsa_iterate_agents(agent_callback, &info); | ||
|
|
||
| hsa_shut_down(); | ||
| return info.num_chiplets; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we move all the HSA to a seperate function in a .cpp file? As this gets included by everyone.
| // Iterate through all HSA agents to find matching GPU | ||
| status = hsa_iterate_agents(agent_callback, &info); | ||
|
|
||
| hsa_shut_down(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to call init and shut_down everytime? That might be expensive. We should probably collect all chiplets counts for all devices and store it in vector so we can query only once.
Also if this is necessary(I am not sure this is the case as hip still needs to use hsa) then this should be called in a destructor so its always called. Could make a class that calls hsa_init in the constructor and hsa_shut_down in the desctuctor.
|
cmake needs to be updated to call |
Motivation
Architectures like MI300 series can have varying number of chiplets based on the compute partition mode.
Technical Details
Use HSA runtime to get number of chiplets. Similar to what rocMLIR does in native mode. Currently rocMLIR will not consume this, but they are working on adding that support and it's not blocking for this PR.
Changelog Category