I'm getting a compilation error when building Apex with CUDA 12.6 on Windows. The error is raised in multi_tensor_axpby_kernel.cu: isfinite< ::c10::Half> is reported as undefined in device code.
System Information:
OS: Windows 11
Python Version: (e.g., 3.12)
CUDA Version: 12.6
PyTorch Version: (e.g., torch 2.x)
Compiler: MSVC / nvcc
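
A possible workaround, not verified against this setup, is to convert the half value to float before the finiteness check, so that the float overload of isfinite is selected instead of an instantiation for ::c10::Half. The sketch below illustrates the idea in isolation. `HalfLike`, `is_finite_via_float`, `count_non_finite`, and the `HOST_DEVICE` macro are hypothetical names introduced here and are not part of Apex or PyTorch. It also assumes that std::isfinite(float) is usable in device code with this nvcc/MSVC combination, which I have not confirmed.

```cpp
#include <cmath>
#include <cstddef>

// Let the same source build as plain C++ or as CUDA (assumption: nvcc defines __CUDACC__).
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical stand-in for a 16-bit float wrapper such as c10::Half.
// It only models "explicitly convertible to float".
struct HalfLike {
  float stored;
  HOST_DEVICE explicit operator float() const { return stored; }
};

// Convert to float first, then test, so no isfinite<T> instantiation
// is requested for the half type itself.
template <typename T>
HOST_DEVICE inline bool is_finite_via_float(const T& x) {
  return std::isfinite(static_cast<float>(x));
}

// Host-side helper: count entries that are infinite or NaN.
template <typename T>
inline std::size_t count_non_finite(const T* data, std::size_t n) {
  std::size_t bad = 0;
  for (std::size_t i = 0; i < n; ++i) {
    if (!is_finite_via_float(data[i])) {
      ++bad;
    }
  }
  return bad;
}
```

If this approach applies, the equivalent change in the kernel would be to wrap the argument of the failing isfinite call in static_cast<float>(...). I have not checked whether other kernels in the build have the same pattern.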