CPU kernel patch for oneAPI compiler performance workaround and code refactoring #1258
Match the C++ compiler against "icpx" for better flexibility
Also removed some "inline" specifiers since they are not necessary
…iler Add "simd simdlen" for most OpenMP SIMD pragmas
Add macro function to replace inline function (TRANSLATE_PIXEL_2D/TRANSLATE_PIXEL_3D)
Add "USE_SINCOS_TABLE" macro to CPU acceleration code for alternative code path
Add "#pragma forceinline" for project2Dmodel/project3Dmodel functions
Some minor code refactoring
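For readers unfamiliar with the two techniques named in this commit message, here is a minimal sketch of a function-like macro that replaces an inline function, plus a compile-time switch between direct `sin`/`cos` calls and a lookup table. Only the `USE_SINCOS_TABLE` name comes from the patch. `SINCOS`, `SHIFT_PIXEL_2D`, `SinCosTable`, `TABLE_SIZE`, and the phase-shift formula are hypothetical and are not the actual definitions of TRANSLATE_PIXEL_2D or the table used in the code base.

```cpp
#include <cmath>

#ifdef USE_SINCOS_TABLE
// Hypothetical table resolution; the real table layout is not shown in this thread.
constexpr long TABLE_SIZE = 4096;
constexpr float TWO_PI = 6.2831853f;

// Precomputed sine and cosine values over one full period.
struct SinCosTable {
    float sin_tab[TABLE_SIZE];
    float cos_tab[TABLE_SIZE];
    SinCosTable() {
        for (long i = 0; i < TABLE_SIZE; ++i) {
            sin_tab[i] = std::sin(TWO_PI * i / TABLE_SIZE);
            cos_tab[i] = std::cos(TWO_PI * i / TABLE_SIZE);
        }
    }
};
static const SinCosTable g_sincos;

// Table path: round the phase to the nearest table entry, wrapping negatives.
#define SINCOS(phase, sin_out, cos_out) do { \
    long i_ = std::lround((phase) * (TABLE_SIZE / TWO_PI)) % TABLE_SIZE; \
    if (i_ < 0) i_ += TABLE_SIZE; \
    (sin_out) = g_sincos.sin_tab[i_]; \
    (cos_out) = g_sincos.cos_tab[i_]; \
} while (0)
#else
// Default path: call the math library directly.
#define SINCOS(phase, sin_out, cos_out) do { \
    (sin_out) = std::sin(phase); \
    (cos_out) = std::cos(phase); \
} while (0)
#endif

// Hypothetical macro: multiply the complex value (re, im) by exp(i * phase),
// where phase = x*tx + y*ty. Being a macro, it is always expanded in place,
// independent of the compiler's inlining heuristics.
#define SHIFT_PIXEL_2D(x, y, tx, ty, re, im, out_re, out_im) do { \
    float s_, c_; \
    SINCOS((x) * (tx) + (y) * (ty), s_, c_); \
    (out_re) = c_ * (re) - s_ * (im); \
    (out_im) = c_ * (im) + s_ * (re); \
} while (0)
```

The `do { ... } while (0)` wrapper lets the macro be used like a single statement. The table path trades some accuracy for speed, which is presumably why it would be kept behind an opt-in macro.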
@do-jason Thank you very much! Regarding FFTW, the documentation says:
So it seems safe, but am I correct?
With this patch, does the performance of the oneAPI compilers approach that of the Intel Classic Compiler?
It just enables these features in the FFTW library. At runtime, FFTW selects the best instruction set that the machine supports. So enabling it is safe.
In my experiment with the Plasmodium Ribosome workload, the difference shrank to a single-digit percentage. The compiler itself may improve, but I cannot guess when. This patch is a workaround for the time being.
Thanks for the clarification.
Trusting this, I will merge this into the Thanks again for your contribution.
Add macro function to replace inline function (TRANSLATE_PIXEL_2D, TRANSLATE_PIXEL_3D)
Add "USE_SINCOS_TABLE" macro to CPU acceleration code for alternative code path
Add "#pragma forceinline" for project2Dmodel/project3Dmodel functions
Add alignas to arrays in CPU kernel functions
Add "#if _OPENMP" to protect "#pragma omp simd" pragmas from old compilers
Add "#pragma omp simd simdlen" for some cases
Some minor code refactoring
Remove "__attribute__((always_inline))" since "inline" is enough
Remove DEBUG_CUDA
Remove unused no_tex2D and no_tex3D in acc/cpu code path
Remove commented old code block in diff2.h
Compilation checked with GNU compiler versions 7 through 13, the Intel Classic Compiler, and various Intel oneAPI compiler versions.
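To illustrate how the `alignas`, `#if _OPENMP`, and `simdlen` items in the list above might fit together, here is a minimal sketch. It is not code from the patch: the function `diff2_block`, the block size, the 64-byte alignment, and the `simdlen(8)` value are all assumptions chosen for the example.

```cpp
#include <cstddef>

// Hypothetical block size, not taken from the patch.
constexpr std::size_t BLOCK = 16;

// Sum of squared differences between two blocks of BLOCK floats.
float diff2_block(const float* ref, const float* img)
{
    // Aligned scratch array; 64 bytes is an assumed alignment that matches
    // a typical cache line and the width of an AVX-512 register.
    alignas(64) float d[BLOCK];
    float sum = 0.0f;

    // The guard keeps compilers without OpenMP support from seeing the pragma.
#if _OPENMP
    #pragma omp simd simdlen(8)
#endif
    for (std::size_t i = 0; i < BLOCK; ++i)
        d[i] = ref[i] - img[i];

#if _OPENMP
    #pragma omp simd reduction(+:sum)
#endif
    for (std::size_t i = 0; i < BLOCK; ++i)
        sum += d[i] * d[i];

    return sum;
}
```

Without OpenMP enabled, both loops compile as plain scalar loops that the compiler may still auto-vectorize. The `simdlen` clause is only a hint for the preferred number of lanes per SIMD chunk.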