@do-jason
Collaborator

@do-jason do-jason commented Apr 3, 2025

Add macros to replace inline functions (TRANSLATE_PIXEL_2D, TRANSLATE_PIXEL_3D)
Add "USE_SINCOS_TABLE" macro to CPU acceleration code for an alternative code path
Add "#pragma forceinline" for project2Dmodel/project3Dmodel functions
Add alignas to arrays in CPU kernel functions
Add "#if _OPENMP" to protect "#pragma omp simd" pragmas from old compilers
Add "#pragma omp simd simdlen" for some cases
Some minor code refactoring
Remove "__attribute__((always_inline))" since "inline" is enough
Remove DEBUG_CUDA
Remove unused no_tex2D and no_tex3D in acc/cpu code path
Remove commented old code block in diff2.h
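
As a rough illustration of three items in the list above (the `_OPENMP` guard, `alignas` on kernel arrays, and `simdlen`), here is a minimal C++ sketch. The function name `blockDiff2`, the constants `kBlock` and `kAlignment`, the 64-byte alignment, and the `simdlen(8)` value are assumptions made for this example; none of them is taken from the patch itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kBlock = 16;      // illustrative block length (assumption)
constexpr std::size_t kAlignment = 64;  // assumed alignment in bytes

// Sum of squared differences over one block of kBlock floats.
// Hypothetical helper, not a function from the patch.
inline float blockDiff2(const float* a, const float* b)
{
    // Over-aligned scratch array so the compiler may use aligned vector accesses.
    alignas(kAlignment) float diff[kBlock];
    assert(reinterpret_cast<std::uintptr_t>(diff) % kAlignment == 0);

    // The pragma is only visible when the compiler defines _OPENMP,
    // so builds without OpenMP support never see it.
#ifdef _OPENMP
#pragma omp simd simdlen(8)
#endif
    for (std::size_t i = 0; i < kBlock; ++i)
        diff[i] = a[i] - b[i];

    float sum = 0.0f;
#ifdef _OPENMP
#pragma omp simd reduction(+ : sum)
#endif
    for (std::size_t i = 0; i < kBlock; ++i)
        sum += diff[i] * diff[i];

    return sum;
}
```

With GCC, compiling with `-fopenmp` defines `_OPENMP` and activates the pragmas; without it, the same loops compile as ordinary code and the compiler is free to auto-vectorize them or not.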

Compilation checked with GNU compiler 7~13, Intel Classic Compiler, and various Intel oneAPI compiler versions.

@biochem-fan
Member

@do-jason Thank you very much!

Regarding "Add --enable-avx2 --enable-avx512 --enable-fma for FFTW build": what happens on consumer CPUs without AVX512? Will the binary run fine without using AVX512, or will it cause illegal instruction errors?

FFTW documentation says:

Enable various SIMD instruction sets. You need compiler that supports the given SIMD extensions, but FFTW will try to detect at runtime whether the CPU supports these extensions.
That is, you can compile with --enable-avx and the code will still run on a CPU without AVX support.
https://www.fftw.org/fftw3_doc/Installation-on-Unix.html

So it seems safe, but am I correct?

@biochem-fan
Member

With this patch, is the performance of the oneAPI compilers approaching that of the Intel Classic Compiler?

@do-jason
Collaborator Author

do-jason commented Apr 3, 2025

These flags only enable the corresponding code paths in the FFTW library; each one is selected at runtime only if the CPU supports it. So enabling them is safe.
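
To picture the general idea of "compiled in, selected at runtime", here is a minimal C++ sketch. It is not FFTW's actual code: the names `SimdLevel`, `detectSimdLevel`, and `simdLevelName` are hypothetical, and the detection relies on the GCC/Clang builtins `__builtin_cpu_init` and `__builtin_cpu_supports`, which are only available on x86 targets with those compilers. On any other platform the sketch falls back to the generic path.

```cpp
#include <string>

// Hypothetical set of code paths a library might have compiled in.
enum class SimdLevel { Generic, Avx2, Avx512 };

// Pick the widest instruction set the running CPU reports.
// Assumption: GCC or Clang on x86; elsewhere always returns Generic.
inline SimdLevel detectSimdLevel()
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();  // initializes the CPU feature data used below
    if (__builtin_cpu_supports("avx512f")) return SimdLevel::Avx512;
    if (__builtin_cpu_supports("avx2"))    return SimdLevel::Avx2;
#endif
    return SimdLevel::Generic;
}

// Human-readable name of the selected level, e.g. for logging.
inline std::string simdLevelName(SimdLevel level)
{
    switch (level) {
        case SimdLevel::Avx512: return "avx512";
        case SimdLevel::Avx2:   return "avx2";
        default:                return "generic";
    }
}
```

A dispatcher built this way never executes an AVX512 kernel on a CPU that does not report the feature, which is why compiling the extra code paths in does not by itself cause illegal instruction errors.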

@do-jason
Collaborator Author

do-jason commented Apr 3, 2025

The flags just enable these features, and FFTW chooses the best supported instruction set at runtime. So enabling them is safe.

With this patch, is the performance of the oneAPI compilers approaching that of the Intel Classic Compiler?

In my experiments, the difference shrank to a single-digit percentage on the Plasmodium ribosome workload. The compiler itself may improve, but I cannot guess when that will happen. This patch is a workaround for the time being.

@biochem-fan biochem-fan changed the base branch from master to ver5.0 April 3, 2025 07:07
@biochem-fan biochem-fan merged commit 04cb97d into ver5.0 Apr 3, 2025
0 of 4 checks passed
@biochem-fan
Member

Thanks for the clarification.

Compilation checked with GNU compiler 7~13, Intel Classic Compiler, and various Intel oneAPI compiler versions.

Trusting this, I will merge this into the ver5.0 branch. If nobody reports errors, I will merge it to master and include it in the 5.0.1 minor update.

Thanks again for your contribution.
