do-jason (Collaborator) commented Apr 8, 2025

  • Merge mdlReal and mdlImag into a single array of doubled size for cache-efficient access
  • Make the required changes and some cleanups for the mdlReal/mdlImag to mdlComplex merge in the affected files in the src/acc/ directory
  • Adapt no_tex2D and no_tex3D to the merged mdlReal/mdlImag layout (src/acc/sycl/sycl_kernels/sycl_utils.h)
  • Remove the unused complex2D and complex3D
  • Move group_barrier to remove an unnecessary, expensive barrier instruction
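
For readers unfamiliar with the layout change in the first bullet, here is a minimal sketch of interleaving separate real and imaginary arrays into one array of doubled size. It is an illustration only, not the code from this PR: the helper names `mergeComplex` and `loadComplex`, and the use of `std::vector`, are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the PR): interleave separate real and
// imaginary arrays into one array of doubled size laid out as
// [re0, im0, re1, im1, ...].
template <typename T>
std::vector<T> mergeComplex(const std::vector<T>& real, const std::vector<T>& imag)
{
    assert(real.size() == imag.size());
    const std::size_t n = real.size();
    std::vector<T> merged(2 * n);
    for (std::size_t i = 0; i < n; ++i) {
        merged[2 * i]     = real[i];  // even slots hold the real parts
        merged[2 * i + 1] = imag[i];  // odd slots hold the imaginary parts
    }
    return merged;
}

// Hypothetical accessor: read one complex value from the merged array.
// Both components sit next to each other in memory, so a lookup typically
// touches one cache line instead of one line in each of two arrays.
template <typename T>
void loadComplex(const T* mdlComplex, std::size_t idx, T& re, T& im)
{
    re = mdlComplex[2 * idx];
    im = mdlComplex[2 * idx + 1];
}
```

With separate arrays, fetching the value at index `idx` reads `mdlReal[idx]` and `mdlImag[idx]` from two distant memory regions; with the merged layout both reads come from adjacent elements, which is the cache benefit the first bullet refers to.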

Compilation and execution were checked for the legacy, CPU, and CUDA acceleration code paths.

do-jason added 2 commits April 7, 2025 18:07
 Merge mdlReal and mdlImag to a single array with doubled size for cache efficient access
 Make required change and some cleanups for this mdlReal/mdlImag to mdlComplex in the required files in src/acc/ directory
 Change no_tex2D and no_tex3D to this mdlReal and mdlImag merging optimization (src/acc/sycl/sycl_kernels/sycl_utils.h)
 Remove unused complex2D and complex3D
biochem-fan (Member) commented Apr 8, 2025

Thank you very much for your continued contribution.

The change looks OK to me, but I do not have time to test this on many datasets.
I will tentatively merge this into the ver5.0 branch and see if anyone reports bugs.
If nothing comes in, I will merge this into master and include it in 5.0.1.

biochem-fan merged commit 3d6c200 into ver5.0 on Apr 8, 2025
0 of 4 checks passed