Hi, thanks for the great work on fCWT!
I noticed that on my machine (Ryzen 9 7945HX, 16 cores / 32 threads), nthreads=8 gives the best performance. Higher values (e.g. nthreads=32) make it slower.
Is this expected? Could performance with higher thread counts be improved?
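
In case it helps to reproduce this, here is a minimal sketch of how the thread counts could be compared. The helper names (`sweep_thread_counts`, `fastest`, `placeholder_transform`) are hypothetical and not part of fCWT. The placeholder body is only a stand-in and would need to be replaced with the actual transform call, passing `nthreads` through.

```python
import time
from typing import Callable, Dict, Iterable


def sweep_thread_counts(
    run: Callable[[int], object],
    counts: Iterable[int],
    repeats: int = 3,
) -> Dict[int, float]:
    """Return the best wall-clock time (seconds) observed for each thread count."""
    results: Dict[int, float] = {}
    for n in counts:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            run(n)  # the workload receives the thread count to use
            best = min(best, time.perf_counter() - start)
        results[n] = best
    return results


def fastest(results: Dict[int, float]) -> int:
    """Return the thread count with the lowest measured time."""
    return min(results, key=results.get)


def placeholder_transform(nthreads: int) -> None:
    # Hypothetical stand-in workload: it ignores nthreads entirely.
    # Replace this body with the real fCWT call using the given nthreads.
    sum(i * i for i in range(20_000))


if __name__ == "__main__":
    timings = sweep_thread_counts(placeholder_transform, [1, 2, 4, 8, 16, 32])
    for n, t in timings.items():
        print(f"nthreads={n:2d}  best of 3: {t * 1e3:.2f} ms")
    print("fastest:", fastest(timings))
```

Taking the best of several repeats reduces noise from one-off setup costs, so the comparison reflects steady-state time per call.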
Thanks in advance!