My machine has 512 hardware threads (I checked with the nproc command):
512 = 128 cores x 2 sockets x 2 (Hyper-Threading on)
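
As a cross-check of this arithmetic, here is a minimal Python sketch, assuming a Linux host. The function names are hypothetical helpers and are not part of the benchmark tool or DLIO. The topology values are the ones from my machine. Like nproc, the second helper counts only the CPUs the current process is allowed to run on, so it can be lower than the topology figure if an affinity mask or container limit is in effect.

```python
import os


def expected_logical_cpus(sockets: int, cores_per_socket: int, threads_per_core: int) -> int:
    """Logical CPU count implied by the hardware topology."""
    return sockets * cores_per_socket * threads_per_core


def available_cpus() -> int:
    """CPUs usable by this process (affinity-aware where the OS supports it)."""
    if hasattr(os, "sched_getaffinity"):
        # Linux: respects taskset / cgroup cpuset restrictions, like nproc does.
        return len(os.sched_getaffinity(0))
    # Fallback for platforms without sched_getaffinity.
    return os.cpu_count() or 1


if __name__ == "__main__":
    # Topology from this report: 2 sockets, 128 cores each, 2 threads per core.
    expected = expected_logical_cpus(sockets=2, cores_per_socket=128, threads_per_core=2)
    print(f"expected from topology:    {expected}")
    print(f"available to this process: {available_cpus()}")
```

If both numbers agree, the 128 limit is not coming from the CPU count visible to the process.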
It seems that this tool does not use all of the threads that nproc reports.
As you can see in the capture below, the reader.read_threads parameter can only be set up to 128.
I checked again, and I think it only sweeps up to 128 threads in any environment. I don't know whether this is a bottleneck caused by DLIO.
However, according to the run parameter guide you gave me in #202, I think reader.read_threads should be set to the total number of cores reported by nproc.
Could you check this problem?
