Interesting: when I move the entire ImageNet folder to the SSD, not all of the processes go into the D state (1 or 4 still do), but the iteration speed is back to normal at 50 seconds. It seems that something is wrong with the HDD I/O.
This is the same fix as the one in the "Strange behavior in Pytorch" thread.
But I don't think this is a good solution, because ImageNet is 140 GB while my SSD is only 2 TB. Also, this server is new; I bought it 4 months ago. I don't think there is anything wrong with the HDD.
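
For anyone who wants to count how many processes are sitting in the D (uninterruptible sleep) state while training runs, here is a minimal sketch. It assumes Linux with `/proc` mounted; the helper names `parse_state` and `pids_in_state` are made up for illustration and are not part of PyTorch or any other library.

```python
import os


def parse_state(stat_line):
    """Return the one-letter process state from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')' characters, so split on the LAST ')'.
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def pids_in_state(state="D", proc_root="/proc"):
    """List PIDs currently in the given state (hypothetical helper, Linux only)."""
    if not os.path.isdir(proc_root):
        return []  # no procfs here, e.g. not a Linux machine
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "meminfo"
        try:
            path = os.path.join(proc_root, entry, "stat")
            with open(path, errors="replace") as f:
                if parse_state(f.read()) == state:
                    pids.append(int(entry))
        except (OSError, IndexError):
            # The process exited between listdir() and open(),
            # or the stat line could not be parsed.
            continue
    return pids


if __name__ == "__main__":
    print("processes in D state:", pids_in_state("D"))
```

Running this a few times with the dataset on the HDD and then on the SSD should show whether the data loader workers are the ones blocked on disk reads. It only reports which processes are waiting, not why, so it cannot tell a slow disk apart from a faulty one.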