I am sorry for my incorrect description. The “1000w” for the dataset means ten million images, not a width in pixels.
I will try the latest PyTorch Docker image.
I just tried a smaller batch size, and there was no OOM error, but the cuDNN error still occurs.