Channels last question

Thanks a lot. Setting torch.backends.cudnn.allow_tf32 = False gives a diff of 1e-6, which is indeed much smaller. So is it possible to use channels-last to accelerate training/finetuning? If the difference is not negligible, will it produce unreasonable outputs after training or cause divergence?
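
For reference, a minimal sketch of the kind of comparison being described, assuming a single Conv2d layer as the test case (the layer and input shapes are illustrative, not from the original posts):

```python
import torch

torch.backends.cudnn.allow_tf32 = False  # force full-FP32 cuDNN convolutions

conv = torch.nn.Conv2d(3, 64, kernel_size=3, padding=1).cuda()
x = torch.randn(8, 3, 224, 224, device="cuda")

out_ref = conv(x)                                       # contiguous (NCHW) input
out_cl = conv(x.to(memory_format=torch.channels_last))  # channels-last (NHWC) input

# With TF32 disabled, the mismatch should drop to float32 rounding noise (~1e-6)
print((out_ref - out_cl).abs().max().item())
```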

Yes, it’s possible to use channels-last to accelerate workloads, and we haven’t seen divergence caused by cuDNN’s TF32 usage.
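
For concreteness, a minimal sketch of this setup, using a torchvision ResNet-50 as a stand-in model (the model choice and shapes are illustrative):

```python
import torch
import torchvision

# Allow TF32 for matmuls and cuDNN convolutions (effective on Ampere+ GPUs)
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True

# Convert the model's parameters and buffers to channels-last memory format
model = torchvision.models.resnet50().cuda().to(memory_format=torch.channels_last)

# Inputs must be converted too, so cuDNN can select channels-last kernels
x = torch.randn(8, 3, 224, 224, device="cuda").to(memory_format=torch.channels_last)

out = model(x)  # forward pass runs with channels-last activations
```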


Thanks. For this purpose, is it enough to just convert the model and inputs to channels-last as above and enable TF32, with no other changes necessary?