Channels last question

Thanks a lot. Setting torch.backends.cudnn.allow_tf32 = False gives a diff of 1e-6, which is indeed much smaller. So is it possible to use channels-last to accelerate training/finetuning? If the difference is not negligible, will it produce unreasonable outputs after training or cause divergence?
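
For reference, a minimal sketch of the kind of comparison being described, assuming a single Conv2d layer as the test case (the layer and input shapes are illustrative, not from the original posts):

```python
import torch

torch.backends.cudnn.allow_tf32 = False  # force full-FP32 cuDNN convolutions

conv = torch.nn.Conv2d(3, 64, kernel_size=3, padding=1).cuda()
x = torch.randn(8, 3, 224, 224, device="cuda")

out_ref = conv(x)                                       # contiguous (NCHW) input
out_cl = conv(x.to(memory_format=torch.channels_last))  # channels-last (NHWC) input

# With TF32 disabled, the mismatch should drop to float32 rounding noise (~1e-6)
print((out_ref - out_cl).abs().max().item())
```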

Yes, it’s possible to use channels-last to accelerate workloads, and we haven’t seen divergence caused by cuDNN’s TF32 usage.
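
For concreteness, a minimal sketch of this setup, using a torchvision ResNet-50 as a stand-in model (the model choice and shapes are illustrative):

```python
import torch
import torchvision

# Allow TF32 for matmuls and cuDNN convolutions (effective on Ampere+ GPUs)
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True

# Convert the model's parameters and buffers to channels-last memory format
model = torchvision.models.resnet50().cuda().to(memory_format=torch.channels_last)

# Inputs must be converted too, so cuDNN can select channels-last kernels
x = torch.randn(8, 3, 224, 224, device="cuda").to(memory_format=torch.channels_last)

out = model(x)  # forward pass runs with channels-last activations
```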


Thanks. For this purpose, is it enough to just convert the model and inputs to channels-last as above and enable TF32, with no other changes necessary?