I have a model through which I run inference; it uses Conv3d. When I run inference with out_channels=14 it takes 60 ms, but with out_channels=24 it takes 100 ms. Is there any way to optimize it?
In case you are using a GPU, you could try to enable cuDNN benchmarking via
torch.backends.cudnn.benchmark = True, which profiles different kernels for your workload and picks the fastest one. Converting the model and inputs via
to(memory_format=torch.channels_last_3d) could also help, especially if you are using mixed precision.
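Putting both suggestions together, a minimal sketch could look like the following (the layer shapes and input sizes are made up for illustration, since the original model is not shown):

```python
import torch
import torch.nn as nn

# Enable the cuDNN autotuner: for each new input shape it benchmarks
# the available conv kernels and caches the fastest one, so it helps
# most when the input shapes are static across iterations.
torch.backends.cudnn.benchmark = True

device = "cuda" if torch.cuda.is_available() else "cpu"

# Hypothetical Conv3d layer standing in for the actual model.
model = nn.Conv3d(in_channels=3, out_channels=24,
                  kernel_size=3, padding=1).to(device)

# Convert weights and inputs to the channels-last-3d (NDHWC) layout.
model = model.to(memory_format=torch.channels_last_3d)
x = torch.randn(1, 3, 16, 64, 64, device=device)
x = x.to(memory_format=torch.channels_last_3d)

with torch.inference_mode():
    if device == "cuda":
        # channels_last_3d pays off mainly together with mixed precision.
        with torch.autocast("cuda", dtype=torch.float16):
            out = model(x)
    else:
        out = model(x)

print(out.shape)  # (1, 24, 16, 64, 64)
```

Note that the first iterations will be slower while the benchmarking runs, so warm up before timing.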
Thank you so much. It worked and the performance improved. I just wanted to confirm: will it have any effect on accuracy?
No, the accuracy should not be impacted by the algorithm choice. You should not expect bitwise-identical results due to the limited floating-point precision, but your accuracy should also not be sensitive to these expected small errors.
Thank you so much, it helped me a lot!