I know that setting torch.backends.cudnn.benchmark = True improves GPU inference performance when the input size is fixed. But if I want to speed up inference on CPU only, does this setting help (again for a fixed input size)?
No. cuDNN is a library built on top of CUDA and runs only on NVIDIA GPUs, so that flag has no effect on CPU inference.
For CPU performance you could use e.g. MKL / oneDNN (formerly MKL-DNN), which the official PyTorch CPU builds typically include.
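As a quick sanity check, you can ask your PyTorch build which CPU backends it was compiled with, and run inference under torch.no_grad() to skip autograd overhead. The small Sequential model below is just a hypothetical example for illustration:

```python
import torch

# Check which CPU acceleration libraries this PyTorch build includes
# (this is build-dependent; official CPU wheels usually ship with both).
print("MKL available:   ", torch.backends.mkl.is_available())
print("oneDNN available:", torch.backends.mkldnn.is_available())

# Hypothetical small model, just for illustration.
model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU())
model.eval()  # switch layers like dropout/batchnorm to eval behavior

x = torch.randn(1, 3, 32, 32)  # fixed input size
with torch.no_grad():  # disable gradient tracking for inference
    out = model(x)
print(out.shape)
```

If MKL/oneDNN show up as available, the CPU convolution and matmul kernels are already dispatched to them automatically; there is no per-model flag to set the way there is with cudnn.benchmark.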