PyTorch CPU overhead of creating conv2d layers

Actually, since my input size is fixed, I am using the following, based on "What does torch.backends.cudnn.benchmark do?":

torch.backends.cudnn.benchmark = True

I tried both, and benchmark = True gives very slightly faster times than

torch.backends.cudnn.deterministic = True
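For reference, here is a minimal sketch of how the two settings could be timed against each other. The layer shape, batch size, and iteration counts are placeholders I made up for illustration, not the values from my actual model, and `time_conv` is a hypothetical helper. Note that the two flags are independent switches, so the sketch sets both explicitly on each run. It falls back to CPU when CUDA is unavailable, in which case the cuDNN flags should make no difference.

```python
import time

import torch
import torch.nn as nn


def time_conv(benchmark, deterministic, iters=20, warmup=5):
    """Return the average forward time (seconds) of one Conv2d layer."""
    # Set both cuDNN flags explicitly so each run starts from a known state.
    torch.backends.cudnn.benchmark = benchmark
    torch.backends.cudnn.deterministic = deterministic

    device = "cuda" if torch.cuda.is_available() else "cpu"
    # Assumed sizes, chosen only to keep the example small.
    conv = nn.Conv2d(3, 16, kernel_size=3, padding=1).to(device)
    x = torch.randn(8, 3, 64, 64, device=device)

    with torch.no_grad():
        # Warm-up: with benchmark=True, cuDNN picks its algorithm here.
        for _ in range(warmup):
            conv(x)
        if device == "cuda":
            torch.cuda.synchronize()  # wait for queued kernels before timing

        start = time.perf_counter()
        for _ in range(iters):
            conv(x)
        if device == "cuda":
            torch.cuda.synchronize()  # make sure all timed kernels finished

    return (time.perf_counter() - start) / iters


t_bench = time_conv(benchmark=True, deterministic=False)
t_det = time_conv(benchmark=False, deterministic=True)
print(f"benchmark=True:     {t_bench * 1e3:.3f} ms per forward pass")
print(f"deterministic=True: {t_det * 1e3:.3f} ms per forward pass")
```

The warm-up iterations matter because the first calls with benchmark = True include the algorithm search, which would otherwise inflate the measured time.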