Thanks for your reply.
In this case, I figured out that the sm75 kernel was selected whenever the input tensor was not contiguous. When the tensor was contiguous, the sm80 kernel was always chosen.
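
For anyone else hitting this, here is a rough, framework-independent sketch of what "contiguous" means here: the strides match a dense row-major layout for the given shape. The helper name `is_contiguous` and the shape/stride tuples are only illustrative and are not taken from any library's internals; strides are assumed to be in elements, not bytes.

```python
def is_contiguous(shape, strides):
    """Return True if strides (in elements) describe a dense row-major layout."""
    if 0 in shape:
        return True  # an empty tensor has no elements, so it is trivially contiguous
    expected = 1
    # Walk from the innermost dimension outward, tracking the stride a dense layout would have.
    for size, stride in zip(reversed(shape), reversed(strides)):
        # Dimensions of size 1 are never stepped over, so their stride does not matter.
        if size != 1 and stride != expected:
            return False
        expected *= size
    return True


# A 2x3 row-major tensor has strides (3, 1).
print(is_contiguous((2, 3), (3, 1)))
# Its transpose is a 3x2 view over the same memory with strides (1, 3).
print(is_contiguous((3, 2), (1, 3)))
```

If this is PyTorch, `tensor.is_contiguous()` reports the same property and `tensor.contiguous()` returns a dense copy. Based on what I observed above, passing a contiguous input should lead to the sm80 kernel being picked, but I have only confirmed that for my case.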