Transpose_dim kernel

Hi,

I’ve dug into the codebase, but I couldn’t find how PyTorch handles the transpose_dim operation on GPUs. How is it handled, and where exactly is the implementation?

Thanks.