Different implementations of the square function

I am working on writing a custom convolution function that involves squaring some inputs (x). I am wondering what the differences are between the following square implementations:

  1. torch.square(x)
  2. torch.pow(x, 2)
  3. x**2
  4. x*x

I would like to know how PyTorch handles each of these operations and whether there is any difference in terms of gradient computation and efficiency.

Thank you

You can use any of these; they should all dispatch to the same kernel, so there should be no meaningful difference in results or gradients.
A similar question was also asked here.
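As a quick sanity check, you can compare the forward values and gradients of the four variants yourself. The sketch below (assuming a small float64 test tensor; the names in the dict are just labels for this example) verifies that all four produce identical outputs and the expected gradient d/dx x² = 2x:

```python
import torch

# Small test input; requires_grad so we can compare gradients.
x = torch.randn(5, dtype=torch.float64, requires_grad=True)

# The four ways to square a tensor from the question.
implementations = {
    "torch.square": lambda t: torch.square(t),
    "torch.pow":    lambda t: torch.pow(t, 2),
    "power op":     lambda t: t ** 2,
    "multiply":     lambda t: t * t,
}

outputs, grads = {}, {}
for name, fn in implementations.items():
    x.grad = None            # reset gradient between runs
    y = fn(x)
    y.sum().backward()       # d(sum(x^2))/dx = 2x
    outputs[name] = y.detach().clone()
    grads[name] = x.grad.clone()

# All forward results and gradients should match.
for name in implementations:
    assert torch.allclose(outputs[name], x.detach() ** 2)
    assert torch.allclose(grads[name], 2 * x.detach())
print("all variants agree: grad == 2*x")
```

For a quick efficiency comparison you could wrap each call in `torch.utils.benchmark.Timer`, but on typical tensors the differences are negligible compared to the rest of a convolution.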