Permute vs Transpose

I understand that there are a couple of posts that explain the difference between permute and transpose in words. But is anyone aware of a visual explanation that shows the difference between the two, perhaps with an example tensor? (I would also be super grateful if someone could make a visual explanation :hugs: - it would really help me internalise the concept.) Thanks in advance!!

I would claim that the difference is small: transpose swaps exactly two dimensions, while permute expects the new order of all dimensions. If you are only swapping two dimensions, both give the same result, as seen here:

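import torch

# 4-D tensor with a different size in every dimension
x = torch.arange(2*3*4*5).view(2, 3, 4, 5)
print(x)

# swap dims 1 and 2 -> shape (2, 4, 3, 5)
y1 = x.transpose(1, 2)
print(y1)
# the same swap, written as a full ordering of all four dims
y2 = x.permute(0, 2, 1, 3)
print(y2)

print((y1 == y2).all())
# tensor(True)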
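
Not a picture, but tracking shapes and strides on a small tensor may help make the difference more concrete. The snippet below is only a sketch with an arbitrary 2x3x4 example tensor (the variable names are mine): it reorders all three dimensions in one permute call, rebuilds the same result from two transpose calls, and checks where a single element ends up.

```python
import torch

# Small tensor with a different size per dimension, so the axes are easy to track.
x = torch.arange(2 * 3 * 4).view(2, 3, 4)

# transpose swaps exactly two dimensions: dims 0 and 2 trade places.
t = x.transpose(0, 2)
print(t.shape)  # torch.Size([4, 3, 2])

# permute takes the full new order: new dim i is old dim dims[i].
p = x.permute(2, 0, 1)
print(p.shape)  # torch.Size([4, 2, 3])

# The same reordering can be built from two successive swaps.
chained = x.transpose(0, 2).transpose(1, 2)
print(torch.equal(p, chained))  # True

# Element-wise view of the reordering: p[k, i, j] is x[i, j, k].
print(p[3, 1, 2].item(), x[1, 2, 3].item())  # 23 23

# Neither call copies data; only the strides are rearranged.
print(x.stride())  # (12, 4, 1)
print(p.stride())  # (1, 12, 4)
print(p.data_ptr() == x.data_ptr())  # True
```

So a permute over several dimensions can be read as a sequence of transposes, and both return views onto the same underlying data rather than copies.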