I’m confused by the claim that you get the same results here — in the snippet below, the results are clearly not the same:
$ cat temp.py
import torch
a = torch.randn(1, 8, 64, 1024)
b = a.reshape(1, 512, 1024)
c = a.permute(0, 2, 1, 3).reshape(1, 512, 1024)
a = a.view(1, 512, 1024)
print(torch.allclose(a,b))
print(torch.allclose(b,c))
$ python3 temp.py
True
False
$
In summary, permute is very different from view and reshape in that it actually changes the ordering of elements in memory-traversal order (for example, consider which element you reach as you increment only the last index by 1).
The post “For beginners: Do not use view() or reshape() to swap dimensions of tensors!” on the PyTorch Forums is a great intro to the pitfalls of using view or reshape when the intent is to change the ordering of elements.
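To make the difference concrete, here is a small sketch of my own (not from the linked post) using a tiny tensor, where you can read off the element ordering directly:

```python
import torch

# Elements 0..7 laid out in row-major order.
a = torch.arange(8).reshape(1, 2, 2, 2)

# view/reshape on a contiguous tensor keeps the flattened order intact.
v = a.view(1, 4, 2)

# permute swaps dims 1 and 2, changing the traversal order; note that
# .view() would raise an error here because the permuted tensor is
# no longer contiguous, so .reshape() (which may copy) is required.
p = a.permute(0, 2, 1, 3).reshape(1, 4, 2)

print(v.flatten().tolist())  # [0, 1, 2, 3, 4, 5, 6, 7]
print(p.flatten().tolist())  # [0, 1, 4, 5, 2, 3, 6, 7]
```

Both tensors have the same shape, yet their elements appear in different orders — which is exactly why allclose returns False above.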