No, torch.flatten() does not copy any data here; for a contiguous tensor it behaves like a thin wrapper around view() and returns a view of the same storage. (The docs do note that flatten() may return a copy when the input cannot be viewed, e.g. for non-contiguous tensors.) A simple way to verify the no-copy behavior is to run the following code:

import torch

# Create a (2, 3, 4) tensor filled with zeros.
a = torch.zeros(2, 3, 4)
# Flatten the 2nd and 3rd dimensions of the original tensor
# using the `view` and `flatten` methods.
b = a.view(2, 12)
c = torch.flatten(a, start_dim=1)
# Change a distinct value in each flattened tensor object.
b[0, 2] = 1
c[0, 4] = 2
# Compare the tensors' data to each other to look for
# any mismatches.
print("Tensors A and B data match?", all(a.view(-1) == b.view(-1)))
print("Tensors A and C data match?", all(a.view(-1) == c.view(-1)))
print("Tensors B and C data match?", all(b.view(-1) == c.view(-1)))

Output:

Tensors A and B data match? True
Tensors A and C data match? True
Tensors B and C data match? True
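A more direct check (a small sketch on top of the example above) is to compare the tensors' storage pointers with `data_ptr()`. It also shows the one case where flatten() does copy: a non-contiguous input, such as a transposed tensor, cannot be viewed, so flatten() falls back to a copy.

```python
import torch

a = torch.zeros(2, 3, 4)

# Contiguous input: flatten() returns a view over the same storage.
c = torch.flatten(a, start_dim=1)
print(a.data_ptr() == c.data_ptr())  # True: same underlying storage

# Non-contiguous input (e.g. after a transpose): view() would raise,
# so flatten() has to copy into new storage.
t = a.transpose(0, 2)  # shape (4, 3, 2), non-contiguous
print(t.is_contiguous())  # False
f = torch.flatten(t)
print(t.data_ptr() == f.data_ptr())  # False: flatten() copied the data
```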

Yes, but the difference is negligible in practice. The only overhead flatten() introduces is a simple internal computation of the output shape followed by the call to view() (or similar); the difference is typically well under 1 µs per call.
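You can measure the overhead yourself with a quick micro-benchmark; a sketch using the standard-library `timeit` module is below. The absolute numbers depend on your machine and PyTorch build, but the per-call gap between the two should be tiny.

```python
import timeit
import torch

a = torch.zeros(2, 3, 4)
n = 100_000

# Time both flattening approaches over many iterations.
t_view = timeit.timeit(lambda: a.view(2, 12), number=n)
t_flat = timeit.timeit(lambda: torch.flatten(a, start_dim=1), number=n)

print(f"view():    {t_view / n * 1e6:.3f} us per call")
print(f"flatten(): {t_flat / n * 1e6:.3f} us per call")
```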