import torch

n, d = 4, 3  # example sizes

# approach 1: rely on broadcasting
x = torch.randn(n, d).requires_grad_()
y = torch.randn(d)
z = x * y  # y is broadcast across the n rows of x

# approach 2: materialize the repeated tensor explicitly
x = torch.randn(n, d).requires_grad_()
y = torch.randn(d).repeat(n, 1)  # shape (n, d): n copies of y
z = x * y
Hey guys! In the first approach, does PyTorch internally materialize a repeated copy of y, or does broadcasting implement the multiplication without extra memory (especially for what gets saved in the computational graph)? Are there trade-offs between the two approaches?
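For context, here is a small check I ran to compare the memory behavior. It uses expand, which produces a broadcast-style view (stride 0 along the repeated dimension, sharing storage with the original), versus repeat, which allocates a full copy. I'm assuming broadcasting behaves like expand internally, which is what I'd like confirmed:

import torch

n, d = 4, 3
y = torch.randn(d)

# expand returns a view: stride 0 along dim 0 means the same d values
# are reused for every row, with no new allocation
y_view = y.expand(n, d)
print(y_view.stride())                    # (0, 1)
print(y_view.data_ptr() == y.data_ptr())  # True: shares y's storage

# repeat materializes a fresh contiguous (n, d) tensor
y_copy = y.repeat(n, 1)
print(y_copy.stride())                    # (3, 1): contiguous copy
print(y_copy.data_ptr() == y.data_ptr())  # False: new allocation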