def forward(self, x):
    # Two conv -> ReLU -> max-pool stages
    x = self.pool(F.relu(self.conv1(x)))
    x = self.pool(F.relu(self.conv2(x)))
    # Flatten the (batch, 16, 5, 5) feature maps into (batch, 16 * 5 * 5)
    x = x.view(-1, 16 * 5 * 5)
    # Fully connected classifier head
    x = F.relu(self.fc1(x))
    x = F.relu(self.fc2(x))
    x = self.fc3(x)
    return x
In a model's forward pass, if we use tensor transformations such as view (as shown in the example), transpose, permute, or reshape, does this affect the grad values?
If not, how is the mapping between the original and the transformed tensor maintained?
Yes, all of the mentioned operations will be tracked by Autocrat, so during the backward pass the gradients will be passed back to the corresponding values of the original tensor.
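A minimal sketch of this (the shapes and variable names are just illustrative): the only ops between the input and the loss are permute and view, and the gradient still lands on the original tensor in its original layout.

import torch

# Leaf tensor we want gradients for
x = torch.randn(2, 3, 4, requires_grad=True)

# A chain of pure layout ops: permute, then flatten with view
y = x.permute(0, 2, 1)            # shape (2, 4, 3)
z = y.contiguous().view(2, -1)    # shape (2, 12); view needs contiguous memory

# Simple scalar loss so we can call backward()
loss = z.sum()
loss.backward()

print(x.grad.shape)               # torch.Size([2, 3, 4]) -- original layout
print(torch.all(x.grad == 1))     # tensor(True): every element contributed 1 to the sum

Each of these ops records a grad_fn, and its backward just applies the inverse index mapping, so the incoming gradient is scattered back into the original layout; no learnable values are involved.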
@ptrblck As a matter of interest, are there any operations that AREN’T tracked by Autocrat that we should keep an eye out for? I work under the assumption that ANY function/processing bundled in PyTorch is tracked - is that a reasonable assumption?
Yes, I think this is generally a valid assumption and should be true as long as you don’t use e.g. the .data attribute of a tensor. If the operation is not differentiable or not implemented, Autograd would raise an error.
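As a rough illustration of both caveats (a sketch; the exact values and error message may differ between PyTorch versions):

import torch

# 1) In-place changes through .data bypass Autograd's bookkeeping
w = torch.ones(3, requires_grad=True)
y = (w * w).sum()         # backward needs the forward-time values of w

w.data.mul_(10)           # not tracked, and no error is raised
y.backward()
print(w.grad)             # silently wrong: computed from the modified values,
                          # not from the values actually used in the forward pass

# 2) Non-differentiable ops simply end the graph
x = torch.randn(5, requires_grad=True)
idx = torch.argmax(x)     # integer-valued output, no grad_fn
print(idx.requires_grad)  # False

try:
    idx.float().backward()
except RuntimeError as e:
    print(e)              # element 0 of tensors does not require grad ...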
Also, interesting autocorrect when trying to type “Autograd”, but I think in the end my smartphone might know who has the absolute power in PyTorch land.