When I wrap b (which just does some indexing operations) in torch.no_grad(), I get the error RuntimeError: No grad accumulator for a saved leaf!. It works when I don't wrap it, or when I change forward to concatenate the inputs before calling self.fc (a fully connected layer; the input dimension is adjusted accordingly when I do that).
What's going on? And is wrapping with torch.no_grad() the right approach when I don't want certain operations to affect the gradient?
Sorry, I should have been clearer: a, b and c are not variables but each a group of operations (either a plain function or an nn.Module's forward). The point I was trying to make is that I first do operations where I want the gradient to be tracked (in a), then some where I don't want it tracked (b), then I want it tracked again (c).
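To make the structure concrete, here is a minimal sketch; the shapes, the layer size, and the bodies of a, b and c are placeholders rather than my actual code:

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(16, 4)  # the fully connected layer used in c

    def a(self, x):
        # operations where I want the gradient to be tracked
        return x * 2

    def b(self, x):
        # indexing-only operations where I don't want gradient tracking
        return x[:, :16]

    def c(self, x):
        # from here on the gradient should be tracked again
        return self.fc(x)

    def forward(self, x):
        x = self.a(x)
        with torch.no_grad():  # wrapping b like this is what triggers the error on my setup
            x = self.b(x)
        return self.c(x)

net = Net()
out = net(torch.randn(8, 32, requires_grad=True))
out.sum().backward()
```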