I create blocks of layers and combine them in a tree structure. The blocks are stored in a list:
    blocks = [block0, block1, block2]
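For context, the blocks are ordinary sub-modules held on the model. A minimal sketch of my setup (the Linear layers are only placeholders for my real blocks, and the class name is just for illustration):

    import torch.nn as nn

    class TreeModel(nn.Module):  # hypothetical name, for illustration only
        def __init__(self):
            super().__init__()
            # ModuleList so each block's parameters are registered with the model
            self.blocks = nn.ModuleList([
                nn.Linear(100, 100),  # block0 (placeholder)
                nn.Linear(100, 10),   # block1 (placeholder)
                nn.Linear(100, 10),   # block2 (placeholder)
            ])
            self.nodes = len(self.blocks)  # 3
            self.outputs = []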
The forward function then uses these blocks hierarchically:
             -> block1 -> output1
    block0 -|
             -> block2 -> output2
When I implement this without recursion, it works:
    def forward(self, x):
        x = self.blocks[0](x)    # block0
        y1 = self.blocks[1](x)   # block1
        y2 = self.blocks[2](x)   # block2
        self.outputs = [y1, y2]
        return self.outputs
But when I implement it with recursion, it raises a RuntimeError:
    def forward(self, x):
        x = self.blocks[0](x)  # root: block0
        self.traverse(0, x)
        return self.outputs

    def traverse(self, now, x):
        leaves = True
        if now * 2 + 1 < self.nodes:  # self.nodes for this case is 3
            y = self.blocks[now * 2 + 1](x)
            self.traverse(now * 2 + 1, y)
            leaves = False
        if now * 2 + 2 < self.nodes:
            y = self.blocks[now * 2 + 2](x)
            self.traverse(now * 2 + 2, y)
            leaves = False
        if leaves:
            self.outputs.append(x)
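The traverse method treats the list as an implicit binary tree: node i's children sit at indices 2*i + 1 and 2*i + 2. A tiny standalone helper, purely to illustrate that indexing (not part of my model):

    # Hypothetical helper, only to show the implicit-tree indexing used above.
    def children(i, nodes=3):
        return [c for c in (2 * i + 1, 2 * i + 2) if c < nodes]

    print(children(0))  # [1, 2]  -> block0 feeds block1 and block2
    print(children(1))  # []      -> block1 is a leaf, its output is appended
    print(children(2))  # []      -> block2 is a leaf, its output is appended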
It raises this error:
    RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [100, 10]], which is output 0 of AsStridedBackward0, is at version 2; expected version 1 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
My training loop:
    model.train()
    with torch.autograd.detect_anomaly():
        for i in range(epoch):
            for image, label in tqdm(trainloaders):
                image = image.to(device)
                label = label.to(device)
                out = model(image)
                out = torch.mean(torch.stack(out), dim=0)
                loss = criterion(out, label)
                optimizer.zero_grad()
                loss.backward(retain_graph=True)
                optimizer.step()
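For reference, and unrelated to my actual model, here is a minimal standalone snippet that raises the same class of error; it is only meant to illustrate what the "version" in the message refers to:

    import torch

    x = torch.ones(3, requires_grad=True)
    y = x.exp()         # autograd saves y itself to compute exp's backward
    y += 1              # in-place op bumps y's version counter
    y.sum().backward()  # RuntimeError: ... modified by an inplace operation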
Can someone help me understand why this error happens, please?