Hello
I’m trying to build a system that performs action recognition, using a combination of a CNN and an RNN. For the CNN I’m using ShuffleNet v2. I replaced the model’s fc layer with my custom nn.Sequential:
rcnn_layer = nn.Sequential(
    nn.Linear(1024, 128),
    View((1, 128)),
    InputModifier(10),
    nn.RNN(input_size=128, hidden_size=64),
    GetLastHidden(),
    nn.Linear(64, 3),
    nn.Softmax(dim=-1)
)
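For reference, the replacement itself looks something like this (ShuffleNet v2 x1.0 from torchvision, whose fc takes 1024 input features):
import torchvision

model = torchvision.models.shufflenet_v2_x1_0(pretrained=True)
model.fc = rcnn_layer  # swap the original classifier for the CNN+RNN head above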
The custom classes are (simplified sketches of View and GetLastHidden follow below):
View - reshapes the output of the first linear layer so that it fits the RNN
InputModifier - retains the previous outputs of the model
GetLastHidden - returns the last hidden state of the RNN
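Roughly, the two helpers look like this (simplified):
class View(nn.Module):
    def __init__(self, shape):
        super(View, self).__init__()
        self.shape = shape

    def forward(self, x):
        # reshape the linear layer's output into the shape the RNN expects
        return x.view(*self.shape)

class GetLastHidden(nn.Module):
    def forward(self, rnn_out):
        # nn.RNN returns (output, h_n); keep only the hidden state of the last time step
        output, h_n = rnn_out
        return output[-1]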
In the future I’d like to run the model on a phone (somehow), so I’m trying to make it as efficient as I can. To make it run faster, my InputModifier class retains the previous outputs of the CNN, so the CNN doesn’t have to re-process past frames. Here’s the code:
class InputModifier(nn.Module):
    prev = []  # feature vectors from previous forward passes

    def __init__(self, max_seq_len):
        assert max_seq_len != 0, '`max_seq_len` cannot be 0.'
        super(InputModifier, self).__init__()
        self.max_seq_len = max_seq_len

    def forward(self, x):
        self.prev.append(x)
        if self.max_seq_len > 0:
            # keep only the most recent `max_seq_len` feature vectors
            self.prev = self.prev[-self.max_seq_len:]
        inp = torch.cat(self.prev)
        return inp
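To illustrate the intended behaviour: each call appends the new feature vector and returns the (truncated) sequence so far, e.g.:
mod = InputModifier(10)
for t in range(3):
    feat = torch.randn(1, 128)   # stand-in for one frame's CNN features
    seq = mod(feat)
    print(seq.shape)             # torch.Size([1, 128]), then [2, 128], then [3, 128]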
The issue is that I can’t get it to train. When I run loss.backward(), I get:

RuntimeError: Trying to backward through the graph a second time, but the saved intermediate results have already been freed.

If I call loss.backward(retain_graph=True) instead, I get:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1024, 128]], which is output 0 of TBackward, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
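Following the hint in the second error, anomaly detection can be turned on before the training loop to locate the failing operation:
import torch

# makes autograd report which forward op produced the failing gradient
torch.autograd.set_detect_anomaly(True)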
PS: The training code:
model = model.to(device)
optim = torch.optim.Adam(lr=0.01, params=model.parameters())
criterion = nn.CrossEntropyLoss()
model.train()

for i, (img, c) in enumerate(dataloader):
    out = model(img)
    loss = criterion(out, c.view(1))
    optim.zero_grad()
    loss.backward()
    optim.step()
Any advice is welcome. Thanks.