I am trying to write a biLSTM by hand. I create a number of tensors and store every timestep's values in preallocated tensor matrices. When I run
loss.backward()
I get an error like
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [50, 1]], which is output 0 of SelectBackward, is at version 21; expected version 20 instead.
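As far as I can tell, the error can be reproduced with nothing more than an indexed write into a preallocated buffer inside a loop; here is a toy example (the names `w` and `buf` are placeholders, not my real variables):

```python
import torch

torch.manual_seed(0)
w = torch.randn(1, requires_grad=True)
buf = torch.randn(3, 1)           # preallocated buffer, like h_forward / C_forward
for k in range(3):
    buf[k] = w * buf[k - 1]       # index assignment is an in-place write: it bumps buf's version counter

loss = buf.sum()
err = None
try:
    loss.backward()
except RuntimeError as e:
    err = e                       # "...modified by an inplace operation..."
```

The multiplication saves a view of `buf` for the backward pass; the next indexed assignment modifies `buf` in place, so the saved view's version no longer matches and `backward()` raises the same `RuntimeError`.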
def lstm(index):
    global Wf_forward, Wi_forward, Wc_forward, Wo_forward
    global bf_forward, bi_forward, bc_forward, bo_forward
    global Wf_backward, Wi_backward, Wc_backward, Wo_backward
    global bf_backward, bi_backward, bc_backward, bo_backward
    global Ws_backward, Ws_forward
    h_forward = torch.randn(lstm_len + 2, Dh, 1)  # I set lstm_len = 20, Dh = 50
    f_forward = torch.randn(lstm_len + 2, Dh, 1)
    i_forward = torch.randn(lstm_len + 2, Dh, 1)
    c_forward = torch.randn(lstm_len + 2, Dh, 1)
    C_forward = torch.randn(lstm_len + 2, Dh, 1)
    o_forward = torch.randn(lstm_len + 2, Dh, 1)
    for i in range(lstm_len + 1):
        k_index = index - lstm_len + i
        k = i + 1
        index_k = np.array([k_index])
        _x = get_x(exercise_list, index_k, 1)
        f_forward[k] = torch.sigmoid(torch.mm(Wf_forward, torch.cat((h_forward[k - 1], _x), 0)) + bf_forward)
        i_forward[k] = torch.sigmoid(torch.mm(Wi_forward, torch.cat((h_forward[k - 1], _x), 0)) + bi_forward)
        c_forward[k] = torch.tanh(torch.mm(Wc_forward, torch.cat((h_forward[k - 1], _x), 0)) + bc_forward)
        C_forward[k] = f_forward[k] * C_forward[k - 1] + i_forward[k] * c_forward[k]
        o_forward[k] = torch.sigmoid(torch.mm(Wo_forward, torch.cat((h_forward[k - 1], _x), 0)) + bo_forward)
        h_forward[k] = o_forward[k] * torch.tanh(C_forward[k])
    h_backward = torch.randn(lstm_len + 2, Dh, 1)
    f_backward = torch.randn(lstm_len + 2, Dh, 1)
    i_backward = torch.randn(lstm_len + 2, Dh, 1)
    c_backward = torch.randn(lstm_len + 2, Dh, 1)
    C_backward = torch.randn(lstm_len + 2, Dh, 1)
    o_backward = torch.randn(lstm_len + 2, Dh, 1)
    for i in range(lstm_len + 1):
        k_index = index + lstm_len - i
        k = i + 1
        index_k = np.array([k_index])
        _x = get_x(exercise_list, index_k, 1)
        f_backward[k] = torch.sigmoid(torch.mm(Wf_backward, torch.cat((h_backward[k - 1], _x), 0)) + bf_backward)
        i_backward[k] = torch.sigmoid(torch.mm(Wi_backward, torch.cat((h_backward[k - 1], _x), 0)) + bi_backward)
        c_backward[k] = torch.tanh(torch.mm(Wc_backward, torch.cat((h_backward[k - 1], _x), 0)) + bc_backward)
        C_backward[k] = f_backward[k] * C_backward[k - 1] + i_backward[k] * c_backward[k]
        o_backward[k] = torch.sigmoid(torch.mm(Wo_backward, torch.cat((h_backward[k - 1], _x), 0)) + bo_backward)
        h_backward[k] = o_backward[k] * torch.tanh(C_backward[k])
    yt = torch.softmax(torch.mm(Ws_forward, h_forward[lstm_len + 1]) + torch.mm(Ws_backward, h_backward[lstm_len + 1]), dim=0, dtype=torch.float32)
    loss = torch.mean(Y * torch.log(yt))
    Loss.append(loss)
    loss.backward()
I understand that I am overwriting the values of several of these tensor matrices, so in the end autograd can't compute the gradient. However, I only use the matrices to store the per-timestep results; I never intend to change values that are needed later. How can I solve this problem?
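The standard workaround, as I understand it, is to collect each timestep's output in a plain Python list and `torch.stack` the list at the end, so autograd never sees an in-place write into a tensor it has saved. A minimal sketch (the shapes and the toy recurrence are placeholders, not my actual model):

```python
import torch

torch.manual_seed(0)
Dh, steps = 4, 5
W = torch.randn(Dh, Dh, requires_grad=True)
h = torch.zeros(Dh, 1)            # initial hidden state
hs = []                           # Python list instead of a preallocated tensor
for _ in range(steps):
    h = torch.tanh(W @ h + 0.1)   # a brand-new tensor each step, no indexed assignment
    hs.append(h)

h_all = torch.stack(hs)           # shape (steps, Dh, 1), fully differentiable
loss = h_all.sum()
loss.backward()                   # no in-place RuntimeError
```

Each iteration builds a new tensor instead of writing into a shared buffer, so every value autograd saved stays at the version it was saved at.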
Thanks for any response.