LSTM's autograd error about an inplace operation

I am trying to write a BiLSTM from scratch.
I create several tensors and store all of the per-step values in preallocated tensor matrices.
When I run

loss.backward()

I get the following error:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [50, 1]], which is output 0 of SelectBackward, is at version 21; expected version 20 instead.

Here is my code:
import torch
import numpy as np

def lstm(index):
    global Wf_forward, Wi_forward, Wc_forward, Wo_forward
    global bf_forward, bi_forward, bc_forward, bo_forward
    global Wf_backward, Wi_backward, Wc_backward, Wo_backward
    global bf_backward, bi_backward, bc_backward, bo_backward
    global Ws_backward, Ws_forward
    # preallocate storage for the forward direction; the slots are filled in place below
    h_forward = torch.randn(lstm_len + 2, Dh, 1) # I set lstm_len = 20, Dh = 50
    f_forward = torch.randn(lstm_len + 2, Dh, 1)
    i_forward = torch.randn(lstm_len + 2, Dh, 1)
    c_forward = torch.randn(lstm_len + 2, Dh, 1)
    C_forward = torch.randn(lstm_len + 2, Dh, 1)
    o_forward = torch.randn(lstm_len + 2, Dh, 1)
    for i in range(lstm_len + 1): # I set lstm_len = 20
        k_index = index - lstm_len + i
        k = i + 1
        index_k = np.array([k_index])
        _x = get_x(exercise_list, index_k, 1)
        f_forward[k] = torch.sigmoid( torch.mm(Wf_forward, torch.cat((h_forward[k-1], _x), 0)) + bf_forward )
        i_forward[k] = torch.sigmoid( torch.mm(Wi_forward, torch.cat((h_forward[k-1], _x), 0)) + bi_forward )
        c_forward[k] = torch.tanh( torch.mm(Wc_forward, torch.cat((h_forward[k-1], _x), 0)) + bc_forward )
        C_forward[k] = f_forward[k] * C_forward[k-1] + i_forward[k] * c_forward[k]
        o_forward[k] = torch.sigmoid( torch.mm(Wo_forward, torch.cat((h_forward[k-1], _x), 0)) + bo_forward )
        h_forward[k] = o_forward[k] * torch.tanh(C_forward[k])
    # preallocate storage for the backward (reverse-direction) pass; also filled in place
    h_backward = torch.randn(lstm_len + 2, Dh, 1)
    f_backward = torch.randn(lstm_len + 2, Dh, 1)
    i_backward = torch.randn(lstm_len + 2, Dh, 1)
    c_backward = torch.randn(lstm_len + 2, Dh, 1)
    C_backward = torch.randn(lstm_len + 2, Dh, 1)
    o_backward = torch.randn(lstm_len + 2, Dh, 1)
    for i in range(lstm_len + 1):
        k_index = index + lstm_len - i
        k = i + 1
        index_k = np.array([k_index])
        _x = get_x(exercise_list, index_k, 1)
        f_backward[k] = torch.sigmoid(torch.mm(Wf_backward, torch.cat((h_backward[k - 1], _x), 0)) + bf_backward)
        i_backward[k] = torch.sigmoid(torch.mm(Wi_backward, torch.cat((h_backward[k - 1], _x), 0)) + bi_backward)
        c_backward[k] = torch.tanh(torch.mm(Wc_backward, torch.cat((h_backward[k - 1], _x), 0)) + bc_backward)
        C_backward[k] = f_backward[k] * C_backward[k - 1] + i_backward[k] * c_backward[k]
        o_backward[k] = torch.sigmoid(torch.mm(Wo_backward, torch.cat((h_backward[k - 1], _x), 0)) + bo_backward)
        h_backward[k] = o_backward[k] * torch.tanh(C_backward[k])
    # combine the final forward and backward hidden states into the output distribution
    yt = torch.softmax(torch.mm(Ws_forward, h_forward[lstm_len + 1]) + torch.mm(Ws_backward, h_backward[lstm_len + 1]), dim=0, dtype=torch.float32)
    loss = torch.mean(Y * torch.log(yt))
    Loss.append(loss)
    loss.backward()
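
The storage pattern boils down to something like this much smaller example, which (I believe) triggers the same error; the names here are only for illustration:

import torch

w = torch.randn(1, requires_grad=True)
h = torch.zeros(4, 1)  # preallocated storage, like h_forward above
for k in range(1, 4):
    # each step reads the previous slot and overwrites the current slot in place
    h[k] = torch.tanh(w * h[k - 1] + 1.0)
loss = h[3].sum()
loss.backward()  # RuntimeError: ... modified by an inplace operation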

I know that I am modifying the values of several of these tensor matrices in place, so in the end autograd can't compute the gradient. However, I only use the matrices to store intermediate results; I never intend to change values that are still needed. How can I solve this problem?
Thanks in advance for any response.

Do you have a copy-paste error, or are you really executing the for loop over lstm_len + 1 twice?
If it's not an error, then the second loop would overwrite the values, which would raise this error.
Could you explain the use case in which you need to create these values twice?
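
If you need to keep the per-step results around for the loss, the usual approach is to append them to a Python list and torch.stack them afterwards, so that no tensor autograd has already saved gets modified in place. A rough sketch with made-up shapes and weights (not your actual model):

import torch

Dh, steps = 50, 20
W = torch.randn(Dh, Dh, requires_grad=True)
b = torch.randn(Dh, 1, requires_grad=True)

h = torch.zeros(Dh, 1)  # initial hidden state
outputs = []            # collect per-step states instead of writing into a preallocated tensor
for _ in range(steps):
    # out-of-place update: each step creates a brand-new tensor
    h = torch.tanh(torch.mm(W, h) + b)
    outputs.append(h)

hs = torch.stack(outputs)  # shape (steps, Dh, 1), safe to use in a loss
loss = hs.mean()
loss.backward()            # no inplace error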

I am implementing a BiLSTM, so I execute the loop twice: one loop is the forward pass and the other is the backward (reverse-direction) pass. The parameters are different, though, e.g. f_forward vs. f_backward.
Maybe there is a better way to write this code that I don't know about, since I'm a beginner in PyTorch.
Thanks a lot for your response.

Ah, I missed the naming, so thanks for the information.
Could you post an executable code snippet that yields this error (using random inputs)?
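
As a side note: if you don't strictly need to implement the cell yourself, torch.nn.LSTM already supports bidirectional=True and avoids all of this manual bookkeeping. A minimal sketch with made-up sizes (not your actual data pipeline):

import torch
import torch.nn as nn

Dh, input_size, seq_len, batch = 50, 50, 20, 1   # illustrative sizes only
lstm = nn.LSTM(input_size=input_size, hidden_size=Dh, bidirectional=True)
proj = nn.Linear(2 * Dh, Dh)                     # merges the forward and backward directions

x = torch.randn(seq_len, batch, input_size)      # (seq_len, batch, input_size)
out, (h_n, c_n) = lstm(x)                        # out: (seq_len, batch, 2 * Dh)
yt = torch.softmax(proj(out[-1]), dim=-1)        # prediction from the last time step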