CRNN: one of the variables needed for gradient computation has been modified by an inplace operation

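I think the backward call is what raises this error: you are using retain_graph=True while also updating the parameters. The retained graph still holds the parameter values it saved for the gradient computation, so once an update modifies them in place, the next backward through that graph fails. This could be a similar issue to this one.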
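Here is a minimal sketch of what I mean. It is not your CRNN: the two-layer model, the SGD optimizer, and the helper names `stale_graph_error` and `fresh_graph_ok` are assumptions made up for illustration. The first function reuses a retained graph after a parameter update, the second recomputes the forward pass on every iteration.

```python
import torch


def make_model():
    # Hypothetical stand-in for the real model: two small linear layers.
    torch.manual_seed(0)
    lin1 = torch.nn.Linear(4, 4)
    lin2 = torch.nn.Linear(4, 1)
    params = list(lin1.parameters()) + list(lin2.parameters())
    opt = torch.optim.SGD(params, lr=0.1)
    return lin1, lin2, opt


def stale_graph_error():
    """Backward twice through one graph with a parameter update in between."""
    lin1, lin2, opt = make_model()
    x = torch.randn(2, 4)
    out = lin2(lin1(x)).sum()        # graph is built once, here
    out.backward(retain_graph=True)  # first backward keeps the graph alive
    opt.step()                       # updates the parameters in place
    try:
        out.backward()               # graph still refers to the pre-update weights
    except RuntimeError as err:
        return str(err)
    return None


def fresh_graph_ok():
    """Recompute the forward pass each iteration, so no graph is reused."""
    lin1, lin2, opt = make_model()
    x = torch.randn(2, 4)
    for _ in range(2):
        opt.zero_grad()
        out = lin2(lin1(x)).sum()    # new graph built from the current weights
        out.backward()               # no retain_graph needed
        opt.step()
    return True


if __name__ == "__main__":
    print(stale_graph_error())
    print(fresh_graph_ok())
```

If your code really needs a second backward through the same graph, it would have to run before the parameter update rather than after it. Otherwise dropping retain_graph=True and rerunning the forward pass, as in the second function, is usually the simpler option.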