I have an RNN model. I did something like
self.lstm.flatten_parameters()
with DataParallel on multiple GPUs in order to eliminate this user warning:
UserWarning: RNN module weights are not part of single contiguous chunk of memory. This means they need to be compacted at every call, possibly greatly increasing memory usage. To compact weights again call flatten_parameters().
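Roughly, the setup looks like this (a simplified sketch, not my exact code; the class name and sizes are placeholders):

```python
import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self, input_size=10, hidden_size=20):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)

    def forward(self, x):
        # Called inside forward() so it runs on each DataParallel replica
        self.lstm.flatten_parameters()
        out, _ = self.lstm(x)
        return out

model = nn.DataParallel(Model().cuda())
x = torch.randn(8, 5, 10).cuda()  # (batch, seq_len, input_size)
out = model(x)  # the error below is raised here when running on multiple GPUs
```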
However, with that change I got this error:
RuntimeError: torch/csrc/autograd/variable.cpp:115: get_grad_fn: Assertion `output_nr == 0` failed.
Any ideas on this? Is it a bug in PyTorch?