When using DataParallel to train my LSTM on multiple GPUs, I get this warning:
anaconda3/lib/python3.5/site-packages/torch/nn/parallel/parallel_apply.py:42: UserWarning: RNN module weights are not part of single contiguous chunk of memory. This means they need to be compacted at every call, possibly greatly increasing memory usage. To compact weights again call flatten_parameters().
output = module(*input, **kwargs)
But I already call flatten_parameters() in my forward function. Here's my code:
def forward(self, x):
    x = self.share.forward(x)
    x = x.view(-1, 2048)
    x = x.view(-1, sequence_length, lstm_in_dim)
    x = x.permute(1, 0, 2)  # reorder to (seq_len, batch, features), the layout nn.LSTM expects
    self.lstm.flatten_parameters()  # compact the LSTM weights into one contiguous block
    y, self.hidden = self.lstm(x, self.hidden)
    self.lstm.flatten_parameters()
    y = y.contiguous().view(1, sequence_length, -1, lstm_out_dim)
    y = y.permute(0, 2, 1, 3).contiguous()
    y = y.view((-1, lstm_out_dim))
    y = self.bn(y)
    y = self.fc(y)
    return y
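For context, this is roughly how I wrap and call the model. It's only a simplified sketch, not my real training loop; Net, the layer sizes, and the dummy batch shape below are placeholders I made up for illustration:

import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # placeholder LSTM; the real model also has the CNN, BN and FC layers shown above
        self.lstm = nn.LSTM(input_size=128, hidden_size=64, batch_first=True)

    def forward(self, x):
        self.lstm.flatten_parameters()  # same call as in my forward() above
        y, _ = self.lstm(x)
        return y

model = nn.DataParallel(Net()).cuda()  # replicate the module across all visible GPUs
x = torch.randn(8, 16, 128).cuda()     # dummy batch: (batch, seq_len, input_size)
y = model(x)                           # DataParallel splits the batch and runs forward() on each replica;
                                       # the traceback above points to this parallel_apply step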
I am new to PyTorch. Can anyone help me with this problem?