I am building a simple many-to-many character-level LSTM and I have come across some very peculiar behavior that I am hoping someone can shed light on. Here is the code for the LSTM class:
import torch.nn as nn


class CharLSTM(nn.Module):
    def __init__(self, vocab_size, batch_size, sequence_length, hidden_dim, n_layers, drop_p):
        super(CharLSTM, self).__init__()

        # store the dimensions
        self.hidden_dim = hidden_dim
        self.vocab_size = vocab_size
        self.batch_size = batch_size
        self.sequence_length = sequence_length
        self.dropout = nn.Dropout(p=drop_p)

        # define the lstm
        self.lstm = nn.LSTM(input_size=vocab_size,
                            hidden_size=hidden_dim,
                            num_layers=n_layers,
                            batch_first=True)

        # define the fully connected layer
        self.fc = nn.Linear(in_features=hidden_dim, out_features=vocab_size)

    def forward(self, x):
        # output of the lstm
        x, hidden = self.lstm(x)

        # apply dropout to the lstm output
        x = self.dropout(x)

        # flatten the output from the lstm
        x = x.view(self.sequence_length * self.batch_size, self.hidden_dim)

        # compute the scores (logits) over the vocabulary
        scores = self.fc(x)

        return scores
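For context, this is roughly how the model gets called; the hyperparameters and the random one-hot batch below are just placeholders, not my actual data pipeline:

import torch

# placeholder hyperparameters, purely illustrative
vocab_size, batch_size, sequence_length = 50, 8, 100

model = CharLSTM(vocab_size, batch_size, sequence_length,
                 hidden_dim=256, n_layers=2, drop_p=0.5)

# fake batch of character indices, one-hot encoded to shape (batch, seq, vocab)
idx = torch.randint(0, vocab_size, (batch_size, sequence_length), dtype=torch.long)
x = torch.zeros(batch_size, sequence_length, vocab_size)
x.scatter_(2, idx.unsqueeze(-1), 1.0)

scores = model(x)  # shape: (sequence_length * batch_size, vocab_size)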
Here is the thing: if I remove the x = self.dropout(x) call on the LSTM's output, PyTorch raises an exception:
RuntimeError: invalid argument 2: view size is not compatible with input tensor’s size and stride (at least one dimension spans across two contiguous subspaces). Call .contiguous() before .view(). at c:\programdata\miniconda3\conda-bld\pytorch-cpu_1524541161962\work\aten\src\th\generic/THTensor.cpp:280
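To rule out my data pipeline, here is about the smallest standalone snippet I could reduce it to (the dimensions are made-up toy numbers); it shows the same pattern:

import torch
import torch.nn as nn

# toy dimensions, purely illustrative
vocab_size, batch_size, sequence_length, hidden_dim = 5, 2, 3, 4

lstm = nn.LSTM(input_size=vocab_size, hidden_size=hidden_dim,
               num_layers=2, batch_first=True)
dropout = nn.Dropout(p=0.5)

x = torch.randn(batch_size, sequence_length, vocab_size)
out, hidden = lstm(x)

print(out.is_contiguous())           # False here: the batch_first output is not laid out contiguously
print(dropout(out).is_contiguous())  # True here: dropout returns a freshly allocated tensor

# this works after dropout...
dropout(out).view(sequence_length * batch_size, hidden_dim)
# ...but this line raises the RuntimeError quoted above
out.view(sequence_length * batch_size, hidden_dim)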
With the dropout in place, everything works just fine. If someone could explain why the dropout call makes a difference here, I would highly appreciate it.