Cannot apply dropout on GRU layer

Why can’t I use dropout on a GRU layer? I keep getting the following error:

AttributeError                            Traceback (most recent call last)
<ipython-input-37-ffcdd5187a2c> in <module>()
----> 1 pred = net(X)

~/miniconda3/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    323         for hook in self._forward_pre_hooks.values():
    324             hook(self, input)
--> 325         result = self.forward(*input, **kwargs)
    326         for hook in self._forward_hooks.values():
    327             hook_result = hook(self, input, result)

<ipython-input-32-82932b82b557> in forward(self, x)
     10     def forward(self, x):
     11         x = self.layer1(x)
---> 12         x = self.layer2(x)
     13         x = self.fc(x)
     14

~/miniconda3/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    323         for hook in self._forward_pre_hooks.values():
    324             hook(self, input)
--> 325         result = self.forward(*input, **kwargs)
    326         for hook in self._forward_hooks.values():
    327             hook_result = hook(self, input, result)

~/miniconda3/lib/python3.6/site-packages/torch/nn/modules/dropout.py in forward(self, input)
     44
     45     def forward(self, input):
---> 46         return F.dropout(input, self.p, self.training, self.inplace)
     47
     48     def __repr__(self):

~/miniconda3/lib/python3.6/site-packages/torch/nn/functional.py in dropout(input, p, training, inplace)
    524
    525 def dropout(input, p=0.5, training=False, inplace=False):
--> 526     return _functions.dropout.Dropout.apply(input, p, training, inplace)
    527
    528

~/miniconda3/lib/python3.6/site-packages/torch/nn/_functions/dropout.py in forward(cls, ctx, input, p, train, inplace)
     30             output = input
     31         else:
---> 32             output = input.clone()
     33
     34         if ctx.p > 0 and ctx.train:

AttributeError: 'tuple' object has no attribute 'clone'

Here’s a simple example:

nn.Sequential(
    nn.GRU(10, 204),
    nn.Dropout(0.5),
    nn.Linear(204, 10)
)

What am I doing wrong here?

nn.GRU outputs a tuple (output, hidden), so the nn.Dropout that follows it receives a tuple rather than a tensor. Instead of using Sequential, please write out the forward pass yourself.
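You can see this directly; a quick sketch (the shapes are just illustrative):

import torch
import torch.nn as nn

gru = nn.GRU(10, 204)
out = gru(torch.randn(5, 1, 10))  # input shape: (seq_len, batch, input_size)
print(type(out))  # <class 'tuple'>: (output, hidden), which is what nn.Dropout chokes on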

Can you give me a simple example of how to transform the above example to make it work?
Thanks!

def __init__(self, ....):
    ....
    self.gru = nn.GRU(10, 204)
    self.dropout = nn.Dropout(0.5)
    self.fc = nn.Linear(204, 10)

def forward(self, x):
    x, hidden = self.gru(x)  # unpack the (output, hidden) tuple and keep only the output tensor
    x = self.dropout(x)
    return self.fc(x)
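Put together, a minimal self-contained sketch plus a forward pass (the class name Net and the input shape are just for illustration):

import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.gru = nn.GRU(10, 204)
        self.dropout = nn.Dropout(0.5)
        self.fc = nn.Linear(204, 10)

    def forward(self, x):
        x, hidden = self.gru(x)  # drop the hidden state, keep the output tensor
        x = self.dropout(x)
        return self.fc(x)

net = Net()
X = torch.randn(5, 3, 10)  # (seq_len, batch, input_size)
pred = net(X)              # shape: (5, 3, 10)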

Ah, got it! Thank you. It would be nice though if these things were automatically taken care of, for instance in a sequential model. May I ask another question, not completely related to the above one? Since I’m not a pro PyTorch user, maybe you might know: does PyTorch offer dynamic unrolling of inputs, something equivalent to TensorFlow’s tf.nn.dynamic_rnn?

Since PyTorch uses dynamic graphs, everything is dynamic. If you are asking about applying the same RNN to variable-length inputs, then yes.
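For example, a minimal sketch of running one GRU over variable-length sequences with pack_padded_sequence (the sizes and lengths here are just illustrative):

import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

gru = nn.GRU(10, 204)

# Two sequences of lengths 5 and 3, zero-padded to the same max length
lengths = [5, 3]
padded = torch.zeros(5, 2, 10)  # (max_seq_len, batch, input_size)
padded[:5, 0] = torch.randn(5, 10)
padded[:3, 1] = torch.randn(3, 10)

packed = pack_padded_sequence(padded, lengths)   # lengths sorted in decreasing order
packed_out, hidden = gru(packed)                 # the GRU skips the padded steps
output, out_lengths = pad_packed_sequence(packed_out)  # back to (max_seq_len, batch, 204)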


@SimonW thanks a lot for the response.
As I am not a PyTorch expert, I am trying to translate the following TensorFlow code into PyTorch:

def RNN(x, weight, bias):
    cell = rnn_cell.LSTMCell(n_hidden, state_is_tuple=True)
    cell = rnn_cell.MultiRNNCell([cell] * 2)
    output, state = tf.nn.dynamic_rnn(cell, x, dtype=tf.float32)
    output = tf.transpose(output, [1, 0, 2])
    last = tf.gather(output, int(output.get_shape()[0]) - 1)
    return tf.nn.softmax(tf.matmul(last, weight) + bias)

class RNN(nn.Module):
    def __init__(self, input, n_hidden, out):
        super(RNN, self).__init__()
        self.layer1 = nn.LSTMCell(input, n_hidden)
        self.layer2 = nn.RNNCell(n_hidden, n_hidden)
        self.layer3 = nn.RNN([n_hidden] * 2, out)

    def forward(self, x, weight, bias):
        cell, h = self.layer1(x)
        cell, h = self.layer2([cell] * 2)
        cell, h = self.layer3(cell)
        output = cell.transpose(1, 0)
        output = output.gather().shape[0] - 1
        return nn.Softmax(torch.matmul(output, weight) + bias)

Am I translating this correctly?

I’m not an expert in TF, but it seems it is doing a two-layer LSTM. If that is the case, just use one nn.LSTM object in PyTorch.
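Roughly something like this; a sketch assuming the TF code is a 2-layer LSTM whose last time step is fed through a linear layer and a softmax (n_input, n_hidden, n_out are placeholders):

import torch
import torch.nn as nn
import torch.nn.functional as F

class RNN(nn.Module):
    def __init__(self, n_input, n_hidden, n_out):
        super(RNN, self).__init__()
        # num_layers=2 plays the role of MultiRNNCell([cell] * 2)
        self.lstm = nn.LSTM(n_input, n_hidden, num_layers=2)
        self.fc = nn.Linear(n_hidden, n_out)

    def forward(self, x):
        # x: (seq_len, batch, n_input)
        output, (h, c) = self.lstm(x)
        last = output[-1]  # last time step, like the tf.gather(...) line
        return F.softmax(self.fc(last), dim=1)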

Thanks @SimonW, much appreciated! :slight_smile: