Hi,
I have been running into a problem using torchnlp's WeightDrop modules in my model during training. Specifically, after initializing a WeightDropLinear layer, training fails with the following error:
Traceback (most recent call last):
  File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/pydevd.py", line 1438, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/Users/georgestoica/Desktop/icloud_desktop/Research/gcn-over-pruned-trees/train.py", line 189, in <module>
    loss = trainer.update(batch)
  File "/Users/georgestoica/Desktop/icloud_desktop/Research/gcn-over-pruned-trees/model/trainer.py", line 80, in update
    logits, pooling_output = self.model(inputs)
  File "/opt/anaconda3/envs/ENAS-pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/georgestoica/Desktop/icloud_desktop/Research/gcn-over-pruned-trees/model/gcn.py", line 29, in forward
    outputs, pooling_output = self.gcn_model(inputs)
  File "/opt/anaconda3/envs/ENAS-pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/georgestoica/Desktop/icloud_desktop/Research/gcn-over-pruned-trees/model/gcn.py", line 130, in forward
    tree_encodings, pool_mask = self.tree_lstm_wrapper(tree_adj, inputs, max_depth, max_bottom_offset)
  File "/opt/anaconda3/envs/ENAS-pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/georgestoica/Desktop/icloud_desktop/Research/gcn-over-pruned-trees/model/gcn.py", line 230, in forward
    lstm_inputs = tree_lstm(lstm_inputs, trees, mask, max_depth)
  File "/opt/anaconda3/envs/ENAS-pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/georgestoica/Desktop/icloud_desktop/Research/gcn-over-pruned-trees/model/tree_lstm.py", line 129, in forward
    child_mask
  File "/Users/georgestoica/Desktop/icloud_desktop/Research/gcn-over-pruned-trees/model/tree_lstm.py", line 82, in step
    h_iou = self.h_iou(h_j)  # [B,T1,3H]
  File "/opt/anaconda3/envs/ENAS-pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/anaconda3/envs/ENAS-pytorch/lib/python3.7/site-packages/torchnlp/nn/weight_drop.py", line 22, in forward
    setattr(module, name_w, w)
  File "/opt/anaconda3/envs/ENAS-pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 604, in __setattr__
    .format(torch.typename(value), name))
TypeError: cannot assign 'torch.FloatTensor' as parameter 'weight' (torch.nn.Parameter or None expected)
From the error, it seems the problem is caused by this logic in the forward function in torchnlp/nn/weight_drop.py:
def forward(*args, **kwargs):
    for name_w in weights:
        raw_w = getattr(module, name_w + '_raw')
        w = torch.nn.functional.dropout(raw_w, p=dropout, training=module.training)
        setattr(module, name_w, w)  # line 22 (my comment)
    return original_module_forward(*args, **kwargs)
Line 22 attempts to assign "w" as the linear layer's weight. However, torch.nn.functional.dropout() returns a plain torch.FloatTensor rather than an nn.Parameter, so the assignment cannot replace the layer's registered weight parameter, and nn.Module.__setattr__ raises the TypeError above.
One immediate fix I thought of is changing line 22 to:
setattr(module, name_w, Parameter(w))
However, I am unsure of the implications of this during training. Is this a valid solution, or is there something else that would be better?
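One reason I am hesitant about the Parameter(w) fix: as far as I can tell, constructing a new Parameter from a non-leaf tensor detaches it from the autograd graph, so the raw weight would stop receiving gradients. A small standalone sketch of what I mean (not using torchnlp, and assuming I understand Parameter construction correctly):

```python
import torch
from torch.nn import Parameter

raw_w = Parameter(torch.randn(3, 1))
w = torch.nn.functional.dropout(raw_w, p=0.5, training=True)
# w is a non-leaf tensor: it still carries the graph back to raw_w.
assert w.grad_fn is not None

# Wrapping it in Parameter creates a fresh leaf tensor, detached from raw_w.
w_param = Parameter(w)
assert w_param.grad_fn is None

# Gradients now stop at w_param; raw_w never sees them.
w_param.sum().backward()
assert raw_w.grad is None
```

If this is right, the model would still apply the dropout mask but the underlying weight would never be updated, which seems worse than crashing.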
Also, I believe the following assignment, which throws the same error, illustrates the same problem (from my understanding, setattr on a module ends up in nn.Module.__setattr__, which enforces this type check, though I could not find this documented anywhere):
l = torch.nn.Linear(1, 3)
l.weight = torch.nn.functional.dropout(l.weight, p=.5, training=True)
I believe this example emulates what the setattr call in weight_drop.py is doing (though I could be wrong here, since I have not traced exactly how setattr dispatches on nn.Module).
I am on PyTorch 1.3.1, torchnlp 0.5.0, and Python 3.7.
Any help would be greatly appreciated!
Thanks!