Initialize LSTM hidden-to-hidden weights with the identity matrix

Hello! I have an LSTM which looks like this:

class BD_LSTM(nn.Module):
    def __init__(self, nl):
        super().__init__()
        self.nl = nl
        self.rnn = nn.LSTM(1, n_hidden, nl, bidirectional=True)
        self.l_out = nn.Linear(n_hidden*2, n_classes)
    def forward(self, input):
        outp, h = self.rnn(input.view(len(input), bs, -1), self.h)
        return F.log_softmax(self.l_out(outp), dim=2)
    def init_hidden(self, bs):
        self.h = (V(torch.zeros(self.nl*2, bs, n_hidden)),
                  V(torch.zeros(self.nl*2, bs, n_hidden)))

model = BD_LSTM(2).cuda()

I want to initialize the hidden-to-hidden weights of the LSTM to the identity matrix (I read this helps convergence). Looking at the LSTM documentation, one of the attributes (which should be the one I need) is weight_hh_l. This is the section relevant to my question:

    weight_ih_l[k] : the learnable input-hidden weights of the :math:`\text{k}^{th}` layer
        `(W_ii|W_if|W_ig|W_io)`, of shape `(4*hidden_size x input_size)`
    weight_hh_l[k] : the learnable hidden-hidden weights of the :math:`\text{k}^{th}` layer
        `(W_hi|W_hf|W_hg|W_ho)`, of shape `(4*hidden_size x hidden_size)`
    bias_ih_l[k] : the learnable input-hidden bias of the :math:`\text{k}^{th}` layer
        `(b_ii|b_if|b_ig|b_io)`, of shape `(4*hidden_size)`
    bias_hh_l[k] : the learnable hidden-hidden bias of the :math:`\text{k}^{th}` layer
        `(b_hi|b_hf|b_hg|b_ho)`, of shape `(4*hidden_size)`

However, when I access model.rnn.weight_hh_l, I get this error:

AttributeError: 'LSTM' object has no attribute 'weight_hh_l'

What am I doing wrong? Thank you!

You don't need to include the brackets; just suffix the attribute name with the layer index directly:

for layer 0: model.rnn.weight_hh_l0
for layer 1: model.rnn.weight_hh_l1
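Once you can reach the parameter, the identity initialization itself can be sketched like this (a minimal standalone example, with an arbitrary n_hidden; note that weight_hh_l0 stacks the four gate matrices, so the identity block is repeated four times):

```python
import torch
import torch.nn as nn

n_hidden = 8  # arbitrary size for this sketch

# Same layout as the model in the question: input size 1, 2 layers, bidirectional
rnn = nn.LSTM(1, n_hidden, 2, bidirectional=True)

# weight_hh_l0 holds the four gate matrices (W_hi|W_hf|W_hg|W_ho) stacked,
# shape (4*n_hidden, n_hidden), so tile the identity four times along dim 0.
with torch.no_grad():
    rnn.weight_hh_l0.copy_(torch.eye(n_hidden).repeat(4, 1))
```

Since the model is bidirectional, each layer also has a backward-direction parameter (weight_hh_l0_reverse, weight_hh_l1_reverse), which you may want to initialize the same way.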