import torch
lstm = torch.nn.LSTM(10, 20, 1)
lstm.state_dict().keys()
Output:
Out[47]: odict_keys(['weight_ih_l0', 'weight_hh_l0', 'bias_ih_l0', 'bias_hh_l0'])
According to the LSTM equations, there should be only one bias. Why does the state dict contain two bias variables, 'bias_ih_l0' and 'bias_hh_l0'?
Tony-Y
sh0416
(Seonghyeon Lee)
I think the two bias terms act differently.
The main point is that bias_ih is applied once during the computation along the time axis, while bias_hh is applied cumulatively along the time axis.
I would like to clarify this with an illustrative example, but the process is quite complicated.
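One way to check how the two biases enter the computation is to redo the recurrence by hand and compare it with the module's output. The sketch below assumes the gate equations given in the torch.nn.LSTM documentation, where each gate is computed as W_ih x_t + b_ih + W_hh h_(t-1) + b_hh with the gates stacked in the order (i, f, g, o). Under that assumption both biases are added at every time step, so only their sum should affect the result. The tensor sizes and the names manual, reference and merged are made up for illustration.

```python
import torch

torch.manual_seed(0)
lstm = torch.nn.LSTM(10, 20, 1)

# Arbitrary illustrative sizes: seq_len=5, batch=3, input_size=10.
x = torch.randn(5, 3, 10)
h = torch.zeros(3, 20)  # initial hidden state
c = torch.zeros(3, 20)  # initial cell state

W_ih, W_hh = lstm.weight_ih_l0, lstm.weight_hh_l0
b_ih, b_hh = lstm.bias_ih_l0, lstm.bias_hh_l0

with torch.no_grad():
    # Manual recurrence: both biases are added at every time step.
    outputs = []
    for x_t in x:
        gates = x_t @ W_ih.T + b_ih + h @ W_hh.T + b_hh
        i, f, g, o = gates.chunk(4, dim=1)  # assumed gate order: i, f, g, o
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        g = torch.tanh(g)
        c = f * c + i * g
        h = o * torch.tanh(c)
        outputs.append(h)
    manual = torch.stack(outputs)
    reference, _ = lstm(x)

    # Fold bias_hh into bias_ih in a copy of the module and zero bias_hh.
    merged = torch.nn.LSTM(10, 20, 1)
    merged.load_state_dict(lstm.state_dict())
    merged.bias_ih_l0.copy_(b_ih + b_hh)
    merged.bias_hh_l0.zero_()
    merged_out, _ = merged(x)

matches = torch.allclose(manual, reference, atol=1e-5)
same_after_merge = torch.allclose(merged_out, reference, atol=1e-5)
print(matches, same_after_merge)
```

If both checks print True, the two bias vectors are mathematically equivalent to a single bias equal to their sum. As far as I know, they are stored separately mainly to match cuDNN's parameter layout rather than because they play different roles.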