I’m playing around with the LSTM architecture and I added another set of weights, the same dimension as the original set.
def LSTMCell(input, hidden, w_ih, w_ih2, w_hh, w_hh2, b_ih=None, b_hh=None):
    hx, cx = hidden
    w_ih = torch.mul(w_ih2, w_ih)
    w_hh = torch.mul(w_hh2, w_hh)
I get this error:
RuntimeError: cuda runtime error (2) : out of memory at c:\programdata\miniconda3\conda-bld\pytorch_1524543037166\work\aten\src\thc\generic/THCStorage.cu:58
at this line:
w_hh = (w_hh2 * w_hh)
Anybody have any idea why this is happening?
Does element-wise multiplication of matrices cause an unnecessary gradient buildup?
(I need element-wise multiplication so that the result keeps the original matrix size.)
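To make the question concrete, here is a minimal sketch of what I suspect is going on. The sizes are made up, it runs on CPU, and `combine_weights` is a hypothetical helper that is not part of my actual code. As far as I understand, every call to the cell produces a new product tensor the same size as the weights, and autograd keeps it alive until the backward pass, so recomputing it at each time step could hold one copy per step. Computing the product once per forward pass and reusing it might avoid that:

```python
import torch
import torch.nn.functional as F

def combine_weights(w, w2):
    # Hypothetical helper: element-wise product, same shape as w,
    # but returned as a newly allocated tensor.
    return torch.mul(w2, w)

hidden_size, input_size, steps = 8, 4, 5  # made-up sizes for illustration
w_ih = torch.randn(4 * hidden_size, input_size, requires_grad=True)
w_ih2 = torch.randn(4 * hidden_size, input_size, requires_grad=True)

# Variant A: recompute the product at every time step (what my cell does).
# While the graph is alive, each step holds its own weight-sized buffer.
per_step = [combine_weights(w_ih, w_ih2) for _ in range(steps)]
distinct_buffers = len({t.data_ptr() for t in per_step})

# Variant B: compute the product once and reuse it for every time step.
w_eff = combine_weights(w_ih, w_ih2)
x = torch.randn(3, input_size)
loss = sum(F.linear(x, w_eff).sum() for _ in range(steps))
loss.backward()  # gradients still reach both weight sets
```

In Variant B the gradients still flow back to both `w_ih` and `w_ih2`; the only change is that the product is no longer recomputed once per step.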
Any help would be appreciated. Thanks!!