I’m playing around with the LSTM architecture and added a second set of weights with the same dimensions as the original set:
def LSTMCell(input, hidden, w_ih, w_ih2, w_hh, w_hh2, b_ih=None, b_hh=None):
    hx, cx = hidden
    w_ih = torch.mul(w_ih2, w_ih)
    w_hh = torch.mul(w_hh2, w_hh)
I get a RuntimeError: cuda runtime error (2) : out of memory at c:\programdata\miniconda3\conda-bld\pytorch_1524543037166\work\aten\src\thc\generic/THCStorage.cu:58
The error is raised at this line: w_hh = (w_hh2 * w_hh)
Anybody have any idea why this is happening?
Does element-wise multiplication of matrices create an unnecessary gradient buildup?
(I need element-wise multiplication so that the result keeps the original matrix size.)
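One way to sanity-check the "gradient buildup" suspicion is a back-of-envelope estimate. The sketch below is plain Python and only does arithmetic. It assumes the standard LSTM weight shapes (w_ih is 4*hidden x input, w_hh is 4*hidden x hidden), float32 weights, one cell call per timestep, and that each per-timestep product stays alive until the backward pass. The function name and the sizes are made up for illustration and do not come from my actual model.

```python
def extra_weight_bytes(input_size, hidden_size, seq_len, bytes_per_elem=4):
    """Rough number of bytes held by weight products until backward runs.

    Assumption: every cell call allocates a fresh w_ih2 * w_ih and
    w_hh2 * w_hh, and all of them are kept for the backward pass.
    """
    # Assumed standard LSTM shapes: the four gates are stacked along dim 0.
    w_ih_elems = 4 * hidden_size * input_size
    w_hh_elems = 4 * hidden_size * hidden_size
    # Bytes for one pair of product tensors (one timestep).
    per_step = (w_ih_elems + w_hh_elems) * bytes_per_elem
    return per_step * seq_len


# Hypothetical sizes, chosen only to show the scaling.
in_cell = extra_weight_bytes(input_size=512, hidden_size=1024, seq_len=100)
# Same estimate if the products were computed once, outside the time loop.
hoisted = extra_weight_bytes(input_size=512, hidden_size=1024, seq_len=1)

print(f"products inside the cell: {in_cell / 2**20:.0f} MiB")
print(f"products computed once:   {hoisted / 2**20:.0f} MiB")
```

If those assumptions hold, the extra memory grows linearly with sequence length, so multiplying the weights once before the loop over timesteps and passing the products into the cell would be the first thing to try.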
Any help would be appreciated. Thanks!!