I use this line to get the index of the first 0 value in each row of a tensor:
length = torch.LongTensor([(x[i,:,0] == 0).nonzero()[0] for i in range(x.shape[0])])
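For context, here is a vectorized sketch of the same computation on a small made-up batch (the shapes are stand-ins for the real (40, 382, 26) tensor). Note that `(x[i, :, 0] == 0).nonzero()[0]` raises an IndexError (or worse on CUDA) whenever a row contains no zeros at all; counting the leading non-zeros instead handles that case by returning the full sequence length:

```python
import torch

# Toy stand-in for the real (40, 382, 26) batch: 4 sequences of
# length 5 with a single feature, zero-padded at the end.
x = torch.tensor([
    [[1.], [2.], [0.], [0.], [0.]],
    [[3.], [0.], [0.], [0.], [0.]],
    [[1.], [1.], [1.], [1.], [0.]],
    [[1.], [1.], [1.], [1.], [1.]],   # no padding at all
])

# Count leading non-zeros per row: this equals the index of the first
# zero, or the full sequence length when the row has no zeros, and
# needs no Python loop over the batch.
mask = x[:, :, 0] == 0                         # (batch, seq) padding mask
length = (mask.cumsum(dim=1) == 0).sum(dim=1)  # first-zero index per row
print(length)  # tensor([2, 1, 4, 5])
```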
and for the following tensor of shape torch.Size([40, 382, 26]):
tensor([[[ 1.2496e+00, -2.5842e-03,  1.7675e-03,  ...,  4.5889e-01,
          -7.1389e-01,  1.6415e+00],
         [ 1.2491e+00, -1.3931e-04,  1.8480e-03,  ..., -2.6708e-01,
          -2.3991e-01, -3.1352e-01],
         [ 1.2478e+00, -3.3568e-03, -3.4667e-03,  ..., -2.5959e-01,
          -8.3522e-01,  1.6146e+00],
         ...,
         [ 0.0000e+00,  0.0000e+00,  0.0000e+00,  ...,  0.0000e+00,
           0.0000e+00,  0.0000e+00],
         [ 0.0000e+00,  0.0000e+00,  0.0000e+00,  ...,  0.0000e+00,
           0.0000e+00,  0.0000e+00],
         [ 0.0000e+00,  0.0000e+00,  0.0000e+00,  ...,  0.0000e+00,
           0.0000e+00,  0.0000e+00]],

        [[ 1.2491e+00,  6.6564e-06, -1.0897e-04,  ...,  4.2065e-01,
          -8.5722e-01,  1.4956e+00],
         [ 1.2487e+00,  3.3545e-03,  1.0616e-03,  ..., -1.7322e-01,
          -5.1711e-01,  7.0258e-01],
         [ 1.2473e+00,  2.1691e-03, -3.5784e-03,  ..., -3.0112e-01,
          -9.7947e-01,  1.5638e+00],
         ...,
         [ 0.0000e+00,  0.0000e+00,  0.0000e+00,  ...,  0.0000e+00,
           0.0000e+00,  0.0000e+00],
         [ 0.0000e+00,  0.0000e+00,  0.0000e+00,  ...,  0.0000e+00,
           0.0000e+00,  0.0000e+00],
         [ 0.0000e+00,  0.0000e+00,  0.0000e+00,  ...,  0.0000e+00,
           0.0000e+00,  0.0000e+00]],

        [[ 1.2494e+00,  1.0986e-03, -1.0312e-03,  ...,  3.7550e-01,
          -3.3405e-02,  1.0006e+00],
         [ 1.2489e+00,  4.7714e-03,  2.5151e-03,  ..., -6.2233e-01,
          -1.9066e-01,  5.2548e-01],
         [ 1.2476e+00,  3.6464e-03,  1.2658e-04,  ..., -6.5587e-01,
          -1.0196e+00,  3.9814e-01],
         ...,
         [ 0.0000e+00,  0.0000e+00,  0.0000e+00,  ...,  0.0000e+00,
           0.0000e+00,  0.0000e+00],
         [ 0.0000e+00,  0.0000e+00,  0.0000e+00,  ...,  0.0000e+00,
           0.0000e+00,  0.0000e+00],
         [ 0.0000e+00,  0.0000e+00,  0.0000e+00,  ...,  0.0000e+00,
           0.0000e+00,  0.0000e+00]],

        ...,

        [[ 1.2471e+00, -3.2874e-03,  2.5414e-03,  ...,  2.1677e-01,
          -4.8340e-01,  1.4457e-02],
         [ 1.2466e+00, -6.1289e-03, -3.8176e-03,  ...,  8.6825e-01,
          -9.4528e-01,  1.5469e+00],
         [ 1.2452e+00, -8.7639e-03, -1.0271e-02,  ...,  1.7150e-01,
          -8.5414e-02, -4.8455e-01],
         ...,
         [ 0.0000e+00,  0.0000e+00,  0.0000e+00,  ...,  0.0000e+00,
           0.0000e+00,  0.0000e+00],
         [ 0.0000e+00,  0.0000e+00,  0.0000e+00,  ...,  0.0000e+00,
           0.0000e+00,  0.0000e+00],
         [ 0.0000e+00,  0.0000e+00,  0.0000e+00,  ...,  0.0000e+00,
           0.0000e+00,  0.0000e+00]],

        [[ 1.2480e+00,  2.3856e-04, -4.2868e-03,  ..., -1.7246e-01,
          -9.1360e-01, -4.5913e-01],
         [ 1.2476e+00, -3.6790e-03, -1.0436e-02,  ..., -6.8468e-02,
           1.4334e-01,  8.2367e-01],
         [ 1.2463e+00, -7.7002e-03, -1.6612e-02,  ..., -2.4162e-01,
          -3.2239e-01, -1.1522e-01],
         ...,
         [ 0.0000e+00,  0.0000e+00,  0.0000e+00,  ...,  0.0000e+00,
           0.0000e+00,  0.0000e+00],
         [ 0.0000e+00,  0.0000e+00,  0.0000e+00,  ...,  0.0000e+00,
           0.0000e+00,  0.0000e+00],
         [ 0.0000e+00,  0.0000e+00,  0.0000e+00,  ...,  0.0000e+00,
           0.0000e+00,  0.0000e+00]],

        [[ 1.2536e+00,  2.7262e-03, -3.5257e-04,  ..., -3.2121e-01,
          -1.6265e-01, -4.9548e-01],
         [ 1.2532e+00, -2.0610e-03, -6.2969e-03,  ...,  2.4315e-01,
           1.0951e-02,  1.4688e+00],
         [ 1.2520e+00, -6.5647e-03, -1.2356e-02,  ..., -2.0678e-01,
           8.6351e-02, -5.9951e-01],
         ...,
         [ 0.0000e+00,  0.0000e+00,  0.0000e+00,  ...,  0.0000e+00,
           0.0000e+00,  0.0000e+00],
         [ 0.0000e+00,  0.0000e+00,  0.0000e+00,  ...,  0.0000e+00,
           0.0000e+00,  0.0000e+00],
         [ 0.0000e+00,  0.0000e+00,  0.0000e+00,  ...,  0.0000e+00,
           0.0000e+00,  0.0000e+00]]], device='cuda:0', grad_fn=<CatBackward0>)
I got this error:
--> 373 length = torch.LongTensor([(x[i,:,0] == 0).nonzero()[0] for i in range(x.shape[0])])
RuntimeError: numel: integer multiplication overflow
I also observed that when I add print(x) inside the forward method, I get this error just for the printing, while outside of it I don't get the error:
RuntimeError: CUDA error: device-side assert triggered
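This pattern is characteristic of an asynchronous CUDA failure: once a device-side assert fires (for example, from an out-of-range index somewhere earlier in the model), the CUDA context is broken, and every later operation that touches the GPU, including print(x), which copies data to the host, can fail with an unrelated-looking message such as the numel overflow above. A sketch of the two standard debugging moves follows; CUDA_LAUNCH_BLOCKING is a real PyTorch/CUDA environment variable, while `model` and `batch` are placeholders for your own objects:

```python
import os

# Must be set before the first CUDA call in the process: kernel
# launches become synchronous, so the Python traceback points at the
# op that actually triggered the device-side assert.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# Alternatively, run one batch on the CPU, where the same bad input
# (e.g. an out-of-range embedding index) raises an ordinary Python
# exception with a clear message instead of corrupting the CUDA
# context:
# model.cpu()(batch.cpu())
```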
In another experiment with my model, I moved the line above (the one causing the error) out of the forward method, and the error disappeared. I then passed the sequence lengths to the forward method as an input instead, but got this error:
380 x, length, batch_first=True, enforce_sorted=False
381 )
--> 382 out_packed, (_, _) = self.rnn(packed, (h0, c0))
383 y, _ = nn.utils.rnn.pad_packed_sequence(out_packed, batch_first=True)
384 y = self.dropout(y)
/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1193 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194 return forward_call(*input, **kwargs)
1195 # Do not call functions when jit is used
1196 full_backward_hooks, non_full_backward_hooks = [], []
/usr/local/lib/python3.8/dist-packages/torch/nn/modules/rnn.py in forward(self, input, hx)
775 self.dropout, self.training, self.bidirectional, self.batch_first)
776 else:
--> 777 result = _VF.lstm(input, batch_sizes, hx, self._flat_weights, self.bias,
778 self.num_layers, self.dropout, self.training, self.bidirectional)
779 output = result[0]
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)`
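Two requirements of pack_padded_sequence are easy to violate in this setup, and both tend to surface later as opaque CUDA/cuBLAS failures rather than as a clear error at the call site: the lengths must be a 1-D int64 tensor on the CPU even when the input lives on the GPU, and every length must be at least 1, which breaks if any sequence in the batch starts with a zero. (The cuBLAS error can also simply be leftover fallout from the earlier device-side assert.) Below is a minimal end-to-end sketch with small made-up sizes; the hidden size and variable names are assumptions for illustration, not your model:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
batch, seq_len, feat, hidden = 5, 7, 26, 16  # small stand-ins for the real sizes
x = torch.randn(batch, seq_len, feat)
x[:, 4:, :] = 0.0                            # zero-pad the tail of every sequence

# First-zero index per sequence (seq_len if a row has no zeros).
mask = x[:, :, 0] == 0
length = (mask.cumsum(dim=1) == 0).sum(dim=1)

# pack_padded_sequence requires lengths as a 1-D int64 CPU tensor and
# every length >= 1, even when x itself is on the GPU.
length = length.clamp(min=1).cpu()

rnn = nn.LSTM(feat, hidden, batch_first=True)
h0 = torch.zeros(1, batch, hidden)
c0 = torch.zeros(1, batch, hidden)

packed = nn.utils.rnn.pack_padded_sequence(
    x, length, batch_first=True, enforce_sorted=False
)
out_packed, (hn, cn) = rnn(packed, (h0, c0))
y, _ = nn.utils.rnn.pad_packed_sequence(
    out_packed, batch_first=True, total_length=seq_len
)
print(y.shape)  # torch.Size([5, 7, 16])
```

Passing total_length=seq_len to pad_packed_sequence restores the original padded length, which matters if a later loss expects a fixed sequence dimension.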
The batch size is currently 5. What is the reason for these errors?