I’ve implemented a seq2seq model with an LSTM, and although it runs fine on CPU, on GPU I get the following error:

```
...
assert hx.is_contiguous()
AssertionError
```

After some bug hunting, I found out that this happens because of the initial hidden state of the encoder. My code for the initial state does the following:

```
hidden = (init_hidden[0].expand(1, batch_size, init_hidden[0].size(0)),
          init_hidden[1].expand(1, batch_size, init_hidden[1].size(0)))
```

The expansion is needed because nn.LSTM expects hidden states of shape (num_layers, batch, hidden_size). init_hidden is a parameter list containing 2 vectors (h0 and c0):

```
init_hidden = nn.ParameterList(
    [nn.Parameter(torch.zeros(hidden_size)) for _ in range(2)])
```
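Putting the two snippets together, a minimal standalone reproduction (toy sizes, names as above) shows that the expanded tuple is indeed non-contiguous:

```
import torch
import torch.nn as nn

hidden_size, batch_size = 8, 4

# Two learnable vectors for h0 and c0, as in the snippet above
init_hidden = nn.ParameterList(
    [nn.Parameter(torch.zeros(hidden_size)) for _ in range(2)])

# Expand each vector to the (num_layers, batch, hidden_size) shape nn.LSTM expects
hidden = (init_hidden[0].expand(1, batch_size, init_hidden[0].size(0)),
          init_hidden[1].expand(1, batch_size, init_hidden[1].size(0)))

print(hidden[0].shape)            # torch.Size([1, 4, 8])
print(hidden[0].is_contiguous())  # False -- this is what trips the assert on GPU
```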

Now, I know that calling .contiguous() on each element of the tuple solves the problem, but that comes at the expense of unnecessary memory and computation: the copy materializes batch_size identical rows, even though the initialization is the same for every element of the batch, and the plain expanded view works fine on CPU.
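Concretely, the workaround I mean looks like this (toy sizes; a repeat-based variant is equivalent, since both end up materializing a full copy of the expanded state):

```
import torch

hidden_size, batch_size = 8, 4
h0 = torch.zeros(hidden_size, requires_grad=True)

# Workaround: force a contiguous copy of the expanded view
h = h0.expand(1, batch_size, hidden_size).contiguous()
assert h.is_contiguous()

# Equivalent: repeat allocates the copy directly
h2 = h0.view(1, 1, hidden_size).repeat(1, batch_size, 1)
assert h2.is_contiguous() and torch.equal(h, h2)
```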

Is there a better way to do this? And why the difference between CPU and GPU? And why is the result of an expand not contiguous? After all, no memory is allocated; it is just a view of the tensor…
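To illustrate that last point: as I understand it, expand never copies data; it just gives the broadcast dimensions a stride of 0, so the strides no longer match a dense row-major layout and the view is reported as non-contiguous:

```
import torch

v = torch.zeros(8)
e = v.expand(1, 4, 8)     # a view: no new memory is allocated

print(e.stride())         # (0, 0, 1) -- the broadcast dims have stride 0
print(e.is_contiguous())  # False: a dense (1, 4, 8) tensor would need strides (32, 8, 1)
```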