Hello.
I wrote a short script to understand the torch.nn.LSTM class in PyTorch.
When using the batch_first option, I changed the input layout from (seq_len, batch, input_size) to (batch, seq_len, input_size).
However, I don't understand why I get a different result with batch_first=True.
Here is my code:
import torch
import torch.autograd as autograd
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
torch.manual_seed(1)
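# Sequence-first LSTM: expects input of shape (seq_len, batch, input_size)
lstm = nn.LSTM(input_size=3, hidden_size=3, num_layers=1)
# Batch-first LSTM: expects input of shape (batch, seq_len, input_size)
lstm2 = nn.LSTM(input_size=3, hidden_size=3, num_layers=1, batch_first=True)
inputs = autograd.Variable(torch.randn((30)))
# Initial states have shape (num_layers, batch, hidden_size) in both cases
h0 = autograd.Variable(torch.randn(1, 2, 3))
c0 = autograd.Variable(torch.randn((1, 2, 3)))
# inputs1: (seq_len=5, batch=2, input_size=3)
inputs1 = inputs.view(5, 2, -1).contiguous()
# inputs2: same data laid out as (batch=2, seq_len=5, input_size=3)
inputs2 = torch.transpose(inputs1, 0, 1).contiguous()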
out = lstm(inputs1, (h0, c0))[0]
print("Case 1")
print(torch.transpose(inputs1, 0, 1).contiguous())
print(torch.transpose(out, 0, 1).contiguous())
print("#######"*5)
out = lstm2(inputs2, (h0, c0))[0]
print("Case 2")
print(inputs2)
print(out)
And the result is below. The inputs printed in the two cases are identical, but the outputs differ.
Case 1
Variable containing:
(0 ,.,.) =
1.4114 -0.9804 -0.7578
-0.4270 -0.3868 -0.6089
1.1848 -1.0322 -0.7039
-0.8018 -0.7855 0.7877
-0.4594 -1.1798 0.3812
(1 ,.,.) =
-0.3782 1.7211 0.0310
1.1652 -0.1326 -0.0228
0.8813 1.4276 -0.9245
0.0786 1.7053 -0.8098
-0.0064 0.5302 0.9990
[torch.FloatTensor of size 2x5x3]
Variable containing:
(0 ,.,.) =
0.0122 -0.0571 -0.0294
0.0310 -0.0025 0.2622
-0.0092 -0.1369 0.1579
-0.0573 -0.0152 0.2817
-0.1080 0.0052 0.2817
(1 ,.,.) =
0.0885 0.0426 0.3910
0.0516 -0.0430 0.2333
0.0789 0.0719 0.1446
0.1162 0.1515 0.2469
0.1018 0.0987 0.3232
[torch.FloatTensor of size 2x5x3]
###################################
Case 2
Variable containing:
(0 ,.,.) =
1.4114 -0.9804 -0.7578
-0.4270 -0.3868 -0.6089
1.1848 -1.0322 -0.7039
-0.8018 -0.7855 0.7877
-0.4594 -1.1798 0.3812
(1 ,.,.) =
-0.3782 1.7211 0.0310
1.1652 -0.1326 -0.0228
0.8813 1.4276 -0.9245
0.0786 1.7053 -0.8098
-0.0064 0.5302 0.9990
[torch.FloatTensor of size 2x5x3]
Variable containing:
(0 ,.,.) =
0.1568 -0.2322 0.0824
0.1147 -0.0394 0.1534
0.0238 -0.1611 -0.0544
0.0642 -0.0818 0.0314
0.0626 -0.1119 0.0638
(1 ,.,.) =
-0.0989 0.0636 -0.1731
-0.0746 -0.0633 -0.3005
0.0611 0.0328 -0.3500
0.1299 0.1404 -0.1330
0.1082 -0.0088 -0.0390
[torch.FloatTensor of size 2x5x3]
I really want to understand this.
Thanks in advance.
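
One thing that may be worth checking: in the code above, lstm and lstm2 are two separate modules, so each one draws its own random initial weights, and that alone could account for the different outputs. The following is a minimal sketch of a test for that guess, not a confirmed diagnosis. It assumes a recent PyTorch version where plain tensors replace autograd.Variable, and it copies the weights from lstm into lstm2 with load_state_dict before comparing the outputs. The name outputs_match is only illustrative.

```python
import torch
import torch.nn as nn

torch.manual_seed(1)

lstm = nn.LSTM(input_size=3, hidden_size=3, num_layers=1)
lstm2 = nn.LSTM(input_size=3, hidden_size=3, num_layers=1, batch_first=True)

# Assumption under test: the two modules differ only in their random weights.
# Copy every parameter of lstm into lstm2 so both compute the same function.
lstm2.load_state_dict(lstm.state_dict())

inputs1 = torch.randn(5, 2, 3)                   # (seq_len, batch, input_size)
inputs2 = inputs1.transpose(0, 1).contiguous()   # (batch, seq_len, input_size)

# The initial states keep the shape (num_layers, batch, hidden_size)
# regardless of batch_first.
h0 = torch.randn(1, 2, 3)
c0 = torch.randn(1, 2, 3)

with torch.no_grad():
    out1, _ = lstm(inputs1, (h0, c0))    # (seq_len, batch, hidden_size)
    out2, _ = lstm2(inputs2, (h0, c0))   # (batch, seq_len, hidden_size)

# Bring out1 into batch-first layout and compare element-wise.
outputs_match = torch.allclose(out1.transpose(0, 1), out2, atol=1e-6)
print(outputs_match)
```

If the comparison comes out true once the weights are shared, then batch_first only changes the layout of the input and output tensors, and the difference in the original script comes from the independent initialization of the two modules.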