Unable to adjust the batch_size, please help me out

Pata_Naheen · December 28, 2018, 4:53pm

class RNN(nn.Module):
    def __init__(self):
        super(RNN, self).__init__()

        self.rnn = nn.LSTM(
            input_size=6,
            hidden_size=6,
            num_layers=2,
            batch_first=True,
        )

    def forward(self, x):
        out, (h_n, h_c) = self.rnn(x, None)
        return out[:, -1, :]    # Return output at last time-step

X = torch.FloatTensor(X)
y = torch.LongTensor(Y)

rnn = RNN()
optimizer = torch.optim.Adam(rnn.parameters(), lr=0.001)
loss_func = nn.CrossEntropyLoss()     


for j in range(500):
    for i, item in enumerate(X):
        item = item.unsqueeze(0)
        print(item)
        print(item.shape)
        output = rnn(item)
        target = y[i]
        print(target)
        target = target.squeeze_()
        print(output,target)
        loss = loss_func(output, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

ValueError: Expected input batch_size (1) to match target batch_size (2).

output after forward = tensor([[0.1039, 0.0309, 0.1265, 0.1670, 0.0287, 0.1056]], grad_fn=)

target = tensor([0, 0])

vmirly1 · December 28, 2018, 5:07pm

What is the shape of X and Y?

Pata_Naheen · December 28, 2018, 5:10pm

input shape: torch.Size([1, 6])
output shape: torch.Size([2])

Hong · December 28, 2018, 5:21pm

Can you show your complete code including your input?

Pata_Naheen · December 28, 2018, 5:27pm

class RNN(nn.Module):
    def __init__(self):
        super(RNN, self).__init__()

        self.rnn = nn.LSTM(
            input_size=6,
            hidden_size=6,
            num_layers=2,
            batch_first=True,
        )
        self.fc = nn.Linear(6,2)

    def forward(self, x):
        out, (h_n, h_c) = self.rnn(x, None)
        return out[:, -1, :]    # Return output at last time-step

X = torch.FloatTensor(X)
y = torch.LongTensor(Y)

rnn = RNN()
optimizer = torch.optim.Adam(rnn.parameters(), lr=0.001)
loss_func = nn.CrossEntropyLoss()     


for j in range(500):
    for i, item in enumerate(X):
        item = item.unsqueeze(0)
        output = rnn(item)
        target = y[i]
        #print(target)
        #target = target.squeeze_()
        #print("input shape:",output.shape, "output shape:", target.shape)
        loss = loss_func(output, target.argmax(dim=1))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

This is the complete code.

X has shape (1000, 1, 6) and Y has shape (1000, 1, 2).

Hong · December 28, 2018, 6:19pm

Your code works on my machine.

vmirly1 · December 28, 2018, 6:50pm

I think the output shape is missing a dimension for the batch. Can you reshape it to have shape (1, 2)?

vmirly1 · December 28, 2018, 9:34pm

Since Y has shape (1000, 1, 2), indexing it with target = y[i] gives target the shape (1, 2). But torch.nn.CrossEntropyLoss expects the input to have shape (N, C) and the target to have shape (N), where N is the batch size and C is the number of classes.

So I think you should reshape both the target and the output of your model accordingly: the target should have shape (1) and the output should have shape (1, 2). It also looks like you are using one-hot vectors for the target, so that needs to be changed to a tensor of class labels from {0, 1}.
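To make the shape requirement concrete, here is a minimal sketch for a single sample. The tensors output and one_hot_target are made-up stand-ins (random logits and a hand-written one-hot label), not the data from the original post.

```python
import torch
import torch.nn as nn

loss_func = nn.CrossEntropyLoss()

# Hypothetical stand-ins for one training step (not the poster's data):
output = torch.randn(1, 2)                # logits, shape (N, C) = (1, 2)
one_hot_target = torch.tensor([[0, 1]])   # one-hot label for class 1, shape (1, 2)

# CrossEntropyLoss wants class indices of shape (N), so collapse the one-hot axis
target = one_hot_target.argmax(dim=1)     # shape (1,)

loss = loss_func(output, target)
print(output.shape, target.shape, loss.item())
```

With N = 1 the batch sizes of input and target agree, which is exactly the check that raised the ValueError in the first post.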

Pata_Naheen · December 29, 2018, 4:35am

Did you change anything or not?

Pata_Naheen · December 29, 2018, 4:35am

I am confused about what I should do.

vmirly1 · December 29, 2018, 3:23pm

So, given that the shape of output is (2) and the target is (1,2), then I think the following changes may solve the issue:

        output = rnn(item)
        target = y[i]
        # reshape the output:
        output = output.reshape(-1, 2)
        # get elements in the second column of target
        target = target[:,1]
        
        # now compute the loss; target is already a 1-D tensor of class
        # indices here, so no argmax is needed
        loss = loss_func(output, target)

Hong · December 30, 2018, 5:07am

I only changed
X = torch.FloatTensor(X)
y = torch.LongTensor(Y)
to some actual tensors:
X = torch.randn(10, 1, 6)
y = torch.randn(10, 1, 2)

And everything works.
I changed 1000 to 10 just to make it end sooner.

Pata_Naheen · December 30, 2018, 1:25pm

Can you share the complete code you are running? Even with these simple changes, it does not run on my machine.

Hong · December 31, 2018, 7:20am

import torch
import torch.nn as nn


class RNN(nn.Module):
    def __init__(self):
        super(RNN, self).__init__()

        self.rnn = nn.LSTM(
            input_size=6,
            hidden_size=6,
            num_layers=2,
            batch_first=True,
        )
        self.fc = nn.Linear(6, 2)

    def forward(self, x):
        out, (h_n, h_c) = self.rnn(x, None)
        return out[:, -1, :]    # Return output at last time-step


X = torch.randn(10, 1, 6)
y = torch.randn(10, 1, 2)

rnn = RNN()
optimizer = torch.optim.Adam(rnn.parameters(), lr=0.001)
loss_func = nn.CrossEntropyLoss()


for j in range(2):
    for i, item in enumerate(X):
        item = item.unsqueeze(0)
        output = rnn(item)
        target = y[i]
        # print(target)
        #target = target.squeeze_()
        #print("input shape:",output.shape, "output shape:", target.shape)
        loss = loss_func(output, target.argmax(dim=1))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

Hope it helps. But still, your code doesn’t make much sense to me. First, your input shape is (1000, 1, 6), which says that your whole batch size is 1000, the sequence length is 1, and the feature dimension is 6. If you are using an RNN, why do you have a sequence length of 1? Second, your label’s shape is (batch_size, sequence_len, num_of_classes). This is weird. I think you want one label for each sequence, so why is sequence_len involved? Should it be (batch_size, num_of_classes)? Third, in your RNN network, you defined the fc linear layer in __init__(), but you forgot to call it in forward().
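Putting the second and third points together, a rough sketch of how the model and loop might look is below. It is only one possible arrangement, and it rests on assumptions: the data is random stand-in data, labels are stored as one class index per sequence (shape (batch_size)) rather than one-hot, and the whole batch goes through the network at once.

```python
import torch
import torch.nn as nn


class RNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.LSTM(input_size=6, hidden_size=6, num_layers=2, batch_first=True)
        self.fc = nn.Linear(6, 2)

    def forward(self, x):
        out, _ = self.rnn(x)             # out: (batch, seq_len, hidden)
        return self.fc(out[:, -1, :])    # apply fc to the last time-step -> (batch, 2)


# Random stand-in data (an assumption, not the original dataset)
X = torch.randn(10, 1, 6)                # (batch, seq_len, features)
y = torch.randint(0, 2, (10,))           # one class index per sequence

model = RNN()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
loss_func = nn.CrossEntropyLoss()

for epoch in range(2):
    output = model(X)                    # (10, 2)
    loss = loss_func(output, y)          # batch sizes match: 10 and 10
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(output.shape, y.shape)
```

If the labels really are one-hot with shape (batch_size, 1, 2), something like Y.squeeze(1).argmax(dim=1) would turn them into this form before training.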