Confused about NN input

smu226 · May 24, 2020, 12:47am

Hello! This is probably something silly but I am confused about why the input to my NN doesn’t work. This is the network.

import torch
import torch.nn as nn

class SimpleNet(nn.Module):
    def __init__(self, ni):
        super().__init__()
        # Four hidden Linear layers, each followed by BatchNorm1d
        self.linear1 = nn.Linear(ni, 128)
        self.bn1 = nn.BatchNorm1d(128)
        self.linear2 = nn.Linear(128, 128)
        self.bn2 = nn.BatchNorm1d(128)
        self.linear3 = nn.Linear(128, 64)
        self.bn3 = nn.BatchNorm1d(64)
        self.linear4 = nn.Linear(64, 64)
        self.bn4 = nn.BatchNorm1d(64)
        self.linear5 = nn.Linear(64, 1)

    def forward(self, x):
        # torch.tanh replaces the deprecated F.tanh
        x = torch.tanh(self.bn1(self.linear1(x)))
        x = torch.tanh(self.bn2(self.linear2(x)))
        x = torch.tanh(self.bn3(self.linear3(x)))
        x = torch.tanh(self.bn4(self.linear4(x)))
        x = self.linear5(x)
        return x

n_variables = 2
model = SimpleNet(n_variables).cuda()

If I run

dt = torch.tensor([[1, 2],[3, 4]]).float().cuda()
model(dt)

it works fine. But for my case I need to pass one input at a time (after training) and when I do this:

dt = torch.tensor([[1, 2]]).float().cuda()
model(dt)

I am getting this error: Expected more than 1 value per channel when training, got input size torch.Size([1, 128])

What am I doing wrong? Thank you!

ptrblck · May 24, 2020, 4:05am

In training mode, the batchnorm layers cannot calculate batch statistics from a single sample, so they raise this error.
The activation entering bn1 has the shape [1, 128] without a temporal dimension, where 128 is the number of channels, so each channel sees only one value.
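
To make the failure concrete, here is a minimal sketch that reproduces it on the CPU with a standalone nn.BatchNorm1d(128) layer. The variable names (single, pair, batch_var, raised) are made up for illustration and are not from the original post. It shows that with one value per channel the batch variance is exactly zero, which is why the layer refuses the input while in training mode.

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm1d(128)       # freshly created modules start in training mode
single = torch.randn(1, 128)   # one sample, 128 channels, no temporal dimension

# With a single value per channel, the batch mean equals the value itself
# and the (biased) batch variance is exactly zero.
batch_var = single.var(dim=0, unbiased=False)

try:
    bn(single)
    raised = False
except ValueError as err:
    # "Expected more than 1 value per channel when training, ..."
    raised = True
    print(err)

# Two samples give every channel two values, so the same layer accepts them.
pair = torch.randn(2, 128)
out = bn(pair)
```

The same check applies to every batchnorm layer in the model, but bn1 is the first one reached, which is why the error reports the size [1, 128].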

You could either call model.eval(), which makes the batchnorm layers normalize with their running estimates instead of the batch statistics, or increase the batch size.
Also, training batchnorm layers with a small batch size in general might yield skewed running estimates.
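
For the single-input inference case from the question, the model.eval() route could look roughly like the sketch below. It uses a small nn.Sequential stand-in rather than the full SimpleNet, runs on the CPU, and skips the training step, so the running estimates are still at their initial values; all of that is assumed for brevity.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for SimpleNet: one Linear + BatchNorm1d block and a head.
model = nn.Sequential(
    nn.Linear(2, 128),
    nn.BatchNorm1d(128),
    nn.Tanh(),
    nn.Linear(128, 1),
)

# ... training with a batch size larger than 1 would happen here ...

model.eval()  # batchnorm now normalizes with running_mean / running_var
with torch.no_grad():  # no gradients are needed for inference
    dt = torch.tensor([[1.0, 2.0]])  # a single sample, shape [1, 2]
    pred = model(dt)

model.train()  # switch back before resuming training
```

Calling model.train() again before any further training matters, because otherwise the running estimates would stop updating.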