# Why do some inputs need an axis for batch size but some don't?

I’m building a simple network that takes in two numbers and learns how to add them.

```python
import torch
```

This is a pretty simple network:

```python
from torch import nn
from torch.nn import functional as F

class Net(nn.Module):

    def __init__(self):
        super().__init__()
        self.linear1 = nn.Linear(2, 20)
        self.linear2 = nn.Linear(20, 1)

    def forward(self, x1, x2):
        inp = torch.cat((x1[None], x2[None])).float()
        out = self.linear1(inp)
        out = F.relu(out)
        out = self.linear2(out)

        return out
```
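As a quick sanity check of the shapes involved (assuming `x1` and `x2` are 0-d scalar tensors): `x[None]` adds a leading dimension, so the `cat` in `forward` builds a 1-d tensor of shape `(2,)`, i.e. a single unbatched sample with no batch dimension:

```python
import torch

x1, x2 = torch.tensor(3.0), torch.tensor(4.0)  # 0-d (scalar) tensors
print(x1[None].shape)                          # torch.Size([1])
inp = torch.cat((x1[None], x2[None]))          # one unbatched sample of 2 features
print(inp.shape)                               # torch.Size([2])
```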

Here’s the training loop:

```python
net = Net()
criterion = nn.MSELoss()
optim = torch.optim.SGD(net.parameters(), lr=0.001)

for i in range(10000):
    # two random scalar inputs; the target is their sum
    x1, x2 = torch.rand(()), torch.rand(())
    out = net(x1, x2)
    loss = criterion(out, (x1 + x2)[None])

    optim.zero_grad()
    loss.backward()
    optim.step()

    if i % 500 == 0: print(loss)
```
1. The input here doesn’t have a batch dimension, yet it works. But sometimes PyTorch inference doesn’t work without one. Why is that?
2. If I’m trying to input a vector of n features to an NN that starts with a linear layer, should the shape be (n,), (n,1), or (1,n)? I’m pretty confused about that.
3. Is the way I’m handling the input the right way? Or is there a better way to do it?

Essentially, the broader question is: is there a guide on the shapes and types that tensors have to be for different models and loss functions?

Yes, the docs mention the expected shape for each layer, and I would stick to it. My general rule is that a batch dimension is expected in `nn.Module`s. While e.g. linear layers can work with an input having a single dimension, you would have to verify what’s applied internally (which dimension is broadcast, etc.), so I would prefer the documented approach.
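To illustrate (a minimal sketch): `nn.Linear(n, m)` only constrains the *last* dimension of the input to be `n`, so both `(n,)` and `(1, n)` work, while `(n, 1)` fails; the usual batched form is `(batch, n)`:

```python
import torch
from torch import nn

lin = nn.Linear(2, 20)              # requires the last input dim to be 2

print(lin(torch.rand(2)).shape)     # torch.Size([20])    -- (n,): one unbatched sample
print(lin(torch.rand(1, 2)).shape)  # torch.Size([1, 20]) -- (1, n): one sample with a batch dim
print(lin(torch.rand(5, 2)).shape)  # torch.Size([5, 20]) -- (batch, n): the usual batched form

try:
    lin(torch.rand(2, 1))           # (n, 1): last dim is 1, not 2
except RuntimeError as e:
    print("shape (n, 1) fails:", e)
```

For the `forward` in the question, one batched option would be to pass `x1` and `x2` of shape `(batch,)` and build the input with `torch.stack((x1, x2), dim=1)`, which yields `(batch, 2)`.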
