I’m building a simple network that takes in two numbers and learns how to add them.

```
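import torch

# randint's upper bound is exclusive, so each operand is an integer from 0 to 8
add1 = torch.randint(0, 9, size=[6000])
add2 = torch.randint(0, 9, size=[6000])
# element-wise targets: the sum the network should learn to predict
add_sum = add1 + add2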
```

This is a pretty simple network:

```
from torch import nn
from torch.nn import functional as F


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear1 = nn.Linear(2, 20)
        self.linear2 = nn.Linear(20, 1)

    def forward(self, x1, x2):
        # x1 and x2 are 0-d tensors; x[None] makes each shape (1,), cat gives shape (2,)
        inp = torch.cat((x1[None], x2[None])).float()
        out = self.linear1(inp)
        out = F.relu(out)
        out = self.linear2(out)
        return out
```

Here’s the training loop

```
net = Net()
optim = torch.optim.AdamW(net.parameters(), lr=0.1)
criterion = nn.MSELoss()

# one sample per step, no batch dimension
for i in range(len(add1)):
    out = net(add1[i], add2[i])
    loss = criterion(out, add_sum[i].float())
    optim.zero_grad()
    loss.backward()
    optim.step()
    if i % 500 == 0:
        print(loss)
```

- The input in my training loop doesn’t have a batch dimension, yet it works. But sometimes PyTorch inference doesn’t work without a batch dimension. Why is that?
- If I’m trying to input a vector of n features to an NN that starts with a linear layer, should the shape be (n,), (n,1), or (1,n)? I’m pretty confused about that.
- Is the way I’m handling the input the right way? Or is there a better way to do it?
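
To make the shape question concrete, here is a small check of what I’d expect, assuming `nn.Linear` only looks at the last dimension of its input (that is my reading of the docs, not something I’m sure covers every layer type). The names `layer`, `single`, `batched`, `column`, and `pairs` are just for illustration and aren’t part of my network above.

```python
import torch
from torch import nn

n = 2  # number of input features, same as the first layer of my network
layer = nn.Linear(n, 20)

single = torch.rand(n)      # shape (n,): one sample, no batch dimension
batched = torch.rand(1, n)  # shape (1, n): a batch containing one sample
column = torch.rand(n, 1)   # shape (n, 1): last dimension is 1, not n

out_single = layer(single)    # expected shape (20,)
out_batched = layer(batched)  # expected shape (1, 20)

try:
    layer(column)
    column_ok = True
except RuntimeError:
    # the layer expects the last dimension to equal in_features (n)
    column_ok = False

# Possible batched alternative: stack both operands into one (N, 2) tensor
add1 = torch.randint(0, 9, size=[6000])
add2 = torch.randint(0, 9, size=[6000])
pairs = torch.stack((add1, add2), dim=1).float()  # shape (6000, 2)
out_all = layer(pairs)                            # shape (6000, 20)

print(out_single.shape, out_batched.shape, column_ok, out_all.shape)
```

If that’s right, (n,) and (1,n) both go through a linear layer and only (n,1) fails, so I’m guessing the cases where a missing batch dimension breaks inference involve other layer types, but I’d like confirmation.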