How can I use minibatches?

Please help me understand how minibatches work.

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(50, 1, kernel_size=(1, 32), stride=(1, 1))
        self.bn1 = nn.BatchNorm2d(50, affine=True)
        
        self.fc1 = nn.Linear(50,64, 32)
        self.bn2 = nn.BatchNorm2d(50, affine=True)
        
        self.fc2 = nn.Linear(50,32, 16)
        self.bn3 = nn.BatchNorm2d(50, affine=True)
        
        self.fc3 = nn.Linear(50,16, 1)
        self.tan = nn.Hardtanh()
    
    def forward(self, x):
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.fc1(out)
        out = self.bn2(out)
        out = self.fc2(out)
        out = self.bn3(out)
        out = self.fc3(out)
        return self.tan(out)
net = Net()
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(net.parameters(), lr=learning_rate)
st_1 = np.interp(np.loadtxt('1d.txt',delimiter=';'), [0,10], [-1,1])
st_2 = np.loadtxt('target_1d.txt',delimiter=';')
st_3 = np.loadtxt('My1d.txt',delimiter=';')
len_st = len(st_2)
batch = 50
b = (len_st-wn1)//batch
len_batch = b*batch

I create a minibatch on the fly in a loop.

for epoch in range(epochs):
    for wn_start in range(0,len_batch,batch): # step - batch
        wn_tick = wn_start + wn1
        wn_all = []
        los_l = []
        for b_iter in range(batch): # create minibatch
            wn_all = wn_all + [st_1[wn_start+b_iter:wn_tick+b_iter,:]]
            los_l = los_l + [st_2[wn_tick-1]]
        wn_all = torch.as_tensor(wn_all, dtype=torch.float32)
        wn_all = wn_all.unsqueeze(0)
        wn_all = torch.transpose(wn_all,2,3) #([1, 50, 32, 64]) -> ([1, 50, 64, 32])
        wn_all = torch.transpose(wn_all,0,1) #([1, 50, 64, 32]) -> ([50, 1, 64, 32])
        los_l = torch.Tensor([los_l]).unsqueeze(0).unsqueeze(0)
        los_l = torch.transpose(los_l,0,3) #([1, 1, 50, 1]) -> ([50, 1, 1, 1])

        outputs = net(wn_all)

        loss1 = criterion(los_l, outputs[0,0,0,0])
        optimizer.zero_grad() # zero the gradients
        loss1.backward()
        optimizer.step()

Please take a look at my code. It's probably not perfect, and it still doesn't work.
My code worked when I did not use batches, but training took a very long time because there is a lot of data. I decided to use minibatches to feed the data through the network and process the samples in parallel.

My question is probably very simple for you, but I can't figure out how to use minibatches.
Could you at least give me the simplest example? I don't understand how to push a minibatch through the net.

Most likely you are accidentally broadcasting the output by slicing it:

loss1 = criterion(los_l, outputs[0,0,0,0])

I assume outputs is a tensor of shape [50, 1, 1, 1].
If that’s the case, just pass it to your criterion, as the target should already have the same shape:

loss = criterion(outputs, los_l)
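
To make the broadcasting issue concrete, here is a minimal sketch. The tensors are random stand-ins with the shape [50, 1, 1, 1] assumed above, not the real data from the question. Slicing keeps only the first prediction, which is then compared against all 50 targets; PyTorch usually emits a size-mismatch warning in that case.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
criterion = nn.MSELoss()

# Random stand-ins (assumption): one prediction and one target per sample.
outputs = torch.randn(50, 1, 1, 1)
target = torch.randn(50, 1, 1, 1)

# Slicing keeps a single scalar prediction, which is broadcast
# against all 50 targets (PyTorch may warn about the size mismatch).
loss_sliced = criterion(outputs[0, 0, 0, 0], target)

# The same value computed by hand: first prediction vs. every target.
manual_sliced = ((outputs[0, 0, 0, 0] - target) ** 2).mean()

# Passing the whole batch compares each prediction with its own target.
loss_batch = criterion(outputs, target)

print(torch.allclose(loss_sliced, manual_sliced))
print(loss_sliced.item(), loss_batch.item())
```

The two losses generally differ, because the sliced version ignores 49 of the 50 predictions.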

It seems to me, though I could be mistaken, that the problem is that Conv2d accepts a 4D tensor and Linear accepts a 2D one. I can't pass the batch from Conv2d to Linear. Or am I confused again?

I need to feed a 64x32 matrix into the network and get 1x64 at the output. I use nn.Linear(32, 1), which gives (64, 1), and then .t() gives (1, 64).
That works well. But if I want to use batch = 50, I feed in 50x64x32. How do I get a 50x64 output?

I might have overlooked some issues.
You are currently initializing the linear layer as:

self.fc1 = nn.Linear(50,64, 32)

which will use in_features=50, out_features=64 and pass 32 as the bias argument. Since 32 is truthy, this results in bias=True.
You don’t have to set the batch size in the layers, as it will be automatically used as the first dimension of your input.

Also, nn.BatchNorm2d should get the number of output channels of the preceding conv layer, which would be 1 for bn1.
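
Putting these points together, a hedged sketch of how the layers could be declared follows. It assumes an input of shape [batch, 1, 64, 32] and one scalar target per sample, which is only a guess at the intended setup. The flatten step and the random data are illustrative, and the batch norm layers between the linear layers are left out for brevity.

```python
import torch
import torch.nn as nn


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # 1 input channel, 1 output channel; the (1, 32) kernel
        # collapses the last dimension from 32 to 1.
        self.conv1 = nn.Conv2d(1, 1, kernel_size=(1, 32))
        self.bn1 = nn.BatchNorm2d(1)   # matches conv1's out_channels
        # Linear takes (in_features, out_features); no batch size here.
        self.fc1 = nn.Linear(64, 32)
        self.fc2 = nn.Linear(32, 16)
        self.fc3 = nn.Linear(16, 1)
        self.tan = nn.Hardtanh()

    def forward(self, x):                # x:   [batch, 1, 64, 32]
        out = self.bn1(self.conv1(x))    # out: [batch, 1, 64, 1]
        out = out.flatten(start_dim=1)   # out: [batch, 64]
        out = self.fc1(out)
        out = self.fc2(out)
        out = self.fc3(out)              # out: [batch, 1]
        return self.tan(out)


net = Net()
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)

# Random stand-ins for one minibatch of 50 samples (assumption).
x = torch.randn(50, 1, 64, 32)
y = torch.rand(50, 1) * 2 - 1    # targets in [-1, 1]

outputs = net(x)                 # [50, 1]
loss = criterion(outputs, y)     # shapes match, so nothing is broadcast

optimizer.zero_grad()
loss.backward()
optimizer.step()
print(outputs.shape)
```

The batch size of 50 appears only in the data, never in a layer definition, so the same network accepts any batch size.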

Let’s look at a simple example, setting my first post aside.
I have the input input = torch.randn(5, 20).
At the output I need to get (5, 1), so I applied nn.Linear(20, 1).
Now I need to add a third dimension (batch = 10).
My input is input = torch.randn(10, 5, 20).
Please show me what the Linear layer should look like to get (10, 5) at the output.

m = nn.Linear(20,1)
input = torch.randn(10, 5, 20)
output = m(input)

output = torch.transpose(output,1,2)
output = torch.transpose(output,0,1)
output=output.contiguous().view(10,5)

Is this the right way to do it?

A simple use case would be:

batch_size = 10
in_features = 20
out_features = 1

lin = nn.Linear(
    in_features=in_features,
    out_features=out_features
)

x = torch.randn(batch_size, in_features)
out = lin(x)
print(out.shape)  # [batch_size, out_features]

The input is specified as [batch_size, in_features], so in your first example, you would use a batch of 5 samples, each containing 20 features.

The second example is a bit more complicated.
dim1 in this case refers to an “additional” dimension. The linear layer acts only on the last dimension, so this can be seen as applying the layer in a loop over each index of dim1:


x = torch.randn(batch_size, 5, in_features)
out = lin(x)
out1 = out.clone()

out2 = []
for idx in range(5):
    out = lin(x[:, idx])
    out2.append(out.clone())
out2 = torch.stack(out2, dim=1)

# Compare both outputs
print(torch.allclose(out1, out2))
> True
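
As for getting a (10, 5) output from the (10, 5, 20) input: the linear layer returns (10, 5, 1), so removing the trailing dimension should be enough. The sketch below uses squeeze(-1) for that and checks that the transpose and view route from the question produces the same values.

```python
import torch
import torch.nn as nn

m = nn.Linear(20, 1)
x = torch.randn(10, 5, 20)

out = m(x)                       # [10, 5, 1]
out_squeezed = out.squeeze(-1)   # [10, 5]

# The transpose/view route from the question, for comparison.
alt = out.transpose(1, 2).transpose(0, 1).contiguous().view(10, 5)

print(out_squeezed.shape)
print(torch.equal(out_squeezed, alt))
```

Both routes give the same tensor; squeeze(-1) simply avoids the extra transposes and the copy made by contiguous().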