CNN never converges (implementation issue suspected)

Hi all, I am having trouble getting this network to work as desired. I have tried many iterations of this model and still cannot get a reasonable error: it never fits, and I can’t even get it to overfit.

Where have I gone wrong? Any help would be greatly appreciated.

For reference, there are 12 input ‘images’ (they’re actually surface elevation data) of shape (49, 9) and 12 labels of shape (1, 9).

In case you’d like to see the data or all of my failed attempts: https://gitlab.com/jb4earth/effonn/

import torch
from torch import nn

class Net(torch.nn.Module):
    def __init__(self, kernel_size):
        super(Net, self).__init__()
        mid_size = (49*49*9)
        self.predict = torch.nn.Sequential(
            nn.Conv2d(
                        in_channels=1,
                        out_channels=mid_size,
                        kernel_size=kernel_size,
                        stride=1,
                        padding=(0, 0)
                    ),
            nn.ReLU(),
            nn.MaxPool2d(1),
            nn.ReLU(),
            nn.Conv2d(
                        in_channels=mid_size,
                        out_channels=1,
                        kernel_size=kernel_size,
                        stride=1,
                        padding=(0, 0)
                    ),
            nn.ReLU()
        )
        

    def forward(self, x):
        x = self.predict(x)
        return x

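def train_network(x, y, optimizer, loss_func):
    # forward pass; with a (1, 1) kernel the prediction has shape (1, 1, 49, 9)
    prediction = net(x)
    # y.squeeze() has shape (9,)
    loss = loss_func(prediction, y.squeeze())
    optimizer.zero_grad()  # clear gradients from the previous step
    loss.backward()        # backpropagate
    optimizer.step()       # update the weights
    return prediction, loss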


net = Net((1,1))
optimizer = torch.optim.Adam(net.parameters(), lr=0.01)
loss_func = torch.nn.MSELoss()
cnt = 0
while True:
    # get_xy in place of DataLoader
    (x,y) = get_xy(input_data,output_data,cnt)
    # x.shape is 1,1,49,9
    # y.shape is 1,1,1,9
    
    # train and predict
    (prediction,loss) = train_network(x,y,optimizer,loss_func)
    
    # prediction shape different than desired so averaging all results
    prediction_ = torch.mean(prediction)

    # only 12 IO's so loop through 
    cnt += 1
    if cnt > 11:
        cnt = 0

I’m not familiar with your use case, but I assume you are working on some regression problem using images.

Based on your model definition, it seems you are increasing the number of channels quite aggressively (going from 1 input channel to 21609 output channels), which is quite uncommon.
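To put the channel count in perspective, here is a rough calculation in plain Python. It assumes the (1, 1) kernel and the 49 x 9 input from the question; `conv2d_param_count` is a hypothetical helper written for this illustration, not a PyTorch function.

```python
# Back-of-the-envelope sizes for the two Conv2d layers in the question.
# conv2d_param_count is a hypothetical helper, not part of PyTorch.

def conv2d_param_count(in_channels, out_channels, kernel_size):
    """Weights plus biases of a standard (groups=1) 2D convolution."""
    kh, kw = kernel_size
    return out_channels * (in_channels * kh * kw + 1)

mid_size = 49 * 49 * 9          # 21609, as in the question
kernel_size = (1, 1)            # Net((1, 1)) in the question
height, width = 49, 9           # spatial size of one input sample

first_conv = conv2d_param_count(1, mid_size, kernel_size)
second_conv = conv2d_param_count(mid_size, 1, kernel_size)

# With a 1x1 kernel and no padding the spatial size is unchanged, so the
# first layer emits mid_size feature maps of 49 x 9 values per sample.
hidden_values = mid_size * height * width

print(mid_size)        # 21609
print(first_conv)      # 43218
print(second_conv)     # 21610
print(hidden_values)   # 9529569
```

So a single 49 x 9 sample is expanded to roughly 9.5 million intermediate values before being collapsed back to one channel.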

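Also, nn.MaxPool2d(1) won’t have any effect: with a kernel size of 1, each pooling window contains a single value, so the layer returns its input unchanged. The second ReLU activation won’t change anything either, since its input has already passed through a ReLU and is non-negative.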
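Both points are easy to check on a random tensor. This is a small sketch; the tensor shape is an arbitrary choice that only borrows the 49 x 9 spatial size from the question.

```python
import torch
from torch import nn

torch.manual_seed(0)
x = torch.randn(1, 4, 49, 9)   # arbitrary feature map, not the real data

pool = nn.MaxPool2d(1)         # kernel size 1; stride defaults to the kernel size
relu = nn.ReLU()

pooled = pool(x)
once = relu(x)
twice = relu(relu(x))

# MaxPool2d(1) returns exactly its input
pool_is_identity = torch.equal(pooled, x)
# applying ReLU a second time changes nothing
relu_is_idempotent = torch.equal(once, twice)
# ReLU -> MaxPool2d(1) -> ReLU, as in the question, versus a single ReLU
chain_matches = torch.equal(relu(pool(relu(x))), once)

print(pool_is_identity, relu_is_idempotent, chain_matches)
```

All three comparisons should come out `True`, so the pooling layer and the second ReLU can be removed without changing the model's output.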

I would recommend increasing the number of channels more gradually, and starting from an established base model such as VGG16 and adapting it to your use case.
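As an illustration of a more gradual channel progression, here is a minimal sketch. It is not VGG16 and not a tuned solution. The class name `SmallNet`, the channel widths (8 and 16), the learning rate, and the random placeholder tensors are all assumptions made for this example. The final (49, 1) convolution is one possible way to make the output shape match the (1, 1, 1, 9) labels, so that no averaging or broadcasting is needed in the loss.

```python
import torch
from torch import nn


class SmallNet(nn.Module):
    """Hypothetical small regressor: (N, 1, 49, 9) -> (N, 1, 1, 9)."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),   # 1 -> 8 channels
            nn.ReLU(),
            nn.Conv2d(8, 16, kernel_size=3, padding=1),  # 8 -> 16 channels
            nn.ReLU(),
        )
        # A (49, 1) kernel collapses the height so that each of the 9
        # columns yields one value. No final ReLU (an assumption): it
        # would restrict predictions to non-negative values.
        self.head = nn.Conv2d(16, 1, kernel_size=(49, 1))

    def forward(self, x):
        return self.head(self.features(x))


torch.manual_seed(0)
x = torch.randn(12, 1, 49, 9)   # placeholder inputs, NOT the real data
y = torch.randn(12, 1, 1, 9)    # placeholder labels, NOT the real data

model = SmallNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_func = nn.MSELoss()

losses = []
for step in range(200):
    optimizer.zero_grad()
    prediction = model(x)            # same shape as y, so no broadcasting
    loss = loss_func(prediction, y)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(prediction.shape)
print(losses[0], losses[-1])
```

Training on all 12 samples at once like this is a quick sanity check: if the loss does not fall steadily on such a small dataset, the problem is more likely in the data pipeline or the loss setup than in the model capacity.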