FCN with patches creates boundary


I am trying to train a UNet model to make per-pixel regression predictions on images. To do this, I split my large image (1000x1000) into 200x200 pixel squares and use those to train an FCN model with a linear final layer. The loss function is MSE loss. At prediction time, I extract the same boxes, stitch the outputs back together, and obtain a final output image. The problem is that there are discontinuities at the boundaries between boxes (I can clearly see the boxes).

I’ve tried to deal with this by feeding 250x250 boxes to my FCN and calculating the loss only on the 200x200 centre region. I do the same at the prediction stage: extract 250x250 patches, crop the 200x200 centre region, and stitch the image back together. Please see some code below:

Loss Function:

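import torch.nn as nn
import torch.optim as optim

criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=LR)
for inputs, labels in train_loader:
    inputs, labels = inputs.to(device), labels.to(device)
    optimizer.zero_grad()
    output = model(inputs)
    output = output.squeeze(1)  # drop the channel dim only, keep the batch dim
    _, dimx, dimy = output.shape
    # Compute the loss on the centre region only, ignoring a 25-pixel border
    loss = criterion(output[:, 25:dimx-25, 25:dimy-25], labels[:, 25:dimx-25, 25:dimy-25])
    loss.backward()
    optimizer.step()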

My code for predictions is as follows:

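pred = np.zeros((height, width))
# Slide over the image in steps of 200; each patch carries a 25-pixel margin
# Note: rows and columns 0-24 of pred are never written by this loop
for i in range(25, height, 200):
    for j in range(25, width, 200):
        patch = img[:, i-25:i+225, j-25:j+225]
        patch = torch.from_numpy(patch)
        patch = patch.unsqueeze(dim=0).to(device)
        out = model(patch)
        # Keep only the 200x200 centre of the output
        out = out[0, 0, 25:225, 25:225]
        pred[i:i+200, j:j+200] = out.detach().cpu().numpy()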

I’m not sure if my problem makes complete sense. I can provide more clarification if necessary but I have been stuck on this for a while now.

The original UNet paper uses a similar strategy, if I’m not mistaken: a sliding-window (overlap-tile) approach where only the center of each window is predicted and the missing context at the image border is filled in with mirrored pixel values.
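For reference, the sliding-window strategy with mirrored borders mentioned above could look roughly like the sketch below. This is not the paper's code and not the poster's code: `predict_tiled` is a hypothetical helper, the `tile=200` and `margin=25` defaults are taken from the numbers in this thread, and it assumes the model returns an `(N, 1, h, w)` tensor with the same spatial size as its input and that `img` is a float32 array of shape `(C, H, W)`.

```python
import numpy as np
import torch


def predict_tiled(model, img, tile=200, margin=25, device="cpu"):
    """Predict a (H, W) map from a (C, H, W) float32 array, tile by tile.

    Hypothetical helper: each tile is fed with `margin` pixels of extra
    context on every side, and only the central tile x tile region of the
    output is kept.
    """
    _, height, width = img.shape
    # Extra padding so that the image divides evenly into full tiles.
    pad_h = (-height) % tile
    pad_w = (-width) % tile
    # Mirror the image at its borders to supply the missing context.
    padded = np.pad(
        img,
        ((0, 0), (margin, margin + pad_h), (margin, margin + pad_w)),
        mode="reflect",
    )
    pred = np.zeros((height + pad_h, width + pad_w), dtype=np.float32)
    model.eval()  # layers such as batch norm then use their stored statistics
    with torch.no_grad():
        for i in range(0, height + pad_h, tile):
            for j in range(0, width + pad_w, tile):
                patch = padded[:, i:i + tile + 2 * margin, j:j + tile + 2 * margin]
                patch = torch.from_numpy(patch).unsqueeze(0).to(device)
                out = model(patch)
                # Keep only the centre of the output, dropping the margin.
                out = out[0, 0, margin:margin + tile, margin:margin + tile]
                pred[i:i + tile, j:j + tile] = out.cpu().numpy()
    # Remove the rows and columns that were added to reach full tiles.
    return pred[:height, :width]
```

Compared with a loop that starts at offset 25, this version also fills the outermost rows and columns of the prediction, because the mirrored padding gives every tile a full margin.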

Did you see any improvement if you predicted the center of the sliding window compared to the complete patch?

I actually solved this problem. My fault for not showing my FCN model code. The problem was that I had batch normalization between the convolution layers. This created discontinuities between the patches when stitching the images back together.
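To illustrate one way batch normalization can cause this (an assumption on my part, since the model code was not posted): when the layer normalizes with the statistics of the current input, as it does in training mode, the same pixel gets a different output depending on which patch it arrives in. The toy network and image below are made up for the demonstration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Toy stand-in for the FCN: conv -> batch norm -> 1x1 conv.
net = nn.Sequential(
    nn.Conv2d(1, 4, 3, padding=1),
    nn.BatchNorm2d(4),
    nn.Conv2d(4, 1, 1),
)

image = torch.randn(1, 1, 64, 128)
image[..., 64:] += 5.0  # the right half is much brighter than the left half


def gap_between_full_and_patch(model):
    """Max difference between full-image and left-patch outputs, away from borders."""
    with torch.no_grad():
        full = model(image)[..., 8:56, 8:56]
        patch = model(image[..., :64])[..., 8:56, 8:56]
    return (full - patch).abs().max().item()


net.train()  # batch norm normalizes with statistics of the current input
train_gap = gap_between_full_and_patch(net)
net.eval()   # batch norm uses its stored running statistics instead
eval_gap = gap_between_full_and_patch(net)
print(f"train-mode gap: {train_gap:.4f}, eval-mode gap: {eval_gap:.6f}")
```

In this sketch the same interior pixels are compared in both cases, so any gap comes from the normalization statistics and not from the convolution's receptive field.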

Hi @Remote_Senser, do you have your FCN model code available? We are looking at using an FCN (UNet perhaps) for pixel-level continuous surface prediction, but there are not many examples out there, as most FCN applications are for classification problems as opposed to regression problems.

If not, what is the general approach to modifying an FCN built for pixel-level classification so that it can be used for pixel-level regression?
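The first post describes the setup as an FCN with a linear final layer trained with MSE loss. A minimal sketch of that idea, assuming the usual classification head is a 1x1 convolution with one channel per class, might look like this. `TinyRegressionFCN` is a hypothetical toy network and not the original poster's model; a real UNet would have an encoder/decoder with skip connections.

```python
import torch
import torch.nn as nn


class TinyRegressionFCN(nn.Module):
    """Hypothetical toy FCN used only to show the regression head."""

    def __init__(self, in_channels=3, hidden=16):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU(),
        )
        # A classifier would typically end in Conv2d(hidden, num_classes, 1)
        # trained with a cross-entropy loss. For regression, use a single
        # output channel with no activation (a linear final layer).
        self.head = nn.Conv2d(hidden, 1, kernel_size=1)

    def forward(self, x):
        return self.head(self.features(x))


model = TinyRegressionFCN()
criterion = nn.MSELoss()  # regression loss in place of cross-entropy

x = torch.randn(2, 3, 64, 64)        # dummy input batch
target = torch.randn(2, 1, 64, 64)   # dummy per-pixel continuous targets
loss = criterion(model(x), target)
loss.backward()
```

The rest of the network can stay as it is; only the output channel count, the final activation, and the loss change.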