Save output image of CNN model

Hi, I made a CNN model for an edge detection task. It works fine and I can see good results in the feature maps. The problem is that I’m using 4 filters in each convolutional layer, so the output is an image with 4 channels, but I need the image to be saved with 3 channels so I can work with it later. Here’s an example of the code I used:

#Feed the image to the model (input size: (1, 3, 224, 224))
output = model(my_img)
#Convert the output tensor to a NumPy array
fimg = output.detach().cpu().numpy()
#Get rid of the single batch dimension: (1, 4, 13, 13) -> (4, 13, 13)
fimg = fimg.squeeze(0)
#Swap axes to channels-last: (4, 13, 13) -> (13, 13, 4)
fimg = fimg.transpose(1, 2, 0)

It’s going to be saved as (13, 13, 4), but I need it as (13, 13, 3). If anyone could help, it would be appreciated.

I’m not well versed in CNN models for edge detection, but perhaps you could look at deconvolution layers using torch.nn.ConvTranspose2d?
Note that technically it’s not a deconvolution operation, but it can be used when you want to ‘decode’ after ‘encoding’, so it could be useful for your image-to-image task.

https://pytorch.org/docs/stable/generated/torch.nn.ConvTranspose2d.html
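As a minimal sketch (the layer sizes here are my own assumptions, not from your model), it can upsample a feature map while reducing the channel count:

import torch
import torch.nn as nn

#Hypothetical decoder layer: upsample a (1, 4, 13, 13) feature map
#while reducing 4 channels to 3
deconv = nn.ConvTranspose2d(in_channels=4, out_channels=3,
                            kernel_size=4, stride=2, padding=1)

x = torch.randn(1, 4, 13, 13)  #stand-in for the model output
y = deconv(x)                  #shape: (1, 3, 26, 26)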

I don’t quite understand your question. Do you need the 4th channel later on? Or do you want to combine it with the rest of the channels before saving?
If not, just use the first 3 channels and plot them. You can also save the 4th channel as a greyscale image separately.
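Something like this (a minimal sketch; I’m assuming fimg is the (13, 13, 4) array from your post with values in [0, 1], and the file names are just placeholders):

import numpy as np
from PIL import Image

rgb = fimg[:, :, :3]    #first 3 channels
fourth = fimg[:, :, 3]  #4th channel on its own

#Scale to 8-bit and save; mode='L' writes a greyscale image
Image.fromarray((rgb * 255).astype(np.uint8)).save('output_rgb.png')
Image.fromarray((fourth * 255).astype(np.uint8), mode='L').save('output_ch4.png')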

@Sanjan_Das Ok, I will try to make it work, but is it possible to add a ConvTranspose2d layer to a CNN model? If you can provide an example, that would help me better understand, sir.

@Shima_Shahfar Yes, I want to combine the 4th channel with the rest so that the final image has 3 channels.

@Fahd_Jerbi Have a look at U-Nets, which are popular for image segmentation tasks. They use an ‘encoder’ half and a ‘decoder’ half, which lets the model produce outputs the same size as its input.

Here’s a link you might like to go through if you want to explore it further: https://towardsdatascience.com/creating-and-training-a-u-net-model-with-pytorch-for-2d-3d-semantic-segmentation-model-building-6ab09d6a0862
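As a rough illustration of the encoder/decoder idea (a toy sketch, not a real U-Net; the channel counts and layer sizes are my own assumptions):

import torch
import torch.nn as nn

class TinyEncoderDecoder(nn.Module):
    #Toy encoder/decoder: downsample with Conv2d, then upsample back
    #with ConvTranspose2d so the output matches the input size
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1),   #224 -> 112
            nn.ReLU(),
            nn.Conv2d(8, 16, kernel_size=3, stride=2, padding=1),  #112 -> 56
            nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(16, 8, kernel_size=4, stride=2, padding=1),  #56 -> 112
            nn.ReLU(),
            nn.ConvTranspose2d(8, 3, kernel_size=4, stride=2, padding=1),   #112 -> 224
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = TinyEncoderDecoder()
out = model(torch.randn(1, 3, 224, 224))  #out has shape (1, 3, 224, 224)

A real U-Net also adds skip connections between the encoder and decoder halves, which the linked article covers.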

It is possible, but does your loss function work the same way for all the channels? Or do you have a specific loss for the 4th channel? I mean, don’t you need the edge map somewhere in your calculation?
If it’s the same for all channels, then I think adding a conv2d layer should solve your problem.

How would adding another conv2d help in this situation? I’m already using three conv2d layers in my model, but the problem here is with reconstructing the image for external use. You said earlier it’s possible; can you provide a link or an example?

It is possible, but I’m not sure it’s the best way to go for your problem.
From what I understand, you only want to reconstruct the RGB image from the output, am I right? If so, do you know what each channel of your output represents? Isn’t one of the channels the edge map?

In case you want to do that, you can either change the output shape of your 3rd conv2d or add another layer whose input channels match your last layer and whose output channels match your desired dimension. But you may need to adjust your loss function as well, depending on which loss function you are using.
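For example, a 1x1 convolution can map the 4 channels down to 3 without changing the spatial size (a minimal sketch; the names are just for illustration, and the new layer would need to live inside your model so it gets trained):

import torch
import torch.nn as nn

#A 1x1 conv maps 4 channels to 3 while keeping the (13, 13) spatial size.
#Note: in practice this layer belongs inside the model so it is trained too.
to_rgb = nn.Conv2d(in_channels=4, out_channels=3, kernel_size=1)

output = torch.randn(1, 4, 13, 13)  #stand-in for the model output
rgb_out = to_rgb(output)            #shape: (1, 3, 13, 13)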

@Shima_Shahfar Thank you so much, it worked fine for the problem.