Artefacts when using a perceptual loss term

Hi everybody,

I have a question about checkerboard artefacts that appear when I use a perceptual loss function.
You can see the artefacts in the image below: tiny white dots that make the output look like the surface of a basketball.

My model:
I’m using an encoder-decoder architecture.
Downsampling is done with an nn.Conv2d() layer with stride 2.
Upsampling is done with an nn.ConvTranspose2d() layer with stride 2.
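A minimal sketch of the down/upsampling pair described above (channel counts, kernel sizes, and the input size are illustrative assumptions, not taken from my actual model). Note that a transposed conv whose kernel size is not divisible by its stride (e.g. kernel 3, stride 2) has uneven output overlap, which is a classic source of checkerboard patterns; kernel 4 with stride 2 overlaps evenly:

```python
import torch
import torch.nn as nn

# Downsampling: stride-2 convolution, halves the spatial resolution.
down = nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1)

# Upsampling: stride-2 transposed convolution, doubles the resolution.
# kernel_size=4 divides evenly by stride=2, so every output pixel
# receives the same number of kernel contributions (less checkerboarding
# than kernel_size=3 with stride=2).
up = nn.ConvTranspose2d(16, 3, kernel_size=4, stride=2, padding=1)

x = torch.randn(1, 3, 64, 64)
h = down(x)  # shape (1, 16, 32, 32)
y = up(h)    # shape (1, 3, 64, 64)
```

A commonly suggested alternative is to replace the transposed conv with nn.Upsample (nearest or bilinear) followed by a stride-1 conv, which avoids the uneven-overlap problem entirely.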

Loss function
First of all, these artefacts only appear when I'm using a perceptual loss term.
With only an L1 or L2 loss, no artefacts are visible.
I'm using a VGG-19 perceptual loss for which I take the output of the conv layers, i.e. before the ReLU activation and max-pooling layers.

loss = vgg_loss + L1_loss

What I’ve tried
I’ve tried different weights for the content (L1) and perceptual loss terms, without any improvement.
I’ve tried adding a total variation loss term, without any improvement.
(I used this implementation for the TVLoss on the generated image: Implement total variation loss in pytorch - #2 by Xinxiang7)
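For reference, an anisotropic TV loss in the spirit of that forum implementation can be written as below (this is a generic sketch, not the exact linked code):

```python
import torch


def tv_loss(img):
    """Anisotropic total variation on a (N, C, H, W) image batch:
    mean absolute difference between neighbouring pixels, computed
    separately along the height and width axes."""
    dh = (img[..., 1:, :] - img[..., :-1, :]).abs().mean()
    dw = (img[..., :, 1:] - img[..., :, :-1]).abs().mean()
    return dh + dw


# A constant image has zero total variation.
flat = torch.ones(1, 3, 8, 8)
noisy = torch.rand(1, 3, 8, 8)
```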

I’ve also tried a Multi-level Wavelet-CNN for better up- and downsampling without any improvement.
[[1805.07071] Multi-level Wavelet-CNN for Image Restoration]

I’ve read this paper [2002.02117] Fixed smooth convolutional layer for avoiding checkerboard artifacts in CNNs but couldn’t find any code implementation. Since I don’t know how to implement the authors’ method, I haven’t tried this approach. Does someone know how to implement it?

Does someone know what causes these artefacts and how to get rid of them?

Thank you very much,

What image size are you inputting to get the VGG-19 features? I found that if the input size is not close to the size the original VGG-19 network was trained on (224, 224, 3), then it produces such artifacts.
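If input size is the issue, one option is to resize (and ImageNet-normalise) the images before feature extraction. A hedged sketch; `prepare_for_vgg` is a hypothetical helper, and the mean/std values are the standard ImageNet statistics:

```python
import torch
import torch.nn.functional as F

# Standard ImageNet normalisation statistics used by torchvision's VGG models.
_MEAN = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
_STD = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)


def prepare_for_vgg(img):
    """Resize a (N, 3, H, W) batch in [0, 1] to 224x224 and normalise it
    before passing it to the VGG-19 feature extractor."""
    img = F.interpolate(img, size=(224, 224), mode="bilinear",
                        align_corners=False)
    return (img - _MEAN) / _STD


out = prepare_for_vgg(torch.rand(1, 3, 64, 64))
```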