Artefacts when using a perceptual loss term

Hi everybody,

I have a question about checkerboard-like artefacts that appear when I use a perceptual loss function.
You can see the artefacts in the following image: tiny white dots that make the output look like the surface of a basketball.

My model:
I’m using an encoder-decoder architecture.
Downsampling is done with an nn.Conv2d() layer with stride 2.
Upsampling is done with an nn.ConvTranspose2d() layer with stride 2.
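
To make the setup concrete, here is a minimal sketch of the kind of architecture described above. Only the layer types and the stride of 2 come from my model; the class name, kernel sizes, padding, channel width, and network depth are placeholders chosen for illustration.

```python
import torch
import torch.nn as nn


class EncoderDecoder(nn.Module):
    """Toy encoder-decoder: strided conv for downsampling, transposed conv for upsampling."""

    def __init__(self, channels: int = 3, width: int = 32):
        super().__init__()
        # Downsampling: stride-2 convolution halves the spatial size.
        # kernel_size=3 and padding=1 are assumptions, not the real model's values.
        self.down = nn.Sequential(
            nn.Conv2d(channels, width, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )
        # Upsampling: stride-2 transposed convolution doubles the spatial size.
        # kernel_size=4 and padding=1 are assumptions as well.
        self.up = nn.ConvTranspose2d(width, channels, kernel_size=4, stride=2, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.up(self.down(x))


x = torch.randn(1, 3, 64, 64)
y = EncoderDecoder()(x)
print(y.shape)  # same spatial size as the input
```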

Loss function
First of all, these artefacts only appear when I’m using a perceptual loss term.
With only L1 or L2, no artefacts are visible.
I’m using a VGG-19 perceptual loss term, computed on the output of the conv layer before the ReLU activation function and the max pooling layer.

loss = vgg_loss + L1_loss
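
A rough sketch of how such a combined loss could be written is below. Several things here are assumptions rather than my exact code: the names `build_vgg19_features` and `PerceptualL1Loss`, the choice of layer index 34 (the last conv in torchvision's `vgg19().features`, i.e. before its ReLU and max pooling), the L1 distance on the feature maps, and the two weights. ImageNet input normalisation is left out for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def build_vgg19_features(last_layer: int = 34, weights="DEFAULT") -> nn.Module:
    """Truncate torchvision's VGG-19 so that it ends on a conv layer (no ReLU after it).

    Index 34 is an assumption; needs torchvision >= 0.13 for the `weights` argument.
    """
    from torchvision.models import vgg19  # imported lazily so the rest runs without torchvision

    return vgg19(weights=weights).features[: last_layer + 1]


class PerceptualL1Loss(nn.Module):
    """loss = perceptual_weight * vgg_loss + l1_weight * L1_loss"""

    def __init__(self, feature_extractor: nn.Module,
                 perceptual_weight: float = 1.0, l1_weight: float = 1.0):
        super().__init__()
        self.features = feature_extractor.eval()
        # The feature extractor is fixed; gradients only flow back to the generated image.
        for p in self.features.parameters():
            p.requires_grad_(False)
        self.perceptual_weight = perceptual_weight
        self.l1_weight = l1_weight

    def forward(self, pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        # Distance between pre-activation feature maps (L1 here is an assumption).
        vgg_loss = F.l1_loss(self.features(pred), self.features(target))
        # Pixel-wise content loss.
        l1_loss = F.l1_loss(pred, target)
        return self.perceptual_weight * vgg_loss + self.l1_weight * l1_loss


# Usage (downloads the pretrained weights on first call):
# criterion = PerceptualL1Loss(build_vgg19_features())
# loss = criterion(generated, ground_truth)
```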

What I’ve tried
I’ve tried different weights for the content and perceptual loss without any improvement.
I’ve tried adding a total variation loss term without any improvement.
(I used this implementation for the TVLoss on the generated image: Implement total variation loss in pytorch - #2 by Xinxiang7)
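
For readers who don't want to follow the link, this is a generic total variation term of the kind meant here. It is a sketch of one common formulation (anisotropic, mean absolute difference between neighbouring pixels) and not necessarily identical to the linked implementation; the function name is a placeholder.

```python
import torch


def total_variation_loss(img: torch.Tensor) -> torch.Tensor:
    """Anisotropic TV: mean absolute difference between vertical and horizontal neighbours.

    Expects a tensor whose last two dimensions are height and width, e.g. (N, C, H, W).
    """
    dh = (img[..., 1:, :] - img[..., :-1, :]).abs().mean()  # vertical neighbours
    dw = (img[..., :, 1:] - img[..., :, :-1]).abs().mean()  # horizontal neighbours
    return dh + dw


# A constant image has zero total variation.
flat = torch.ones(1, 3, 8, 8)
# A 0/1 checkerboard differs by 1 between every pair of neighbours.
checker = ((torch.arange(8)[:, None] + torch.arange(8)[None, :]) % 2).float()
checker = checker.expand(1, 3, 8, 8)

print(total_variation_loss(flat))     # tensor(0.)
print(total_variation_loss(checker))  # tensor(2.)
```

The term is added to the loss with a small weight, computed on the generated image.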

I’ve also tried a Multi-level Wavelet-CNN for better up- and downsampling without any improvement.
([1805.07071] Multi-level Wavelet-CNN for Image Restoration)

I’ve read the paper “[2002.02117] Fixed smooth convolutional layer for avoiding checkerboard artifacts in CNNs” but couldn’t find a code implementation. Since I don’t know how to implement the authors’ method, I haven’t tried this approach. Does anyone know how to implement it?
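
To make the question more concrete, here is a sketch of the general idea as far as I understand it: a fixed, non-trainable smoothing filter placed after the transposed convolution. This is not the authors' implementation and has not been checked against the paper. The class name `FixedBoxSmoothing`, the plain box kernel whose size equals the upsampling stride, the replicate padding, and the placement after the upsampling layer are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FixedBoxSmoothing(nn.Module):
    """Non-trainable depthwise box filter; kernel size equals the upsampling stride (assumption)."""

    def __init__(self, channels: int, stride: int = 2):
        super().__init__()
        kernel = torch.full((channels, 1, stride, stride), 1.0 / stride**2)
        # A buffer, not a parameter: the kernel is never updated by the optimiser.
        self.register_buffer("kernel", kernel)
        self.channels = channels
        self.stride = stride

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Asymmetric replicate padding keeps the spatial size unchanged.
        pad = self.stride - 1
        lo, hi = pad // 2, pad - pad // 2
        x = F.pad(x, (lo, hi, lo, hi), mode="replicate")
        return F.conv2d(x, self.kernel, groups=self.channels)


# Hypothetical upsampling block: transposed conv followed by the fixed filter.
up = nn.Sequential(
    nn.ConvTranspose2d(32, 3, kernel_size=4, stride=2, padding=1),
    FixedBoxSmoothing(channels=3, stride=2),
)
print(up(torch.randn(1, 32, 16, 16)).shape)  # spatial size is doubled

# A pattern that repeats every 2 pixels is flattened by the 2x2 box filter
# (away from the padded border).
checker = ((torch.arange(8)[:, None] + torch.arange(8)[None, :]) % 2).float()
smoothed = FixedBoxSmoothing(channels=1)(checker[None, None])
print(smoothed[0, 0, :-1, :-1].unique())
```

The reasoning behind the sketch is that a stride-2 transposed convolution can produce patterns with a period of 2 pixels, and averaging over a 2x2 window removes exactly that period. Whether this matches what the paper proposes is the open question.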

Does anyone know what causes these artefacts and how to get rid of them?

Thank you very much,