My model losses always have mean 1

Hi everyone, I have an autoencoder model that I implemented using PyTorch, and I noticed something strange: it was working too well without any training. The model is as follows:

import torch.nn as nn
import torch.nn.functional as F

class ConvAutoencoder(nn.Module):
    def __init__(self):
        super(ConvAutoencoder, self).__init__()

        # Encoder
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=32, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(in_channels=32, out_channels=128, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=1, stride=3)  # kernel_size=1, stride=3 reduces height and width by ~3 (not /2)

        # Decoder
        self.t_conv1 = nn.ConvTranspose2d(in_channels=128, out_channels=64, kernel_size=(2, 3), stride=(1, 3))
        self.t_conv2 = nn.ConvTranspose2d(in_channels=64, out_channels=1, kernel_size=2, stride=(2, 2))

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = self.pool(x)
        x = F.relu(self.conv2(x))
        x = self.pool(x)
        x = F.relu(self.t_conv1(x))
        x = self.t_conv2(x)
        return x

My problem is anomaly detection. I have a dataset with the following form:

var1,var2,var3,var4,anomaly
-2.303138056500457,-6.356406683755182,4.718265100779811,-3.803123770009389,0
-0.6014388028983485,1.4546218686634245,3.803742475994967,5.437633496931176,1

If the autoencoder produces a very high loss for a sample, that sample is considered an anomaly. The strange thing is that, with all the weights of the model set to 0, I would expect the loss to be fairly random. Instead, it gives high losses precisely on the anomalous samples, so the model gets its anomaly detection task right without having been trained at all.
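To illustrate, here is a minimal sketch of the situation (assuming, as in my weights_init, that every parameter is zeroed; the two sample tensors are just stand-ins, not my real data):

import torch
import torch.nn as nn

# A single zero-initialized conv layer already shows the effect.
conv = nn.Conv2d(1, 1, kernel_size=3, padding=1)
nn.init.zeros_(conv.weight)
nn.init.zeros_(conv.bias)

criterion = nn.MSELoss()

normal = torch.randn(1, 1, 8, 4)         # stand-in for a normal sample
anomaly = torch.randn(1, 1, 8, 4) * 2.0  # stand-in for an anomalous sample (larger values)

for name, x in [("normal", normal), ("anomaly", anomaly)]:
    out = conv(x)  # all zeros here, so the loss reduces to (x ** 2).mean()
    print(name, criterion(out, x).item())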

The code where the losses are calculated is as follows:

    with torch.no_grad():
        for images in test_matrix_array:
            images = images[0].to(device)
            model = ConvAutoencoder.ConvAutoencoder().to(device)  # .to() needs the device argument
            model.apply(weights_init)
            outputs = model(images)
            loss = criterion(outputs, images)
            losses.append(loss.item())
            losses_index.append([images, loss])

For example, if I test with 3 samples, one anomalous and two not, I get: loss: tensor(0.8815), loss: tensor(0.9553), loss: tensor(1.1993). The one with the highest loss is the anomalous one, and this happens with all the anomalies, so they get detected correctly, because I have a threshold that determines which samples are anomalous, and this threshold is calculated from the average loss. However, these losses always have a mean of about 1: in the case above, (0.88 + 0.95 + 1.20)/3 ≈ 1.01.
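For reference, the thresholding works roughly like this (a sketch; the variable names are illustrative, not my exact code):

import numpy as np

losses = [0.8815, 0.9553, 1.1993]        # the three losses from above
threshold = np.mean(losses)              # ~1.01
flags = [l > threshold for l in losses]  # only the third (anomalous) sample exceeds it
print(threshold, flags)                  # 1.012..., [False, False, True]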

PS:

I think the problem lies in:

            outputs = model(images)

It is not deterministic: for the same input and an identical model (all neurons at 0), it gives me different output values.
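A quick way to see what I mean (a sketch, assuming model and an input tensor x are already built as above):

with torch.no_grad():
    out1 = model(x)
    out2 = model(x)
print(torch.equal(out1, out2))  # I would expect True, but I observe different outputs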

A question for clarification: when all the weights are 0, does the model output the same prediction for every input? Also, is the dataset made of images, or is it the tabular data shown in the example? If it is the latter, the model should probably use linear layers on the data features rather than convolutions over images.

No, it makes different predictions, even for the same input and model. The dataset can be found at share_dataset/anomalies.csv in the pablogarciastc/share_dataset repository on GitHub. It is not an image dataset but a tabular one: the number of variables is the width and the number of samples is the number of rows, which forms the matrices that are fed to the model.
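The matrices are built roughly like this (a sketch; the window size is illustrative, my actual code may differ):

import pandas as pd
import torch

df = pd.read_csv("anomalies.csv")
features = df.drop(columns=["anomaly"]).values  # width = number of variables

rows = 32  # illustrative number of samples per matrix
matrices = [
    torch.tensor(features[i:i + rows], dtype=torch.float32).unsqueeze(0)  # (1, rows, n_vars)
    for i in range(0, len(features) - rows + 1, rows)
]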

I think there is a mismatch between the model and the dataset: the dataset is tabular anomaly data, while the architecture is one for images, since it uses convolutional layers.

Maybe a model like this will help:

import torch.nn as nn
import torch.nn.functional as F

class AutoEncoder(nn.Module):
    """A simple autoencoder"""
    def __init__(self, input_size, hidden_size):
        super(AutoEncoder, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.encoder = nn.Linear(input_size, hidden_size)
        self.decoder = nn.Linear(hidden_size, input_size)

    def forward(self, x):
        x = self.encoder(x)
        x = F.relu(x)
        x = self.decoder(x)
        return x

Make sure that the output label is not passed as an input variable.
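For example, trained row by row instead of on image-like matrices (a sketch with illustrative sizes, not a definitive setup):

import torch
import torch.nn as nn

model = AutoEncoder(input_size=4, hidden_size=2)  # 4 variables, small bottleneck
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.randn(16, 4)  # a batch of 16 rows; the anomaly label is NOT included
for _ in range(100):    # minimal training loop
    optimizer.zero_grad()
    loss = criterion(model(x), x)
    loss.backward()
    optimizer.step()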