Hi everyone, I have an autoencoder model that I implemented in PyTorch, and I noticed something strange: it works too well without any training. The model is as follows:
import torch.nn as nn
import torch.nn.functional as F

class ConvAutoencoder(nn.Module):
    def __init__(self):
        super(ConvAutoencoder, self).__init__()
        # Encoder
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=32, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(in_channels=32, out_channels=128, kernel_size=3, padding=1)
        # kernel_size=1 with stride=3 keeps every third value, so height and width shrink by about 3x (not 2x)
        self.pool = nn.MaxPool2d(kernel_size=1, stride=3)
        # Decoder
        self.t_conv1 = nn.ConvTranspose2d(in_channels=128, out_channels=64, kernel_size=(2,3), stride=(1,3))
        self.t_conv2 = nn.ConvTranspose2d(in_channels=64, out_channels=1, kernel_size=2, stride=(2, 2))

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = self.pool(x)
        x = F.relu(self.conv2(x))
        x = self.pool(x)
        x = F.relu(self.t_conv1(x))
        x = self.t_conv2(x)
        return x
My problem is anomaly detection. I have a dataset of the following form:
var1,var2,var3,var4,anomaly
-2.303138056500457,-6.356406683755182,4.718265100779811,-3.803123770009389,0
-0.6014388028983485,1.4546218686634245,3.803742475994967,5.437633496931176,1
If the autoencoder produces a very high loss for a sample, that sample is considered an anomaly. With all the model's weights set to 0, I would expect the loss to be essentially random. However, it gives high losses precisely on the anomalous samples, so it gets the anomaly detection task right without any training.
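To make that expectation concrete, here is a minimal sketch of what an all-zero model computes. It rests on several assumptions: that `weights_init` sets every weight and bias to zero, that `criterion` is `nn.MSELoss()`, and that a small shape-preserving stand-in model is an acceptable substitute for `ConvAutoencoder`. The helper `zero_init` and the input tensor are hypothetical. Under those assumptions the output is all zeros, so the loss is simply the mean of the squared inputs and is not random at all.

```python
import torch
import torch.nn as nn

def zero_init(module):
    # Hypothetical stand-in for weights_init: zero all weights and biases.
    if isinstance(module, (nn.Conv2d, nn.ConvTranspose2d)):
        nn.init.zeros_(module.weight)
        if module.bias is not None:
            nn.init.zeros_(module.bias)

# Small stand-in model that preserves the input shape (not the real ConvAutoencoder).
model = nn.Sequential(
    nn.Conv2d(1, 4, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(4, 1, kernel_size=3, padding=1),
)
model.apply(zero_init)
model.eval()

criterion = nn.MSELoss()  # assumption: the real criterion is not shown

torch.manual_seed(0)
x = torch.randn(1, 1, 4, 4)  # hypothetical input sample

with torch.no_grad():
    outputs = model(x)
    loss = criterion(outputs, x)

# With an all-zero output, MSE(outputs, x) reduces to mean(x ** 2).
mean_square = (x ** 2).mean()
print(outputs.abs().max().item())
print(loss.item(), mean_square.item())
```

If the real setup matches these assumptions, the loss would depend only on how large the input values are, which may be worth checking against the anomalous samples.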
The code where the losses are calculated is as follows:
with torch.no_grad():
    optimizer.zero_grad()
    for images in test_matrix_array:
        images = images.to(device)
        # A fresh model is built and re-initialized for every sample
        model = ConvAutoencoder.ConvAutoencoder().to(device)
        model.apply(weights_init)
        outputs = model(images)
        loss = criterion(outputs, images)
        losses.append(loss.item())
        # `data` is not defined in this snippet
        losses_index.append([data, loss])
For example, if I test with 3 samples, one anomalous and two not, I get: loss: tensor(0.8815), loss: tensor(0.9553), loss: tensor(1.1993). The one with the highest loss is the anomalous one, and this happens with all the anomalies. So they are detected as anomalies, because I use a threshold, calculated from the average loss, to decide which samples are anomalous. However, the losses of these samples always have a mean of about 1. In the case above, (0.88 + 0.95 + 1.20)/3 ≈ 1.01.
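The thresholding step can be sketched with the three loss values quoted above. This assumes the threshold is exactly the mean loss and that a sample is flagged when its loss is strictly above it; the real rule may differ (for example, by adding a margin).

```python
# Loss values quoted in the example above.
losses = [0.8815, 0.9553, 1.1993]

# Assumption: the threshold is the plain average of the losses.
threshold = sum(losses) / len(losses)

# Assumption: a sample is flagged when its loss exceeds the threshold.
flags = [loss > threshold for loss in losses]

print(f"threshold = {threshold:.4f}")
print(flags)
```

With these numbers only the third sample ends up above the mean.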
I think the problem lies in:
outputs = model(images)
It does not seem to be deterministic: for the same input and an identical model (all weights at 0), it gives me different output values.
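One way to test this suspicion is to build two fresh models, confirm that every parameter really is zero after initialization, and compare their outputs on the same input. The sketch below uses a hypothetical `zero_init` in place of `weights_init` and a small stand-in model in place of `ConvAutoencoder`, so it only shows the shape of the check, not the behavior of the actual code.

```python
import torch
import torch.nn as nn

def zero_init(module):
    # Hypothetical stand-in for weights_init: zero all weights and biases.
    if isinstance(module, (nn.Conv2d, nn.ConvTranspose2d)):
        nn.init.zeros_(module.weight)
        if module.bias is not None:
            nn.init.zeros_(module.bias)

def make_model():
    # Stand-in model; swap in the real ConvAutoencoder to run the same check.
    model = nn.Sequential(
        nn.Conv2d(1, 4, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.ConvTranspose2d(4, 1, kernel_size=3, padding=1),
    )
    model.apply(zero_init)
    return model.eval()

model_a = make_model()
model_b = make_model()

# Check that the initializer left no non-zero parameter behind (biases included).
all_zero = all(
    torch.count_nonzero(p).item() == 0
    for m in (model_a, model_b)
    for p in m.parameters()
)

x = torch.randn(1, 1, 4, 4)  # hypothetical input sample
with torch.no_grad():
    out_a = model_a(x)
    out_b = model_b(x)

# Two identically initialized models should agree on the same input.
same_output = torch.equal(out_a, out_b)
print(all_zero, same_output)
```

If `all_zero` comes out false with the real `weights_init`, some parameters (often the biases) keep their random default values, which would make every freshly built model different.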