CNN produces the same output for every image after initialization

I’ve implemented a version of the VGG16 net for regression (so it predicts a single output value) and noticed that, after random initialization, the network produces the same output value for every image. During training this starts to change as the loss decreases. I wonder whether this behavior is due to the random initialization or whether there is actually something wrong with my network?
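
For reference, the check I’m running is roughly this (a minimal sketch; I’m using torchvision’s vgg16 here as a stand-in for my own variant):

import torch
import torchvision

# stand-in for my VGG16 regression variant: one output neuron
model = torchvision.models.vgg16(num_classes=1).cuda()
model.eval()  # disable dropout so the forward pass is deterministic

with torch.no_grad():
    for _ in range(5):
        img = torch.rand(1, 3, 224, 224, device='cuda')  # stands in for one of my images
        print(model(img))  # with my network, every printed value is identical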

Is this happening after a few training steps (e.g., due to a bias in the dataset) or immediately after initialization? I wouldn’t expect it right after initialization; a stock torchvision VGG16 gives a different output for each random input:

# cat vggtest.py
import torch
import torchvision

# randomly initialized VGG16 with a single regression output
model = torchvision.models.vgg16(init_weights=True, num_classes=1)
model = model.cuda()

# feed ten different random inputs; the outputs should differ
for _ in range(10):
    inp = torch.randn(1, 3, 224, 224, device='cuda')
    out = model(inp)
    print(out)
# python3 vggtest.py
tensor([[0.2661]], device='cuda:0', grad_fn=<AddmmBackward0>)
tensor([[0.1230]], device='cuda:0', grad_fn=<AddmmBackward0>)
tensor([[0.1168]], device='cuda:0', grad_fn=<AddmmBackward0>)
tensor([[-0.0185]], device='cuda:0', grad_fn=<AddmmBackward0>)
tensor([[-0.0407]], device='cuda:0', grad_fn=<AddmmBackward0>)
tensor([[-0.0701]], device='cuda:0', grad_fn=<AddmmBackward0>)
tensor([[0.0390]], device='cuda:0', grad_fn=<AddmmBackward0>)
tensor([[0.0971]], device='cuda:0', grad_fn=<AddmmBackward0>)
tensor([[0.2076]], device='cuda:0', grad_fn=<AddmmBackward0>)
tensor([[0.1138]], device='cuda:0', grad_fn=<AddmmBackward0>)
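
If your freshly initialized model really does collapse to a single value, a quick way to localize the problem (a sketch using torchvision’s VGG attributes; adapt the names to your variant) is to check whether the convolutional trunk already maps different inputs to the same features:

import torch
import torchvision

model = torchvision.models.vgg16(num_classes=1).cuda().eval()

# two clearly different inputs
x = torch.randn(2, 3, 224, 224, device='cuda')

with torch.no_grad():
    feats = model.avgpool(model.features(x)).flatten(1)  # (2, 25088) pooled conv features
    # a near-zero spread across the batch would mean the trunk collapses its inputs
    print(feats.std(dim=0).mean())
    print(model.classifier(feats))  # final regression outputs

Also note that VGG16 has dropout layers in the classifier, so in train mode even a single repeated input produces varying outputs; model.eval() removes that source of randomness, which makes identical outputs for different images easier to interpret.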