Hi everyone!

I'm working on a regression problem with a slightly modified VGG16 architecture (one extra Conv2d at the beginning and one extra linear layer at the end). Labels and expected outputs are in the range [0, 99]. The problem is that after every training iteration, during validation, the network predicts the same value for every input. For example, the outputs are `[12.51, 12.51, 12.51, 12.51, 12.51, 12.51, 12.51, 12.51]` when the labels are `[1, 15, 3, 67, 3, 66, 14, 34]`. The output value changes with every epoch, but it is always the same for every input.
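
To put a number on the symptom, here is a minimal sketch that compares the spread of the predictions with the spread of the labels. The helper name `collapse_ratio` is hypothetical (not part of my training code), and it only uses the example values quoted above; a ratio near 0 means the outputs have collapsed to a single value.

```python
from statistics import pstdev


def collapse_ratio(predictions, labels):
    """Return std(predictions) / std(labels).

    A value near 0 means the network outputs (almost) the same number
    for every input; a value near 1 means the predictions vary about
    as much as the labels do.
    """
    label_spread = pstdev(labels)
    if label_spread == 0:
        raise ValueError("labels are constant; the ratio is undefined")
    return pstdev(predictions) / label_spread


# Example values quoted in the post (one validation batch).
preds = [12.51, 12.51, 12.51, 12.51, 12.51, 12.51, 12.51, 12.51]
labels = [1, 15, 3, 67, 3, 66, 14, 34]

print(f"spread ratio: {collapse_ratio(preds, labels):.3f}")
```

Logging this ratio once per validation pass would show whether the collapse is there from the first epoch or develops during training.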

My modified network:

VGG(
  (features): Sequential(
    (0): Conv2d(256, 3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (2): ReLU(inplace=True)
    (3): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (4): ReLU(inplace=True)
    (5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (6): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (7): ReLU(inplace=True)
    (8): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (9): ReLU(inplace=True)
    (10): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (11): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (12): ReLU(inplace=True)
    (13): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (14): ReLU(inplace=True)
    (15): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (16): ReLU(inplace=True)
    (17): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (18): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (19): ReLU(inplace=True)
    (20): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (21): ReLU(inplace=True)
    (22): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (23): ReLU(inplace=True)
    (24): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (25): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (26): ReLU(inplace=True)
    (27): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (28): ReLU(inplace=True)
    (29): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (30): ReLU(inplace=True)
    (31): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (avgpool): AdaptiveAvgPool2d(output_size=(7, 7))
  (classifier): Sequential(
    (0): Linear(in_features=25088, out_features=4096, bias=True)
    (1): ReLU(inplace=True)
    (2): Dropout(p=0.5, inplace=False)
    (3): Linear(in_features=4096, out_features=4096, bias=True)
    (4): ReLU(inplace=True)
    (5): Dropout(p=0.5, inplace=False)
    (6): Linear(in_features=4096, out_features=1000, bias=True)
    (7): Linear(in_features=1000, out_features=1, bias=True)
  )
)
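
As a sanity check on the layer sizes in the printout, here is a small pure-Python sketch that traces the tensor shape through `features`. The layer list is transcribed from the printout above; the 256x224x224 input size and the helper name `trace_features` are assumptions for illustration only. It shows where `in_features=25088` comes from (512 channels times the 7x7 map produced by `avgpool`).

```python
# Layers of `features`, transcribed from the printout:
# ("conv", out_channels) for a 3x3 / stride 1 / padding 1 convolution,
# ("pool",) for a 2x2 / stride 2 max-pool. ReLU layers do not change shape.
FEATURES = [
    ("conv", 3),  # the extra first layer: 256 -> 3 channels
    ("conv", 64), ("conv", 64), ("pool",),
    ("conv", 128), ("conv", 128), ("pool",),
    ("conv", 256), ("conv", 256), ("conv", 256), ("pool",),
    ("conv", 512), ("conv", 512), ("conv", 512), ("pool",),
    ("conv", 512), ("conv", 512), ("conv", 512), ("pool",),
]


def trace_features(channels, height, width):
    """Return (channels, height, width) after the `features` block."""
    for layer in FEATURES:
        if layer[0] == "conv":
            # 3x3 kernel with stride 1 and padding 1 keeps height and width.
            channels = layer[1]
        else:
            # 2x2 max-pool with stride 2 (ceil_mode=False) halves, rounding down.
            height //= 2
            width //= 2
    return channels, height, width


# Assumed input: 256 channels (what layer 0 expects), 224x224 pixels.
c, h, w = trace_features(256, 224, 224)

# AdaptiveAvgPool2d(output_size=(7, 7)) always yields a 7x7 map, so the
# flattened size fed to the first Linear layer is channels * 7 * 7.
flat_features = c * 7 * 7
print((c, h, w), flat_features)
```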

Any ideas?