Image Detection with Coordinate as Label

sleepyo · May 25, 2021, 8:21am

Hi guys,

I’m working on a detection problem where the label is an image coordinate (x, y), with each value in the range -1 to 1.
The model looks something like this:

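import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

class BB_model(nn.Module):
    def __init__(self):
        super().__init__()
        resnet = models.resnet34(pretrained=False)
        # Keep the convolutional trunk (conv1 through layer4); drop avgpool and fc.
        layers = list(resnet.children())[:8]
        self.features1 = nn.Sequential(*layers[:6])
        self.features2 = nn.Sequential(*layers[6:])
        # Regression head: 512 pooled features -> (x, y)
        self.bb = nn.Sequential(nn.BatchNorm1d(512), nn.Linear(512, 2))

    def forward(self, x):
        x = self.features1(x)
        x = self.features2(x)
        x = F.relu(x)
        # Global average pool to 1x1, then flatten to (batch, 512)
        x = F.adaptive_avg_pool2d(x, (1, 1))
        x = x.view(x.shape[0], -1)
        x = self.bb(x)
        return x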

I’m using L1Loss as the loss function and MAE as the metric. Somehow the model is not learning anything…
Do I need to add a tanh or sigmoid on the output as well?
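
To make the question concrete, the tanh variant I have in mind would look roughly like the sketch below. This is only a minimal illustration: `TanhHead` is a hypothetical name, the random tensors stand in for the pooled ResNet features and the real labels, and I have not checked whether it helps training. It also shows that L1Loss with the default mean reduction is the same number as the MAE metric.

```python
import torch
import torch.nn as nn

class TanhHead(nn.Module):
    """Hypothetical head: 512 pooled features -> (x, y), each bounded by tanh."""
    def __init__(self, in_features=512):
        super().__init__()
        self.bn = nn.BatchNorm1d(in_features)
        self.fc = nn.Linear(in_features, 2)

    def forward(self, x):
        # tanh squashes each output into [-1, 1], matching the label range
        return torch.tanh(self.fc(self.bn(x)))

torch.manual_seed(0)
head = TanhHead()
feats = torch.randn(4, 512)          # stand-in for pooled backbone features
target = torch.rand(4, 2) * 2 - 1    # dummy labels in [-1, 1)

pred = head(feats)
loss = nn.L1Loss()(pred, target)     # mean absolute error over all elements
mae = (pred - target).abs().mean()   # the same quantity computed by hand
```

A sigmoid would instead give outputs in [0, 1], so it would need rescaling (for example `2 * s - 1`) to cover the same label range.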