Hi K. Frank,
Thanks for your suggestions. What I mean by “bottleneck” is that regression with 512 units as outputs is a more difficult task than classification, especially since you are not squashing your output in any way (at least that’s what I think, correct me if I’m wrong).

I tried creating a custom loss as torch.mean(torch.fmod(output - target, 2 * np.pi) ** 2), but it didn’t seem to converge to a loss lower than pi / 2. And considering that torch.fmod() bounds the per-output error to less than 2 * pi, a loss of pi / 2 is still huge.

Still, I totally get your point about penalizing the model for neighboring predictions, and you’re right. Maybe I could try what has been used for ordinal regression? I’ve never read anything about it, but I know that it exists.

You can check the thread I opened here, and maybe you can understand my problem better: Transfer learning using VGG-16 (or 19) for regression
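For reference, here is a small sketch of the loss I tried, next to a wrapped variant (my own guess at a fix, not something from the docs). One thing I noticed while writing it down: torch.fmod() keeps the sign of (output - target), so the residual lives in (-2*pi, 2*pi) rather than being wrapped into [-pi, pi), which might explain part of the convergence problem:

```python
import torch
import numpy as np

def fmod_loss(output, target):
    # The loss as described above: fmod keeps the sign of the dividend,
    # so two nearly identical angles on opposite sides of the 0 / 2*pi
    # boundary can still produce a residual close to 2*pi.
    return torch.mean(torch.fmod(output - target, 2 * np.pi) ** 2)

def wrapped_loss(output, target):
    # Hypothetical alternative: shift, take the positive remainder, and
    # shift back, so the difference is wrapped into [-pi, pi) and
    # neighboring angles always get a small penalty.
    diff = torch.remainder(output - target + np.pi, 2 * np.pi) - np.pi
    return torch.mean(diff ** 2)
```

For example, with output = 0.1 and target = 2*pi - 0.1 (angles only 0.2 rad apart), fmod_loss gives a large penalty while wrapped_loss gives a small one.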
Thanks