Issue with R2 Score returning strange values

Caue_Evangelista · September 15, 2023, 5:04pm

Hello, dear community!

I’ve built an autoencoder and would like a metric that assesses how similar my outputs are to my inputs (the loss alone is not very straightforward to interpret).

My data has dimensions [batch_size, channels, columns] = [40, 1, 22], which means each batch contains 40 rows of 22 positive values, with 1 channel.

While researching how to obtain an accuracy-like metric for a regression task, I came across the R2 score, which I understood to range from 0 to 1. However, I’m getting some unusual results in preliminary tests: sometimes the R2 score turns out to be negative.
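
For reference, the usual definition is R2 = 1 - SS_res / SS_tot, where SS_res is the sum of squared errors and SS_tot is the sum of squared deviations of the targets from their mean. Under that definition the score is at most 1 but has no lower bound, so a negative value just means the predictions are worse than always predicting the target mean. Below is a minimal sketch in plain Python; `r2` is a hypothetical helper written for illustration, not a function from torcheval or sklearn.

```python
# Hypothetical helper (not a library function): R2 = 1 - SS_res / SS_tot.
def r2(y_true, y_pred):
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

y_true = [0, 1, 2, 3]

perfect = r2(y_true, [0, 1, 2, 3])            # predictions equal the targets
mean_only = r2(y_true, [1.5, 1.5, 1.5, 1.5])  # always predict the target mean
reversed_ = r2(y_true, [3, 2, 1, 0])          # worse than predicting the mean

print(perfect, mean_only, reversed_)  # 1.0 0.0 -3.0
```

So 0 is the score of a constant "predict the mean" baseline rather than a hard floor.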

I’ve read another post on the PyTorch Forum, but I’m still confused about whether I can effectively use the R2 score or if I should consider an alternative approach.

Here are my tests and the results:

# PyTorch
import torch

# Metrics
from torcheval.metrics import R2Score
from sklearn.metrics import r2_score

input1 = torch.tensor([0, 2, 1, 3])
target1 = torch.tensor([0, 1, 2, 3])

input2 = torch.Tensor(4).random_(0, 3)
target2 = torch.Tensor(4).random_(0, 3)

input3 = torch.tensor([[0, 2, 1, 3], [0, 2, 1, 3]])
target3 = torch.tensor([[0, 1, 2, 3], [0, 2, 1, 3]])

metric = R2Score()

print("--------------------------------------")
print("PYTORCH")
metric.update(input1, target1)
print("R2Score for input1 and target1: ", round(metric.compute().item(), 2))
metric.update(input2, target2)
print("R2Score for input2 and target2: ", round(metric.compute().item(), 2))
metric.update(input3, target3)
print("R2Score for input3 and target3: ", round(metric.compute().item(), 2))

metric = R2Score()
batch_size = 40
input4 = torch.Tensor(batch_size, 22).random_(0, 10)
target4 = torch.Tensor(batch_size, 22).random_(0, 10)
metric.update(input4, target4)
print("R2Score for input4 and target: ", round(metric.compute().item(), 2))

print("--------------------------------------")
print("SKLEARN")
print("R2Score for input1 and target1: ", round(r2_score(input1, target1), 2))
print("R2Score for input2 and target2: ", round(r2_score(input2, target2), 2))
print("R2Score for input3 and target3: ", round(r2_score(input3, target3), 2))
print("R2Score for input4 and target4: ", round(r2_score(input4, target4), 2))
print("--------------------------------------")

--------------------------------------
PYTORCH
R2Score for input1 and target1:  0.6
R2Score for input2 and target2:  0.47
R2Score for input3 and target3:  nan
R2Score for input4 and target:  -1.01
--------------------------------------
SKLEARN
R2Score for input1 and target1:  0.6
R2Score for input2 and target2:  -1.0
R2Score for input3 and target3:  0.5
R2Score for input4 and target4:  -1.22
--------------------------------------
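
Two things may account for at least part of the mismatch between the two blocks of results above. First, a torcheval metric object accumulates state across `update()` calls until `reset()` is called or a new object is created, so the second and third PYTORCH lines would be scores over everything passed so far, not over each pair alone. Second, sklearn's `r2_score` takes `(y_true, y_pred)`, while torcheval's `update` takes `(input, target)` with `input` being the predictions, so the SKLEARN calls above have the roles swapped. The sketch below illustrates both effects in plain Python with made-up numbers; `r2` is a hypothetical helper, not a library function, and this does not try to explain the `nan`.

```python
# Hypothetical helper (not a library function): R2 = 1 - SS_res / SS_tot.
def r2(y_true, y_pred):
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# 1) Argument order matters: R2 is not symmetric in its two arguments.
truth = [0, 1, 2, 3]
preds = [1, 1, 2, 2]
correct_order = r2(truth, preds)
swapped_order = r2(preds, truth)
print(round(correct_order, 2), round(swapped_order, 2))  # 0.6 -1.0

# 2) Scoring pooled batches differs from scoring each batch on its own.
truth_a, preds_a = [0, 1, 2, 3], [0, 2, 1, 3]
truth_b, preds_b = [10, 11, 12, 13], [10, 12, 11, 13]
score_a = r2(truth_a, preds_a)
score_b = r2(truth_b, preds_b)
pooled = r2(truth_a + truth_b, preds_a + preds_b)
print(round(score_a, 2), round(score_b, 2), round(pooled, 2))  # 0.6 0.6 0.98
```

If this is what is happening, creating a fresh `R2Score()` (or calling `metric.reset()`) before each pair and passing the targets first to `r2_score` should bring the two libraries closer together.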

I really appreciate any insights! Thank you in advance!