Error in metric evaluation

Rexedoziem · October 30, 2022, 9:51am

I get "Found input variables with inconsistent numbers of samples: [558, 14]" when I pass valid_labels and predictions to the mean squared error metric. The shapes are valid_labels = (558, 6) and predictions = (14, 6). How do I fix this?

def MCRMSE(y_trues, y_preds):
    scores = []
    idxes = y_trues.shape[1]
    for i in range(idxes):
        y_true = y_trues[:,i]
        y_pred = y_preds[:,i]
        score = mean_squared_error(y_true, y_pred, squared=False) # RMSE
        scores.append(score)
    mcrmse_score = np.mean(scores)
    return mcrmse_score, scores

def get_score(y_trues, y_preds):
    mcrmse_score, scores = MCRMSE(y_trues, y_preds)
    return mcrmse_score, scores

valid_labels = valid_df[config.targets].values

def eval_fx(data_loader, model, device):
    model.eval()
    valid_loss = 0

    #final_targets = []
    final_outputs = []
    with torch.no_grad():
        for bi, (d, targets) in tqdm(enumerate(data_loader)):
            input_ids = d['input_ids']
            attention_mask = d['attention_mask']
            #token_type_ids = d['token_type_ids']
            targets = targets

            input_ids = input_ids.to(device, dtype=torch.long)
            attention_mask = attention_mask.to(device, dtype=torch.long)
            #token_type_ids = token_type_ids.to(device, dtype=torch.long)
            targets = targets.to(device, dtype=torch.float)
            # putting them into the device
            outputs = model(input_ids, attention_mask)

            loss = loss_fn(outputs, targets)
            #criterion = RMSELoss
            #loss = criterion(outputs,targets)
            valid_loss += loss.item() * d['input_ids'].size(0)
        valid_loss = valid_loss / len(data_loader.sampler)

        final_outputs.append(outputs.cpu().numpy())
        predictions = np.concatenate(final_outputs)

    return valid_loss, predictions

import numpy as np
from tqdm import tqdm
save_model = False
best_score = np.inf
early_stopping = 5
early_stopping_counter = 0
for epoch in range(config.EPOCHS):
    train_loss = train_fx(train_data_loader, model, optimizer, scheduler, device)
    valid_loss, predictions = eval_fx(valid_data_loader, model, device)

    score, scores = get_score(valid_labels, predictions)

Rexedoziem · October 30, 2022, 9:52am

The error is raised on the line that computes scores, and I don’t know how to fix it.
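A note on the numbers in the error, offered as a guess rather than a confirmed diagnosis: 558 = 17 × 32 + 14 (and also 34 × 16 + 14), so predictions with 14 rows look like a single final batch, assuming a batch size of 32 or 16, which is not shown in the post. In eval_fx as posted, final_outputs.append(...) sits after the for loop, so only the last batch's outputs would be kept. The sketch below uses plain NumPy arrays as hypothetical stand-ins for model outputs to show the difference between keeping every batch and keeping only the last one. collect_predictions and the batch size are illustrative assumptions, not part of the original code.

```python
import numpy as np

def collect_predictions(batches):
    """Concatenate per-batch outputs into one (n_samples, n_targets) array."""
    final_outputs = []
    for outputs in batches:
        # Append inside the loop so every batch is kept, not just the last one.
        final_outputs.append(np.asarray(outputs))
    # Concatenate once, after all batches have been collected.
    return np.concatenate(final_outputs, axis=0)

# Hypothetical stand-in for model outputs: 558 rows, 6 targets, batch size 32.
n_samples, n_targets, batch_size = 558, 6, 32
fake_outputs = np.zeros((n_samples, n_targets))
batches = [fake_outputs[i:i + batch_size] for i in range(0, n_samples, batch_size)]

predictions = collect_predictions(batches)
last_batch_only = batches[-1]  # what is left if only the final batch is kept

print(predictions.shape)      # (558, 6)
print(last_batch_only.shape)  # (14, 6)
```

In eval_fx, the equivalent change would be to move the append into the loop body and call np.concatenate once after the loop finishes.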

bradmacer · November 14, 2022, 6:23am

Sounds like the shapes of your labels and predictions are not aligned. I faced a similar problem while fitting a linear regression model. In my case, the number of rows in X was not equal to the number of rows in y. In most cases, x holds your features and y is the target you predict. The feature array should not be 1D, so check the shape of x, and if it is 1D, convert it to 2D:

x = x.reshape(-1, 1)
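As a minimal sketch of that check, with small made-up arrays standing in for real data (x and y here are hypothetical, not taken from the question), you can print the shapes, reshape the 1D features, and confirm the row counts agree before fitting or scoring:

```python
import numpy as np

# Hypothetical data: four samples with a single feature each.
x = np.array([1.0, 2.0, 3.0, 4.0])  # 1D features, shape (4,)
y = np.array([2.0, 4.0, 6.0, 8.0])  # targets, shape (4,)

# reshape returns a new array, so keep the result.
x_2d = x.reshape(-1, 1)             # shape (4, 1): 4 rows, 1 column

print(x.shape, x_2d.shape, y.shape)

# Fail early with a readable message if the row counts differ.
assert x_2d.shape[0] == y.shape[0], (
    f"row mismatch: X has {x_2d.shape[0]} rows, y has {y.shape[0]}"
)
```

The same kind of assertion on valid_labels.shape[0] and predictions.shape[0] right before the metric call would surface a mismatch like 558 vs 14 at the point where it first appears.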

Also, you can run into this problem if you remove rows containing nulls from X_train and y_train independently of each other. y_train probably has few or no nulls, while X_train probably has some. When a row is removed from X_train but the same row is not removed from y_train, the two go out of sync and end up with different lengths. Instead, remove nulls before you separate X and y.
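Here is a small sketch of the difference, assuming the data lives in a pandas DataFrame. The column names "feature" and "target" and the values are made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical data: "feature" has one missing value, "target" has none.
df = pd.DataFrame({
    "feature": [1.0, np.nan, 3.0, 4.0],
    "target": [10.0, 20.0, 30.0, 40.0],
})

# Dropping nulls separately leaves X and y with different lengths.
X_bad = df[["feature"]].dropna()
y_bad = df["target"].dropna()
print(len(X_bad), len(y_bad))  # 3 4

# Dropping nulls on the full frame first keeps the rows aligned.
clean = df.dropna()
X = clean[["feature"]]
y = clean["target"]
print(len(X), len(y))  # 3 3
```

Because dropna runs on the whole frame, any row with a missing feature is removed from both X and y at once.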