Different RMSE values for the same evaluation but different batch sizes

Hi everyone! I am evaluating my network and I noticed that my RMSE metric is not always the same: it depends on the batch_size.
Here is my code:

            data_loader = DataLoader(....., batch_size, True, workers, stage="test")
            rmse = []
            for batch_idx, sample in enumerate(data_loader):
                sample['color'] = sample['color'].cuda()
                sample['gt'] = sample['gt'].cuda()

                mu = self.model(sample['color'])

                sample['gt'] = sample['gt'].view(-1)
                mu = mu.view(-1)
                rmse.append(compute_rsme(sample['gt'], mu))  # per-batch RMSE
            print(sum(rmse) / len(rmse))                     # average of the per-batch values

Any idea why? Any help would be great =)

My guess is that your code is broadcasting some tensors somewhere.
Could you check the shapes of sample['gt'] and mu and make sure they have the same number of dimensions and the same shape?
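For example (a made-up sketch, not your actual shapes): if one of the two tensors still has a trailing singleton dimension, the subtraction broadcasts (N,) against (N, 1) into an (N, N) matrix and silently produces a wrong error value:

            import torch

            gt = torch.randn(4)        # shape (N,)
            mu = torch.randn(4, 1)     # shape (N, 1), e.g. an output that was never flattened

            diff = gt - mu             # broadcasts to shape (N, N) instead of (N,)
            print(diff.shape)                          # torch.Size([4, 4])
            print(torch.sqrt(torch.mean(diff ** 2)))   # "RMSE" over N*N pairs -> wrong

            diff_ok = gt - mu.view(-1)                 # shapes match -> intended per-element RMSE
            print(torch.sqrt(torch.mean(diff_ok ** 2)))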

Hi, they have the same shape. How can I check what is being broadcast? I also checked model.eval() and with torch.no_grad(), just in case.

Could you save an example tensor for sample['gt'] as well as mu and post an executable code snippet to reproduce the difference you are seeing?
Maybe you could also post a way to randomly initialize both tensors to recreate the issue.
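Something along these lines (made-up data and a stand-in rmse helper, just to illustrate what I mean) would already let us compare the result for two batch sizes:

            import torch

            def rmse(gt, pred):
                return torch.sqrt(torch.mean((gt - pred) ** 2))

            torch.manual_seed(0)
            gt = torch.randn(1000)               # stand-in for the real ground truth
            mu = gt + 0.1 * torch.randn(1000)    # stand-in for the model predictions

            for batch_size in (10, 128):
                per_batch = [rmse(g, p) for g, p in
                             zip(gt.split(batch_size), mu.split(batch_size))]
                print(batch_size, (sum(per_batch) / len(per_batch)).item())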

Sorry, I forgot to reply: I was averaging the per-batch RMSE values, but I should average the MSE values and then take the square root to obtain the global RMSE.
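In case it helps someone else, here is a minimal sketch of the fix, assuming the same data_loader and self.model as in the snippet above (accumulating the squared-error sum and the element count also handles a smaller last batch correctly):

            sq_err_sum = 0.0
            n_elements = 0
            with torch.no_grad():
                for batch_idx, sample in enumerate(data_loader):
                    gt = sample['gt'].cuda().view(-1)
                    mu = self.model(sample['color'].cuda()).view(-1)

                    # accumulate the sum of squared errors and the element count,
                    # so the final RMSE does not depend on the batch size
                    sq_err_sum += torch.sum((gt - mu) ** 2).item()
                    n_elements += gt.numel()

            rmse = (sq_err_sum / n_elements) ** 0.5
            print(rmse)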