What is loss.item()

We iterate through all of the images and labels once per epoch:

for images, labels in trainloader:

Why do we mix the losses of different images and calculate them all together?
Sorry, I'm new, and I want to build my final project with PyTorch.

The average of the batch losses gives you an estimate of the "epoch loss" during training.
Since you are calculating the loss anyway, you could just sum it and compute the mean after the epoch finishes.
This training loss is used to see how well your model performs on the training dataset.

Alternatively, you could plot the individual batch loss values, but this is usually unnecessary and produces a lot of output.
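A minimal sketch of this pattern, accumulating the batch losses via `.item()` and averaging after the epoch (the model, data, and hyperparameters below are toy placeholders, not from this thread):

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Toy model and data, just to make the pattern runnable (names are illustrative)
dataset = TensorDataset(torch.randn(64, 10), torch.randint(0, 2, (64,)))
trainloader = DataLoader(dataset, batch_size=16)
model = nn.Linear(10, 2)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

running_loss = 0.0
for images, labels in trainloader:
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    running_loss += loss.item()  # .item() returns a plain Python float

epoch_loss = running_loss / len(trainloader)  # mean of the batch losses
print(epoch_loss)
```

Dividing by `len(trainloader)` averages over the number of batches, since each `loss` is already the mean over its batch.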


Thanks a lot. One more question: do you think a Google Colab GPU is a good fit for a real-time facial recognition project?

I've just used Colab a few times in the past for debugging purposes. I'm not sure if the runtime has changed, but I don't think Colab is a good fit for "real time" deployment, e.g. since the notebook runtime is limited.

So what is your suggestion?

You could use a cloud deployment service such as Microsoft Azure, GCP, AWS, or others.

Thank you for your help :blush:

Hi, on the topic of deployment, can you point me to any tutorial, documentation, etc. on using Azure DevOps (not Azure MLOps) for PyTorch deployment?

I’m unfortunately not experienced in deployment with Azure DevOps. :confused:

How is that different from the .data field?

The .data attribute shouldn’t be used, as it might yield unwanted side effects.
The right way to get a Python scalar is via .item().
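One concrete side effect, as a sketch: an in-place modification through `.data` bypasses autograd's version-counter checks, so a later `backward()` can silently use the mutated value instead of raising an error:

```python
import torch

x = torch.tensor(3.0, requires_grad=True)
y = x ** 2            # backward will use the saved value of x

# A normal in-place op (x.fill_(1.0)) would make backward() raise an error,
# but going through .data is invisible to autograd's version checks:
x.data.fill_(1.0)

y.backward()
print(x.grad)         # 2.0, silently wrong: the true gradient at x=3 is 6.0
```

This is the kind of silently incorrect gradient that `.item()` (which returns a plain, detached Python number) avoids.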

Do you have a link discussing why (and what) there might be “unwanted side effects” more precisely?


@albanD lists some aspects in this post.

What if we use .detach() when working with PyTorch Lightning? It would give us the data without any computation graph. Would it be correct to use .detach() instead of .item()?

.detach() will return a tensor, which is detached from the computation graph, while .item() will return a Python scalar. I don't know how and where this is needed in PyTorch Lightning; depending on the use case, detach() might also work.
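The type difference can be checked directly (a small sketch; the loss here is just a random stand-in):

```python
import torch

loss = (torch.randn(4, requires_grad=True) ** 2).mean()

t = loss.detach()   # still a torch.Tensor, but cut from the graph
s = loss.item()     # a plain Python float

print(type(t), t.requires_grad)
print(type(s))
```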

Thanks for such a quick reply. Yes, I understand the concept now. Since the PyTorch Lightning docs mention that tensors can be passed for logging, I am sure I can do the following:

self.log('val_loss', loss.detach(), prog_bar=False, on_step=False, on_epoch=True, logger=True)

However, I am a little unsure whether the following implementation is correct, in which I replaced .item() with .detach() before the loss value is returned by the model. I am not getting any syntax error, but I am a little worried it might interfere with the gradient calculation and affect performance.

# Loss : Mask outputs to ignore non-existing objects (except with conf. loss)
            loss_x = self.mse_loss(x[obj_mask], tx[obj_mask])
            loss_y = self.mse_loss(y[obj_mask], ty[obj_mask]) ....

            self.metrics = {
                "loss": to_cpu(total_loss).detach(),
                "x": to_cpu(loss_x).detach(),
                "y": to_cpu(loss_y).detach(), ..... }

            return output, total_loss

NOTE - The reason I am trying to replace .item() is that I am training the model on multiple GPUs and it was taking very long to train just one epoch. While going through the PyTorch Lightning docs, I came across this:

Don’t call .item() anywhere in your code. Use .detach() instead to remove the connected graph calls. Lightning takes a great deal of care to be optimized for this. https://pytorch-lightning.readthedocs.io/en/stable/performance.html

Calls into item() might slow down your code, as they synchronize with the GPU.
While detach() could potentially avoid synchronizations, a push to the CPU would still wait for the GPU to finish the calculation and would thus synchronize, so I don’t think your self.metrics would behave differently in this case.
However, let me know if you see any changes.
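A common pattern to avoid a synchronization on every iteration is to accumulate detached loss tensors and call .item() only once at the end of the epoch (a sketch; on the CPU both variants behave the same, the difference only matters for CUDA tensors):

```python
import torch

losses = []
for _ in range(5):
    loss = (torch.randn(4) ** 2).mean()   # stand-in for a real training loss
    losses.append(loss.detach())          # keeps a tensor; no sync needed on CUDA

# A single conversion, and thus a single sync point, at the end of the epoch:
epoch_loss = torch.stack(losses).mean().item()
```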

Yes, you were correct. I could only reduce the time a little by using more num_workers in the DataLoader and using 2 GPUs. But it is still taking around 3.5 hours per epoch for 50,000 images, which feels like a lot.

From what I understand, I don't need to specify the device for any tensor, as PyTorch Lightning takes care of it. So would it be okay to replace to_cpu(x).item() with just x.detach()?

You can try it out, but I assume the implementation in Lightning might be there for a reason.
E.g. if you need to print these values after returning them, you would still need to synchronize the code, since these values have to be calculated first.

Okay, I'll give it a try. Thanks!