Hey, I am pretty new to PyTorch.
When I don't use detach().clone() in the following line
iou = evaluate_reconstruction(predictions_reconstruction.detach().clone(), voxels)
I am getting this error message:
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [16, 32, 32, 32]], which is output 0 of MeanBackward1, is at version 2; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True)
Even though I know how to fix this problem, I don't really understand why it happens. Could somebody explain this in more detail? Any tutorial pages, etc. would be appreciated.
For context, here is my training loop:

for batch_idx, batch in tqdm(enumerate(train_dataloader)):
    _, renderings, class_labels, voxels = batch
    renderings, class_labels, voxels = renderings.to(device), class_labels.to(device), voxels.to(device)

    # Predict and estimate loss
    predictions_classification, predictions_reconstruction = model(renderings.float())
    train_loss_classification = criterion_classification(predictions_classification, class_labels)
    train_loss_running_classification += train_loss_classification.item()

    if predictions_reconstruction is not None:
        train_loss_reconstruction = criterion_reconstruction(predictions_reconstruction, voxels)
        train_loss_running_reconstruction += train_loss_reconstruction.item()
        train_loss = args.loss_coef_cls * train_loss_classification + args.loss_coef_rec * train_loss_reconstruction
        train_loss_running += train_loss.item()
        iou = evaluate_reconstruction(predictions_reconstruction.detach().clone(), voxels)
        train_reconstruction_iou += iou
    else:
        train_loss = train_loss_classification

    # Backprop and make a step
    optimizer.zero_grad()
    train_loss.backward()
    optimizer.step()
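My current guess (please correct me if this is wrong): evaluate_reconstruction modifies its input tensor in place, and since I only call train_loss.backward() afterwards, autograd finds that a tensor it saved for the backward pass has changed. Here is a minimal standalone sketch that seems to reproduce the same kind of error; the tensors and ops here are made up for illustration, not my actual model:

```python
import torch

# A tensor that autograd saved for backward gets modified in place,
# so its version counter no longer matches what backward() expects.
x = torch.ones(3, requires_grad=True)
y = x.mean()        # non-leaf tensor, output of MeanBackward
z = y * y           # MulBackward saves y to compute dz/dy = 2*y
y.add_(1)           # in-place op bumps y's version counter (0 -> 1)

err = None
try:
    z.backward()    # autograd detects the version mismatch here
except RuntimeError as e:
    err = e
print(err)          # "... modified by an inplace operation ..."

# Modifying a detached copy instead leaves the saved tensor untouched:
x2 = torch.ones(3, requires_grad=True)
y2 = x2.mean()
z2 = y2 * y2
y2.detach().clone().add_(1)   # in-place op on a copy, graph unaffected
z2.backward()                 # backward now succeeds
print(x2.grad)
```

This matches what I observe: passing predictions_reconstruction.detach().clone() gives evaluate_reconstruction its own copy, so whatever it does in place cannot touch the tensors that train_loss.backward() still needs.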