Laurick
(Laurick)
1
Hello,
I am failing to understand how to use backward() or autograd.
My software crashes every time it executes "auto gradients = torch::autograd::grad({ output }, { input }, { grad_output }, true);".
I also tried examples with only loss.backward(), and those crashed as well.
// Choose the device: CPU or GPU
torch::Device device(torch::kCPU); // use torch::kCPU for the CPU
auto model = torch::nn::Linear(4, 3);
model->to(device);
// Generate input and target data and move them to the chosen device
auto input = torch::randn({ 3, 4 }).to(device);
input.requires_grad_(true);
auto target = torch::randn({ 3, 3 }).to(device);
auto output = model(input);
// Calculate loss
auto loss = torch::nn::MSELoss()(output, target);
// Use the gradient norm as a penalty
auto grad_output = torch::ones_like(output).to(device);
// Compute the gradients
auto gradients = torch::autograd::grad({ output }, { input }, { grad_output }, true);
auto gradient = gradients[0];
auto gradient_penalty = torch::pow((gradient.norm(2, /*dim=*/1) - 1), 2).mean();
// Add gradient penalty to loss
auto combined_loss = loss + gradient_penalty;
combined_loss.backward();
Do you know what could cause these crashes?
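For reference, here is the same computation sketched in Python with torch (an illustration of the intended gradient-penalty setup, not your exact build). One detail worth double-checking in the C++ call: the fourth positional argument of torch::autograd::grad is retain_graph, not create_graph, and a gradient penalty needs create_graph so that the gradient norm stays differentiable:

```python
import torch

# Mirror of the C++ setup: a 4 -> 3 linear layer on CPU
model = torch.nn.Linear(4, 3)
x = torch.randn(3, 4, requires_grad=True)
target = torch.randn(3, 3)

out = model(x)
loss = torch.nn.functional.mse_loss(out, target)

# create_graph=True keeps the gradient itself in the autograd graph,
# which the gradient penalty below requires
(grad_x,) = torch.autograd.grad(out, x, torch.ones_like(out), create_graph=True)
penalty = ((grad_x.norm(2, dim=1) - 1) ** 2).mean()

# Backpropagate through both the loss and the penalty
(loss + penalty).backward()
```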
Laurick
Laurick
(Laurick)
2
Before crashing, the software prints the following values:
Input:
0.6896 0.0109 -0.8927 -0.6019
-1.2692 -0.3748 1.2181 1.2204
1.5659 0.6650 0.2665 0.3639
[ CUDAFloatType{3,4} ]
Target:
1.1027 1.2941 0.5516
-1.4952 -0.2794 0.0011
-0.2567 1.1541 0.6320
[ CUDAFloatType{3,3} ]
Output:
0.1708 0.4900 -0.3538
-0.1719 0.5131 -0.8069
0.8201 -0.4959 -0.9887
[ CUDAFloatType{3,3} ]
grad_output:
1 1 1
1 1 1
1 1 1
[ CUDAFloatType{3,3} ]
Could you describe what exactly is crashing and post the error message here, please?
Laurick
(Laurick)
4
Hello,
Thank you for your answer.
Sure, this is the message I get:
"Microsoft C++ exception: c10::Error at memory location 0x0000009C920FD390"
This seems to point to a memory violation on the host. Could you try to capture the stack trace?
Laurick
(Laurick)
6
The versions I use are:
C++17
Visual Studio 2019
Libtorch: 2.0.0+cu118
The stack trace resolves everything in the code except when calling into external code:
c10.dll!00007ffe8ce9ce9e() Unknown
torch_cpu.dll!00007ffe5f2b1e16() Unknown
torch_cpu.dll!00007ffe5f2b19fa() Unknown
torch_cpu.dll!00007ffe5f2eb48a() Unknown
torch_cpu.dll!00007ffe5cad7f34() Unknown
torch_cpu.dll!00007ffe5c8f8b3f() Unknown
It seems that the debugger can't read symbols from c10.dll and torch_cpu.dll; I don't know why torch_cuda.dll is not called here.
This seems related to the GitHub issue "libtorch model predict cuda convert to cpu: C10::error at memory location" (pytorch/pytorch #73912).
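If torch_cuda.dll really is never loaded, one direction to check (this is the workaround from PyTorch's Windows notes, not something confirmed for this exact setup): with MSVC, the linker can drop the dependency on torch_cuda because the application never references a CUDA symbol directly, so the DLL is never pulled in at runtime. A sketch of the documented workaround in CMake (the target name my_app is a placeholder):

```cmake
# Workaround from PyTorch's Windows notes: force the MSVC linker to keep a
# reference into torch_cuda so torch_cuda.dll is actually loaded at runtime.
if (MSVC)
  target_link_options(my_app PRIVATE "/INCLUDE:?warp_size@cuda@at@@YAHXZ")
endif()
```

If the stack trace afterwards shows torch_cuda.dll frames, the crash is more likely in the CUDA path itself than in DLL loading.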