Hi everyone, I've run into a problem. I have a network (named eikonalmodel in the code below), and I want the norm of the gradient of the network at the input points to equal some ground-truth value. Here is my code (one step selects part of the Jacobian, which should be easy to follow):
# define loss function
loss_fun = nn.MSELoss().to(device)
# (batch, 5), which is the model input
EIKONALFORWARDINPUT_microbatch.requires_grad_(True)
# (batch, 5)-->(batch, 1)
OUT = eikonalmodel(EIKONALFORWARDINPUT_microbatch)
# JAC has shape (batch, 1, 5). The line below is my own code that computes the Jacobian via torch.autograd
JAC = eikonalmodel.Jacobian_by_batch(inputt_of_model = EIKONALFORWARDINPUT_microbatch, outputt_of_model = OUT.to(device), create_graph = True, retain_graph = True, allow_unused = False)
# (batch, 2): use only some of the Jacobian dims
JAC_xy = JAC[:,0,1:3].to(device)
# (batch, )
jac_norm = torch.sqrt(torch.pow(JAC_xy, 2).sum(dim = 1))
# (batch,).
groundtruth = torch.sqrt(torch.pow(EIKONALGRADIENT_microbatch, 2).sum(dim = 1))
# The groundtruth values are selected to be greater than some small positive value
loss = loss_fun(jac_norm, 1.0/groundtruth)
# optimization step
optimizer.zero_grad()
accelerator.backward(loss)
torch.nn.utils.clip_grad_norm_(parameters, args.grad_clip_value)
optimizer.step()
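For reference, here is a minimal self-contained sketch of the same gradient-norm computation with a toy stand-in model (the layer sizes and the eps inside the sqrt are my assumptions, not part of the original code). Since each scalar output depends only on its own input row, a single torch.autograd.grad call on out.sum() yields the per-sample input gradients:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Toy stand-in for eikonalmodel: (batch, 5) -> (batch, 1)
model = nn.Sequential(nn.Linear(5, 16), nn.Tanh(), nn.Linear(16, 1))

x = torch.randn(8, 5, requires_grad=True)   # (batch, 5) input
out = model(x)                              # (batch, 1) output
# One autograd call gives per-sample gradients, because each output row
# depends only on its own input row.
grad = torch.autograd.grad(out.sum(), x, create_graph=True)[0]  # (batch, 5)
grad_xy = grad[:, 1:3]                      # keep dims 1..2, as in the post
# Small eps keeps the sqrt differentiable when the gradient vanishes
jac_norm = torch.sqrt(torch.pow(grad_xy, 2).sum(dim=1) + 1e-12)  # (batch,)
print(jac_norm.shape)
```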
However, the gradient (or Jacobian) soon becomes inf. I have tried decreasing the learning rate and clipping the gradient norm. The results are as follows:
BATCH: 0%| | 0/3086 [00:00<?, ?it/s]
microbatches: 0%| | 0/135 [00:00<?, ?it/s]
loss -> 95178370.0 ; groundtruth--> 0.00020102841 ; jac_norm maxmin 0.039355658 0.00172582 ;EIKONALFORWARDINPUT_microbatch--> 0 ;OUT--> 0 ;max grad--> 0.0
microbatches: 1%| | 1/135 [00:11<26:23, 11.82s/it]
loss -> 447264030.0 ; groundtruth--> 0.00018992453 ; jac_norm maxmin 0.0043626004 3.0593575e-10 ;EIKONALFORWARDINPUT_microbatch--> 0 ;OUT--> 0 ;max grad--> 31.946300506591797
microbatches: 1%|▏ | 2/135 [00:14<14:09, 6.39s/it]
loss -> 189443070.0 ; groundtruth--> 0.0002351522 ; jac_norm maxmin 0.0549314 1.8008926e-05 ;EIKONALFORWARDINPUT_microbatch--> 0 ;OUT--> 0 ;max grad--> 3.616215467453003
microbatches: 2%|▏ | 3/135 [00:16<10:03, 4.57s/it]
loss -> 201413380.0 ; groundtruth--> 0.000168255 ; jac_norm maxmin 0.0391026 1.2644264e-05 ;EIKONALFORWARDINPUT_microbatch--> 0 ;OUT--> 0 ;max grad--> 50.89857864379883
microbatches: 3%|▎ | 4/135 [00:19<08:05, 3.70s/it]
loss -> 345555520.0 ; groundtruth--> 0.00017404387 ; jac_norm maxmin 7.554859 8.869787e-09 ;EIKONALFORWARDINPUT_microbatch--> 0 ;OUT--> 0 ;max grad--> 22.646554946899414
microbatches: 4%|▎ | 5/135 [00:21<06:59, 3.23s/it]
loss -> inf ; groundtruth--> 0.0003031217 ; jac_norm maxmin inf inf
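For context on the loss magnitudes in the log: with groundtruth around 2e-4, the regression target 1.0/groundtruth is around 5000, so even with jac_norm near zero the MSE is already on the order of 1e7, matching the logged values. A back-of-envelope check (the specific numbers below are illustrative, taken from the log):

```python
import torch
import torch.nn as nn

loss_fun = nn.MSELoss()
# Typical magnitudes from the log above
groundtruth = torch.full((4,), 2.0e-4)   # -> target 1/groundtruth = 5000
jac_norm = torch.full((4,), 0.01)        # typical logged jac_norm
loss = loss_fun(jac_norm, 1.0 / groundtruth)
print(loss.item())   # roughly (5000 - 0.01)^2, i.e. ~2.5e7
```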