Have you tried wrapping the update function with torch.no_grad()?
def update(self, xyz):
    with torch.no_grad():
        trans = self.trans.repeat(10, 1)
        new = xyz + trans
        return new
...
with torch.no_grad():
    mode.xyz = tsf.update(mode.xyz)
Anytime you’re performing changes to trainable parameters, it’s a good idea to do so outside of autograd tracking.
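For example, a minimal sketch of that pattern (using a throwaway parameter, not your actual model), where the in-place write happens under torch.no_grad():

import torch
import torch.nn as nn

p = nn.Parameter(torch.zeros(3))
with torch.no_grad():
    # in-place write; p stays an nn.Parameter and autograd never sees the change
    p.copy_(torch.tensor([0.1, 0.2, 0.3]))
print(p)  # values updated, still a leaf with requires_grad=True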
Thank you for your suggestion! I tried that, but it hasn’t been effective. I just tried again and I still get the same error.
import torch
import torch.nn as nn

pos = [0.1, 0.2, 0.3]
pos_tens = torch.Tensor(pos)
pos_pm = nn.Parameter(pos_tens, requires_grad=True)

class Transformation(nn.Module):
    def __init__(self, trans=None):
        super(Transformation, self).__init__()
        self.trans = nn.Parameter(trans, requires_grad=True)

    def update(self, xyz):
        with torch.no_grad():
            trans = self.trans.repeat(10, 1)
            new = xyz + trans
            return new

class Onemodel(nn.Module):
    def __init__(self, xyz=None):
        super(Onemodel, self).__init__()
        self.xyz = nn.Parameter(xyz, requires_grad=True)

mode = Onemodel(pos_tens.repeat(10, 1))
tsf = Transformation(pos_tens)

with torch.no_grad():
    mode.xyz = tsf.update(mode.xyz)

y = mode.xyz.sum()
y.backward()
Also, I want to keep the gradient of the variable tsf.trans; will this affect it?
It’s really strange to try to update trainable parameters while autograd is tracking the operation. The whole point of autograd is to compute the gradients that update those parameters via gradient descent, i.e. to nudge them in a direction that brings the outputs closer to a specific target. But it looks like you’re updating them manually and then expecting autograd to track those manual updates.
Additionally, the specific actions you’re taking in this example won’t result in any gradients.
Anyways, to your question: just remove the with torch.no_grad(): if you still want those operations tracked by autograd. Although I can’t guarantee it won’t cause other errors.
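As a quick illustration of what no_grad does there (again with a throwaway parameter, not your model): anything computed inside the block is detached, so no gradient can ever flow back through it.

import torch
import torch.nn as nn

p = nn.Parameter(torch.ones(3))
with torch.no_grad():
    out = p * 2
print(out.requires_grad)   # False: out is disconnected from p in the graph
out2 = p * 2               # the same computation outside no_grad
print(out2.requires_grad)  # True: gradients can flow back to p through out2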
Oh, I understand what you mean. In fact, this may be a problem with how my code is organized. My two models correspond to the position A of the object and the offset B of the object, respectively, and I want to optimize both A and B. However, when rendering the entire scene, I first need to compute A = A + B, then compute the loss (and afterwards restore A = A - B).
So in fact, I should use a new variable to represent the rendered position of the object, rather than modifying A directly. (But if I could modify A directly, the code would be much easier to write, so I’d prefer to be able to do that.)
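For example, a rough sketch of that “new variable” idea (A, B and the sum() are only placeholders for my real positions, offset and rendering loss):

import torch
import torch.nn as nn

A = nn.Parameter(torch.rand(10, 3))  # object positions (placeholder)
B = nn.Parameter(torch.rand(1, 3))   # object offset (placeholder)

rendered = A + B       # new tensor; A itself is never modified
loss = rendered.sum()  # stand-in for the real rendering loss
loss.backward()
print(A.grad is not None, B.grad is not None)  # True True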
So, returning to the previous question: I tried commenting the with torch.no_grad() out, but it still didn’t work.
import torch
import torch.nn as nn

pos = [0.1, 0.2, 0.3]
pos_tens = torch.Tensor(pos)
pos_pm = nn.Parameter(pos_tens, requires_grad=True)

class Transformation(nn.Module):
    def __init__(self, trans=None):
        super(Transformation, self).__init__()
        self.trans = nn.Parameter(trans, requires_grad=True)

    def update(self, xyz):
        # with torch.no_grad():
        trans = self.trans.repeat(10, 1)
        new = xyz + trans
        return new

class Onemodel(nn.Module):
    def __init__(self, xyz=None):
        super(Onemodel, self).__init__()
        self.xyz = nn.Parameter(xyz, requires_grad=True)

mode = Onemodel(pos_tens.repeat(10, 1))
tsf = Transformation(pos_tens)

# with torch.no_grad():
mode.xyz.data = tsf.update(mode.xyz)

y = mode.xyz.sum()
y.backward()
print(tsf.trans.grad)  # None, but I hope it is not None
First, you need to define an optimizer so the gradients can actually be used to update the parameters. After defining the models, you can call something like this:
optimizer = torch.optim.Adam(mode.parameters(), lr = 0.001)
optimizer2 = torch.optim.Adam(tsf.parameters(), lr = 0.001)
Second, what you’re doing to the model is not going to produce any gradients. Manually changing the trainable weights (for example through .data) is not an operation that autograd tracks.
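For instance, a tiny illustration with made-up parameters: a write that goes through .data never enters the graph, so nothing can flow back through it.

import torch
import torch.nn as nn

a = nn.Parameter(torch.ones(3))
b = nn.Parameter(torch.ones(3))
a.data = a.data + b.data  # manual update through .data; autograd records nothing
a.sum().backward()
print(a.grad)  # tensor([1., 1., 1.]), from the sum
print(b.grad)  # None: the manual update was never tracked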
I will modify your example to something that does result in gradients.
import torch
import torch.nn as nn

pos = [0.1, 0.2, 0.3]
pos_tens = torch.Tensor(pos)
pos_pm = nn.Parameter(pos_tens, requires_grad=True)

class Transformation(nn.Module):
    def __init__(self, trans=None):
        super(Transformation, self).__init__()
        self.trans = nn.Parameter(trans, requires_grad=True)

    def update(self, xyz):
        trans = self.trans.repeat(10, 1)
        new = xyz + trans
        return new

    def forward(self, x):  # define a forward pass
        x = x * self.trans + self.trans
        return x

class Onemodel(nn.Module):
    def __init__(self, xyz=None):
        super(Onemodel, self).__init__()
        self.xyz = nn.Parameter(xyz, requires_grad=True)

    def forward(self, x):  # define a forward pass
        x = x @ self.xyz.T
        return x

mode = Onemodel(pos_tens.repeat(10, 1))
tsf = Transformation(pos_tens)

# define the optimizer(s)
optimizer = torch.optim.Adam(mode.parameters(), lr=0.001)
optimizer2 = torch.optim.Adam(tsf.parameters(), lr=0.001)

mode.xyz.data = tsf.update(mode.xyz)

# create some data to run through the models
dummy_data = torch.rand((32, 3))
x = tsf(dummy_data)
x = mode(x)

# define targets to compare with the outputs
targets = torch.randint(0, 1, (32, 10)).float()

# define a loss function
loss_function = nn.L1Loss()

# calculate the loss
loss = loss_function(x, targets)
loss.backward()

print(tsf.trans.grad)
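From there, a typical next step (not shown above, but continuing the same example) would be to let the optimizers apply those gradients and then clear them before the next iteration:

optimizer.step()   # updates mode.xyz using mode.xyz.grad
optimizer2.step()  # updates tsf.trans using tsf.trans.grad
optimizer.zero_grad()
optimizer2.zero_grad()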
Thank you very much for your help. Comparing my code with yours, I found some issues in mine (such as the Transformation class not having a forward function). I also feel that I need to learn more of the PyTorch basics rather than jumping straight into modifying code. Thank you very, very, very much!
Also, if you want to understand more about the inner workings of the math involved under the hood, there is a good YouTube series titled “Neural Networks from Scratch” by sentdex.