Updating a model's parameter raises an error

This is the code that produces the error:

import torch
import torch.nn as nn

pos = [0.1, 0.2, 0.3]
pos_tens = torch.Tensor(pos)
pos_pm = nn.Parameter(pos_tens, requires_grad=True)

class Transformation(nn.Module):
    def __init__(self, trans=None):
        super(Transformation, self).__init__()
        self.trans = nn.Parameter(trans, requires_grad=True)
    def update(self, xyz):
        trans = self.trans.repeat(10, 1)
        new = xyz + trans
        return new
class Onemodel(nn.Module):
    def __init__(self, xyz=None):
        super(Onemodel, self).__init__()
        self.xyz = nn.Parameter(xyz, requires_grad=True)

mode = Onemodel(pos_tens.repeat(10, 1))
tsf = Transformation(pos_tens)
mode.xyz = tsf.update(mode.xyz)  # this assignment raises the TypeError below
y = mode.xyz.sum()
y.backward()

When I run it, I get this error:

Traceback (most recent call last):
  File "/irip/yuanxuening_2022/workspace/gsgen-multi/test.py", line 22, in <module>
    mode.xyz = tsf.update(mode.xyz)
  File "/irip/yuanxuening_2022/anaconda3/envs/gsgen/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1635, in __setattr__
    raise TypeError("cannot assign '{}' as parameter '{}' "
TypeError: cannot assign 'torch.FloatTensor' as parameter 'xyz' (torch.nn.Parameter or None expected)

I don’t understand why this error occurs, and I have no idea how to correct it.

Have you tried wrapping the body of update in with torch.no_grad():?

def update(self, xyz):
    with torch.no_grad():
        trans = self.trans.repeat(10, 1)
        new = xyz + trans
    return new

...
with torch.no_grad():
    mode.xyz = tsf.update(mode.xyz)

Any time you’re making manual changes to trainable parameters, it’s a good idea to do so outside of autograd tracking.
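For instance, a common pattern for manual parameter updates (a generic sketch, with model standing in for any nn.Module) looks like this:

with torch.no_grad():
    for p in model.parameters():
        # in-place change; autograd never records it
        p += 0.01 * torch.randn_like(p)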


Thank you for your suggestion! I tried that approach, but it did not help; I just tried again and got the same error.

import torch
import torch.nn as nn

pos = [0.1, 0.2, 0.3]
pos_tens = torch.Tensor(pos)
pos_pm = nn.Parameter(pos_tens, requires_grad=True)

class Transformation(nn.Module):
    def __init__(self, trans=None):
        super(Transformation, self).__init__()
        self.trans = nn.Parameter(trans, requires_grad=True)
    def update(self, xyz):
        with torch.no_grad():
            trans = self.trans.repeat(10, 1)
            new = xyz + trans
        return new
class Onemodel(nn.Module):
    def __init__(self, xyz=None):
        super(Onemodel, self).__init__()
        self.xyz = nn.Parameter(xyz, requires_grad=True)

mode = Onemodel(pos_tens.repeat(10, 1))
tsf = Transformation(pos_tens)
with torch.no_grad():
    mode.xyz = tsf.update(mode.xyz)
y = mode.xyz.sum()
y.backward()

Also, I want the parameter tsf.trans to still receive a gradient. Will this approach affect that?

Okay, try this:

with torch.no_grad():
    mode.xyz.data = tsf.update(mode.xyz)  # <<< add .data
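For context, the original TypeError happens because nn.Module.__setattr__ refuses to replace an attribute that is registered as an nn.Parameter with a plain tensor. Assigning to .data (or re-wrapping in nn.Parameter) sidesteps that check. A minimal sketch of the rule:

# mode.xyz is registered as a parameter, so a plain tensor is rejected:
# mode.xyz = torch.zeros(10, 3)              # TypeError, as in your traceback
mode.xyz = nn.Parameter(torch.zeros(10, 3))  # OK: registers a new parameter
mode.xyz.data = torch.zeros(10, 3)           # OK: swaps only the underlying data

Note that re-wrapping creates a brand-new parameter object, so an optimizer created earlier would no longer point at it; the .data route keeps the same object.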

Thank you for your help! That does avoid the error, but I also want the parameter tsf.trans to receive a gradient, and with this approach it does not.

print(tsf.trans.grad)  # prints None, but I want it to be non-None

It’s really strange to try to update trainable parameters while autograd is tracking them. The whole point of autograd is to compute the gradients an optimizer then uses to update those parameters via gradient descent, i.e. to nudge them in a direction that moves the outputs closer to a target. But it looks like you’re updating them manually and then wanting those manual updates tracked by autograd.

Additionally, the specific actions you’re taking in this example won’t result in any gradients.

Anyway, to your question: just remove the with torch.no_grad(): if you still want those operations tracked in autograd. Although I can’t guarantee it won’t cause other errors.

Oh, I understand what you mean. This may actually be a problem with how my code is organized. My two models correspond to an object’s position A and its offset B, and I want to optimize both A and B. However, when rendering the whole scene I first need A = A + B, then compute the loss (and afterwards restore A = A - B).

So I should really use a new variable to represent the object’s position rather than modifying A directly. (But if I could modify A directly, the code would be much simpler, so that is what I was hoping to do.)
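In other words, the clean approach would be to compute the combined position as a new tensor (just a sketch of what I mean):

# compute the transformed position without overwriting the parameter;
# both mode.xyz (A) and tsf.trans (B) stay leaf parameters,
# so both would receive gradients
rendered_xyz = mode.xyz + tsf.trans.repeat(10, 1)
y = rendered_xyz.sum()
y.backward()
print(tsf.trans.grad)  # not None here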

So, returning to the previous question: I tried commenting the with torch.no_grad(): out, but it still didn’t work.

import torch
import torch.nn as nn

pos = [0.1, 0.2, 0.3]
pos_tens = torch.Tensor(pos)
pos_pm = nn.Parameter(pos_tens, requires_grad=True)

class Transformation(nn.Module):
    def __init__(self, trans=None):
        super(Transformation, self).__init__()
        self.trans = nn.Parameter(trans, requires_grad=True)
    def update(self, xyz):
        # with torch.no_grad():
        trans = self.trans.repeat(10, 1)
        new = xyz + trans
        return new
class Onemodel(nn.Module):
    def __init__(self, xyz=None):
        super(Onemodel, self).__init__()
        self.xyz = nn.Parameter(xyz, requires_grad=True)

mode = Onemodel(pos_tens.repeat(10, 1))
tsf = Transformation(pos_tens)
# with torch.no_grad():
mode.xyz.data = tsf.update(mode.xyz)
y = mode.xyz.sum()
y.backward()

print(tsf.trans.grad)  # prints None, but I want it to be non-None

There are two reasons why.

First, you need an optimizer if you want the computed gradients to actually update the parameters: loss.backward() only fills in each parameter’s .grad, and optimizer.step() is what applies it. After defining the models, you can create one per model like this:

optimizer = torch.optim.Adam(mode.parameters(), lr = 0.001)
optimizer2 = torch.optim.Adam(tsf.parameters(), lr = 0.001)

Second, what you’re doing to the model won’t produce any gradients for tsf.trans: assigning through .data bypasses autograd entirely, so manual changes to the trainable weights are never recorded in the graph.
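To make that concrete, here is a minimal sketch showing that an update routed through .data never enters the graph:

a = nn.Parameter(torch.ones(3))
b = nn.Parameter(torch.ones(3))
a.data = a + b        # the values change, but autograd never records the addition
a.sum().backward()
print(b.grad)         # None: b never entered the graph
print(a.grad)         # tensor([1., 1., 1.]): a is still a leaf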

I will modify your example to something that does result in gradients.

import torch
import torch.nn as nn

pos = [0.1, 0.2, 0.3]
pos_tens = torch.Tensor(pos)
pos_pm = nn.Parameter(pos_tens, requires_grad=True)

class Transformation(nn.Module):
    def __init__(self, trans=None):
        super(Transformation, self).__init__()
        self.trans = nn.Parameter(trans, requires_grad=True)

    def update(self, xyz):
        trans = self.trans.repeat(10, 1)
        new = xyz + trans
        return new

    def forward(self, x): #define a forward pass
        x = x*self.trans+self.trans
        return x

class Onemodel(nn.Module):
    def __init__(self, xyz=None):
        super(Onemodel, self).__init__()
        self.xyz = nn.Parameter(xyz, requires_grad=True)

    def forward(self, x): #define a forward pass
        x = x@self.xyz.T
        return x

mode = Onemodel(pos_tens.repeat(10, 1))
tsf = Transformation(pos_tens)

#define the optimizer(s)
optimizer = torch.optim.Adam(mode.parameters(), lr = 0.001)
optimizer2 = torch.optim.Adam(tsf.parameters(), lr = 0.001)

mode.xyz.data = tsf.update(mode.xyz)

#create some data to run through the models
dummy_data = torch.rand((32, 3))
x = tsf(dummy_data)
x = mode(x)

#define targets to compare with the outputs
targets = torch.randint(0, 2, (32, 10)).float()  # random 0/1 targets (randint's upper bound is exclusive)

#define a loss function
loss_function = nn.L1Loss()

#calculate the loss
loss = loss_function(x, targets)

loss.backward()

print(tsf.trans.grad)
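If you want those gradients actually applied to the parameters, the usual follow-up (sketched here, reusing the names defined above) is:

# one full optimization step for both models
optimizer.zero_grad()
optimizer2.zero_grad()
out = mode(tsf(dummy_data))
loss = loss_function(out, targets)
loss.backward()
optimizer.step()   # applies the gradient to mode.xyz
optimizer2.step()  # applies the gradient to tsf.trans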

Thank you very much for your help! Comparing my code with yours, I found some issues in mine (for example, the Transformation class had no forward function). I also realize I should learn more of the PyTorch basics rather than jumping straight into modifying code. Thank you very, very much!

Glad it helped. I suggest taking about an hour to run through this set of tutorials to get started:

https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html

From there, you can find more mid-level and advanced tutorials here:

https://pytorch.org/tutorials/

Also, if you want to understand more about the math under the hood, there is a good YouTube series titled “Neural Networks from Scratch” by sentdex.