Loss.backward()

class Mymodel(nn.Module):
    def __init__(self):
        super(Mymodel, self).__init__()

        self.conv1 = nn.Conv2d(...)
        self.conv2 = Myconv(...)

    def forward(self, x, conv2=False):
        if conv2:
            self.conv2.weight.data = self.conv1.weight.data
            out = self.conv2(x)
        else:
            out = self.conv1(x)
        return out

model = Mymodel()
output = model(input, conv2=True)
In the above example, I want to do a forward pass in which my input is fed to the conv2 instance, and in the backward pass the weights for conv1 need to be updated. But when I do loss.backward(), it actually goes through the backward function for conv2; instead, I want it to go through the backward function for conv1.

Is it possible that when I do loss.backward() it goes through the backward function for conv1?

Hi,

You should not use .data anymore. No valid use case for it exists.

it actually goes through the backward function for conv2; instead, I want it to go through the backward function for conv1.

The backward will match what happened during the forward pass. So if conv2 was used, it will be used during the backward.
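To make that concrete, here is a minimal sketch (the layer sizes are made up): only the module that actually ran in the forward receives a gradient.

import torch
import torch.nn as nn

conv1 = nn.Conv2d(3, 8, kernel_size=3)
conv2 = nn.Conv2d(3, 8, kernel_size=3)

x = torch.randn(1, 3, 16, 16)
out = conv2(x)        # only conv2 participates in the forward graph
out.sum().backward()

print(conv1.weight.grad)              # None: conv1 never entered the graph
print(conv2.weight.grad is not None)  # True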


Hi @albanD
Thanks for the reply!

conv2 (Myconv) is a class I created using the “Function” class in PyTorch. Below is an example:

class Conv2d_mvm_function(Function):

    @staticmethod
    def forward(ctx, input, weight, bias):
        ...
        ...
        ctx.save_for_backward(input, weight, bias)

    @staticmethod
    def backward(ctx, grad_output):
        input, weight, bias = ctx.saved_tensors
        ...
So when I do loss.backward(), it gets to the line “ctx.saved_tensors” in my backward function, but I keep getting this error: “*** RuntimeError: No grad accumulator for a saved leaf!”

What is the reason for this error?
The requires_grad for input and weight is True.

Also, I am using double backward. Is there any problem with double backward?

What is another way, other than using .data, to make the weights for self.conv2 equal to the weights of self.conv1?

No grad accumulator for a saved leaf!

How do you get this error? Do you still use .data?
Do you always have a bias? Does it require gradients? Do you return a gradient for it in the backward all the time?

What is another way, other than using .data, to make the weights for self.conv2 equal to the weights of self.conv1?

Is there a reason why they are two different Tensors? If you just want them to have the same weight Tensor, just do that in the __init__ by creating MyConv with self.conv1.weight, or by passing it to your custom forward directly.
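A minimal sketch of the second suggestion (the layer sizes are placeholders, and Conv2d_mvm_function is assumed to be the custom Function from the post above): pass conv1's Parameter straight into the custom op, so only one weight tensor ever exists.

import torch.nn as nn

class Mymodel(nn.Module):
    def __init__(self):
        super(Mymodel, self).__init__()
        self.conv1 = nn.Conv2d(3, 8, kernel_size=3)

    def forward(self, x, use_custom=False):
        if use_custom:
            # conv1.weight is an input to the custom op, so loss.backward()
            # accumulates the gradient directly into conv1.weight.grad.
            return Conv2d_mvm_function.apply(x, self.conv1.weight, self.conv1.bias)
        return self.conv1(x)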

“No grad accumulator for a saved leaf!”
I solved this error. I set requires_grad=False for the parameters for which I didn’t want to calculate gradients.

“What is another way, other than using .data, to make the weights for self.conv2 equal to the weights of self.conv1?”
I have two tensors because I train with nn.Conv2d and I run inference with my own conv implementation. What is the most efficient way for conv2 and conv1 to share weights across all epochs? Basically, a change in conv1’s weight should be reflected in conv2’s weight as well. I tried using the line below in my __init__:

self.conv2.weight.data = self.conv1.weight.detach()

But I think this only makes the weights equal when __init__ is called while initialising the model. Changes to conv1.weight won’t be reflected in conv2’s weights in subsequent epochs.

One way is to make the weights of conv2 and conv1 equal in the forward function. But is there some other way to do so?

As I mentioned above, you just want a single Parameter for both to make sure they remain the same.
Doing self.conv2.weight = self.conv1.weight will do that.
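For example (sizes made up), after that assignment both modules hold the very same Parameter object:

import torch.nn as nn

conv1 = nn.Conv2d(3, 8, kernel_size=3)
conv2 = nn.Conv2d(3, 8, kernel_size=3)
conv2.weight = conv1.weight  # both modules now reference one Parameter

print(conv2.weight is conv1.weight)  # True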

Is there a better way to set requires_grad=False for all the parameters of conv2 in the __init__ method itself?

Currently I am doing it by individually setting requires_grad=False for each parameter of the layer, like:

self.conv2.weight.requires_grad = False
self.bn2.weight.requires_grad = False
…
…

You can call mod.requires_grad_(False) on an nn.Module to set the requires_grad field of all of its parameters to False.
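A small sketch of that (the layer sizes are placeholders, and bn2 mirrors the bn2 mentioned above):

import torch.nn as nn

class Mymodel(nn.Module):
    def __init__(self):
        super(Mymodel, self).__init__()
        self.conv1 = nn.Conv2d(3, 8, kernel_size=3)
        self.conv2 = nn.Conv2d(3, 8, kernel_size=3)
        self.bn2 = nn.BatchNorm2d(8)
        # One call per submodule freezes all of its parameters.
        self.conv2.requires_grad_(False)
        self.bn2.requires_grad_(False)

model = Mymodel()
print(all(not p.requires_grad for p in model.conv2.parameters()))  # True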

Hey @albanD
Will the line below allocate new memory for the weights of self.conv2, or will it just use the weights from the memory where self.conv1’s weight is stored?
self.conv2.weight = self.conv1.weight

No new memory is allocated: this just sets the .weight attribute to the Parameter contained in the other module, so they will point to the same Tensor.

So basically, if either of them (conv1.weight or conv2.weight) is updated, will it be reflected in the other instance?

Yes it will be reflected.
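A quick way to check this (made-up sizes; the optimizer only sees conv1’s parameters, yet conv2 sees the update because it is the same tensor):

import torch
import torch.nn as nn

conv1 = nn.Conv2d(3, 8, kernel_size=3)
conv2 = nn.Conv2d(3, 8, kernel_size=3)
conv2.weight = conv1.weight

opt = torch.optim.SGD(conv1.parameters(), lr=0.1)
conv1(torch.randn(1, 3, 16, 16)).sum().backward()
opt.step()  # updates conv1.weight in place

print(conv2.weight is conv1.weight)             # still True after the update
print(torch.equal(conv2.weight, conv1.weight))  # True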


@albanD would both of the options below make conv1 and conv2 share the same weights?
What I observed is that with Option 2, when I update conv1, the change is not reflected in conv2.
I am not sure what the reason for that is.

Option 1:

class Mymodel1(nn.Module):
    def __init__(self):
        super(Mymodel1, self).__init__()

        self.conv1 = nn.Conv2d(...)
        self.conv2 = nn.Conv2d(...)
        self.conv2.weight = self.conv1.weight

Option 2:

class Mymodel2(nn.Module):
    def __init__(self):
        super(Mymodel2, self).__init__()

        self.conv1 = nn.Conv2d(...)
        self.conv2 = nn.Conv2d(...)

mymodel = Mymodel2()

mymodel.conv2.weight = mymodel.conv1.weight

In Option 2, doing mymodel.conv2.weight.data = mymodel.conv1.weight actually achieved what I was able to do with Option 1. But I am still curious why mymodel.conv2.weight = mymodel.conv1.weight wouldn’t work in Option 2.

It should.
Can you share a small script that shows how the two give different behavior?
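A possible starting point for such a script (a hedged sketch with made-up sizes); both options should report that the two modules hold the same Parameter:

import torch.nn as nn

class Mymodel1(nn.Module):
    def __init__(self):
        super(Mymodel1, self).__init__()
        self.conv1 = nn.Conv2d(3, 8, kernel_size=3)
        self.conv2 = nn.Conv2d(3, 8, kernel_size=3)
        self.conv2.weight = self.conv1.weight  # Option 1: share in __init__

class Mymodel2(nn.Module):
    def __init__(self):
        super(Mymodel2, self).__init__()
        self.conv1 = nn.Conv2d(3, 8, kernel_size=3)
        self.conv2 = nn.Conv2d(3, 8, kernel_size=3)

m1 = Mymodel1()
m2 = Mymodel2()
m2.conv2.weight = m2.conv1.weight  # Option 2: share after construction

print(m1.conv2.weight is m1.conv1.weight)  # True
print(m2.conv2.weight is m2.conv1.weight)  # True
print(m2.conv2.weight.data_ptr() == m2.conv1.weight.data_ptr())  # True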