I have a custom implementation of Conv2D and I want to add a custom value to this model. Essentially, I want to set this value at the model level and then have access to it within MyConv2D. I tried to do this by adding a new parameter to the model definition and then passing that value to my layer, like this:
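The original snippet isn't shown, but a minimal reconstruction of that kind of setup might look like this (only `MyConv2D` and `custom_value` come from the post; the class name `CustomModel`, the layer shapes, and the scaling in `forward` are assumptions for illustration):

```python
import torch
import torch.nn as nn

class MyConv2D(nn.Module):
    """Hypothetical custom Conv2D that also receives a model-level value."""
    def __init__(self, in_channels, out_channels, kernel_size, custom_value=0.0):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size)
        self.custom_value = custom_value  # plain float, fixed at construction

    def forward(self, x):
        # use custom_value somehow, e.g. scale the output
        return self.conv(x) * (1.0 + self.custom_value)

class CustomModel(nn.Module):
    def __init__(self):
        super().__init__()
        # the value is read once with .item() when conv_1 is built, so later
        # changes to the model's parameter never reach the layer
        self.custom_value = nn.Parameter(torch.zeros(1), requires_grad=False)
        self.conv_1 = MyConv2D(3, 8, 3, custom_value=self.custom_value.item())

    def forward(self, x):
        return self.conv_1(x)
```

This also shows why the value "always stays zero": the float is baked in at construction time, so updating the model's parameter afterwards (e.g. via `state_dict()`) has no effect on the layer.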
Not only does this seem like a really inelegant solution, but it also doesn't work! Because I initialize custom_value with 0, it always stays zero. And that's the point where I've exhausted my knowledge of PyTorch!
So is there a nice way for me to set a property (parameter or otherwise) at a model level and have access to it in my own layer code? I am not planning to train this model. I just load weights from a pre-trained model and only run inference.
I am OK to have it as a parameter. If I can directly access it within the MyConv2D layer, that’s all I need.
If I can use some other property of a model (I’ve seen mention of a buffer but not sure if that is a better solution here). But I would like an easy way to set this property before I run the model, as I need to change this property for each run.
Question: Why not pass custom_value when calling self.conv_1 in custom_model's forward()? That way, you can either set custom_value before each run of forward() or pass it directly as an argument to forward().
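A sketch of that suggestion, assuming MyConv2D.forward is free to take an extra argument (the shapes and the way the value is used are illustrative, not from the thread):

```python
import torch
import torch.nn as nn

class MyConv2D(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size)

    def forward(self, x, custom_value=0.0):
        # the value arrives per call instead of being stored in the layer
        return self.conv(x) * (1.0 + custom_value)

class CustomModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv_1 = MyConv2D(3, 8, 3)
        self.custom_value = 0.0  # plain attribute, set it before each run

    def forward(self, x):
        # forward threads the current value down to the layer
        return self.conv_1(x, custom_value=self.custom_value)
```

Usage would then be `model.custom_value = 2.5` followed by `model(x)`, with no state_dict manipulation involved.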
Then custom_value is not expected to get gradients?
If so, it doesn’t need to be nn.Parameter().
By the time you are trying to set custom_value, self.conv_1 has already been created with its old value (i.e., 0).
Yes, I can set it as a parameter directly, but then I would have to do this for each layer, right? That's why I was seeing if there is a way to set something at the model level that I can access within each layer.
No, like I said in my post, I was just using a parameter since I can set it easily using state_dict(). Again, it would be ideal if it is something I can set on the model before I run it. But it doesn't have to be a parameter, no. As long as I can set it easily.
Yes, so the question is: is there something else I could use instead of custom_value here to pass to my MyConv2D layer? One thing you mentioned was:
Do you mean just pass an additional argument to MyConv2D.forward()? I can do that too, but then it would not be associated with my model, right? If there isn't a way to do this, that's probably what I'll end up doing.
I am not sure of your use case and why you take this route of modifying the state_dict().
There are two things that you could try:

1. Try passing self.custom_value without .item(), i.e., pass a reference to the tensor so that when you change its values, the change is reflected in self.conv_1.
2. Instead of dealing with the state_dict, why not create a method to set custom_value dynamically, and update it as self.custom_value[0] = <new val> without creating a new tensor?
I tried that already, but unfortunately it then passes it as a parameter to my layer, so I would also need to add MyConv2D.custom_value, which would just be a copy of the same custom_value in each layer. That is why I tried to pass only the value instead of the parameter.
Could you please elaborate on this a bit? So this method would just set the value I want? Where should I call it? Before I run my network? And for this, would it be OK if I made custom_value just a tensor and not a parameter (since it doesn't need grad anyway)?
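One way those pieces could fit together is to register the value as a buffer, so it lives on the model and survives in the state_dict without being a parameter (a sketch under assumed names and shapes, not a confirmed answer from the thread):

```python
import torch
import torch.nn as nn

class MyConv2D(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, custom_value):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size)
        self.custom_value = custom_value  # shared tensor, not registered here

    def forward(self, x):
        return self.conv(x) * (1.0 + self.custom_value[0])

class CustomModel(nn.Module):
    def __init__(self):
        super().__init__()
        # a buffer: saved in state_dict and moved by .to(), but requires no grad
        self.register_buffer("custom_value", torch.zeros(1))
        self.conv_1 = MyConv2D(3, 8, 3, self.custom_value)

    def set_custom_value(self, val):
        # in-place fill so the reference shared with the layer stays valid
        self.custom_value.fill_(val)

    def forward(self, x):
        return self.conv_1(x)
```

The setter is called before each run, e.g. `model.set_custom_value(0.5); model(x)`. One caveat: moving the model to another device replaces the buffer tensor, which would break the shared reference, so this sketch assumes the model stays on one device after construction.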