Setting a custom parameter in my own model

I have a custom implementation of Conv2D and I want to add a custom value to this model. Essentially, I want to set this value at the model level and then have access to it within MyConv2D. I tried to do this by adding a new parameter to the model definition and then passing that value to my layer, like this:

import torch
import torch.nn as nn

class custom_model(nn.Module):
    def __init__(self):
        super(custom_model, self).__init__()
        # requires_grad=False: the value is only read, never trained
        self.custom_value = torch.nn.Parameter(torch.zeros(1), requires_grad=False)
        self.conv_1 = MyConv2D(self.custom_value.item(), 1, 4, 5, stride=1, padding=0)

Then when I load the model, I first add this new parameter to my state_dict:

copied_state_dict = torch.load(PATH, map_location=device)
# inject the new entry so load_state_dict() finds it
copied_state_dict['custom_value'] = torch.tensor([1.0])  # float, matching torch.zeros(1)

Not only does this seem like a really inelegant solution, but it also doesn’t work: because I initialize custom_value with 0, it always stays zero. And that’s the point where I’ve exhausted my knowledge of PyTorch!

So is there a nice way for me to set a property (parameter or otherwise) at a model level and have access to it in my own layer code? I am not planning to train this model. I just load weights from a pre-trained model and only run inference.

  1. I am OK with having it as a parameter. If I can directly access it within the MyConv2D layer, that’s all I need.
  2. I could also use some other property of the model (I’ve seen buffers mentioned, but I’m not sure whether that is a better solution here). Either way, I would like an easy way to set this property before I run the model, as I need to change it for each run.
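For reference, a buffer is registered on the module, travels with state_dict() and .to(device), but never receives gradients. A minimal sketch of that approach (the model name here is illustrative):

```python
import torch
import torch.nn as nn

class BufferedModel(nn.Module):  # illustrative name
    def __init__(self):
        super().__init__()
        # buffers are saved in state_dict and moved with .to(device),
        # but are not returned by .parameters(), so they are never trained
        self.register_buffer("custom_value", torch.zeros(1))

m = BufferedModel()
m.custom_value[0] = 3.0  # in-place update before a run
```

Because the buffer is part of the state_dict, it also survives save/load round trips.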


Question: Why not pass custom_value when calling self.conv_1 in custom_model's forward()? That way, you can either set custom_value before each run of forward() or parameterize forward() with custom_value directly.
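That suggestion could look roughly like this. MyConv2D here is only a stand-in for the poster's custom layer (a plain Conv2d plus a placeholder use of the value), since its real implementation isn't shown:

```python
import torch
import torch.nn as nn

class MyConv2D(nn.Module):
    """Stand-in for the custom layer: takes the value per call, not in __init__."""
    def __init__(self, in_channels, out_channels, kernel_size):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size)

    def forward(self, x, custom_value):
        # placeholder use of custom_value; the real layer would use it as needed
        return self.conv(x) + custom_value

class CustomModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv_1 = MyConv2D(1, 4, 5)

    def forward(self, x, custom_value):
        # the value is threaded through forward() instead of being
        # baked in at construction time
        return self.conv_1(x, custom_value)

m = CustomModel()
out = m(torch.randn(1, 1, 8, 8), custom_value=2.0)
```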

Then custom_value is not expected to get gradients?
If so, it doesn’t need to be nn.Parameter().

By the time you try to set custom_value, self.conv_1 has already been created with its old value (i.e., 0).

Yes, I can set it as a parameter directly, but then I would have to do this for each layer, right? That’s why I was seeing if there is a way to set something at the model level that I can access within each layer.

No, like I said in my post, I was just using a parameter since I can set it easily through the state_dict(). Again, it would be ideal if it were something I could set on the model before I run it, but it doesn’t have to be a parameter, no. As long as I can set it easily.

Yes, so the question is: is there something else I could use instead of custom_value here to pass to my MyConv2D layer? One thing you mentioned was:

Do you mean just passing an additional parameter to MyConv2D.forward()? I can do that too, but then it wouldn’t be associated with my model, right? If there isn’t a way to do this, that’s probably what I’ll end up doing.

I am not sure of your use case or why you took this route of modifying the state_dict().

There are two things that you could try:

  1. Try passing self.custom_value without .item(), i.e., pass the reference to the tensor object so that when you change its values, the change is reflected in self.conv_1.
  2. Instead of dealing with the state_dict, why not create a method to set custom_value dynamically? Set it as self.custom_value[0] = <new val> without creating a new tensor.

I tried that already, but unfortunately it then passes it as a parameter to my layer, so I would also need to add MyConv2D.custom_value, which would just be a copy of the same custom_value in each layer. That’s why I tried to pass only the value instead of the parameter.

Could you please elaborate on this a bit? So this method would just set the value I want? Where should I call it? Before I run my network? And for this, would it be OK if I made custom_value just a tensor and not a parameter (since it doesn’t need grad anyway)?

I meant something like this:
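A minimal sketch of that setter-method idea (the model and method names are illustrative); the key point is mutating the tensor in place, so any layer that was given a reference to it sees the new value:

```python
import torch
import torch.nn as nn

class CustomModel(nn.Module):  # illustrative name
    def __init__(self):
        super().__init__()
        # a plain tensor is enough, since no gradients are needed
        self.custom_value = torch.zeros(1)
        # pass the tensor itself (no .item()) so the layer shares it, e.g.:
        # self.conv_1 = MyConv2D(self.custom_value, 1, 4, 5, stride=1, padding=0)

    def set_custom_value(self, val):
        # in-place write: anything holding a reference to this tensor
        # sees the update, without creating a new tensor
        self.custom_value[0] = val

m = CustomModel()
m.set_custom_value(1.0)  # call this before each run
```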

Thank you, I think this will work for me. ~~But is there a nicer way to set `m.custom_val[0]` here? I can keep my custom_value as the first element but is there a nicer way to set this so that if `custom_value` isn’t the first element, it would still work?~~

EDIT: My mistake. I thought this accessed the first element of the model, instead of just being a way to access the tensor value!