How to copy custom weights tensor into model layer weights

Hello everyone,

I am trying to copy a tensor with custom weights into the weights of an AlexNet convolutional layer. I am doing it as indicated below and I don’t get any error. However, when I load the model using torch::load and check the weights, they are different from the values I saved with the torch::save function. Am I missing something?

AlexNet model;

model->named_parameters()["conv1.weights"] = tensor_with_custom_weights.clone().detach();

// if I check it here, the weights are correct (the ones I assigned)
torch::save(model, "alexnet.pt");

AlexNet model1;
torch::load(model1, "alexnet.pt");
// here the conv1 weights of model1 are random again

Thanks

After you’ve reassigned the tensor to conv1.weights, could you check whether these parameters are shown via:

for (const auto& pair : model->named_parameters()) {
  std::cout << pair.key() << ": " << pair.value() << std::endl;
}

Usually, the attribute is called weight (without the s at the end), so maybe you are just creating a new key in the map?
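Also, even with the right key, note that assigning a new tensor to the entry returned by named_parameters() might only rebind that entry in the returned OrderedDict rather than change the module’s parameter. An in-place copy should persist instead; here is a minimal sketch (assuming tensor_with_custom_weights already has the same shape as conv1.weight):

{
  // The tensors returned by named_parameters() share storage with the module,
  // so copy_() writes into the real parameter, while operator= would only
  // rebind the entry in the returned map.
  torch::NoGradGuard no_grad;  // keep this copy out of autograd
  model->named_parameters()["conv1.weight"].copy_(tensor_with_custom_weights);
}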

Sorry, that was just a typo. I meant:

model->named_parameters()["conv1.weight"] = tensor_with_custom_weights.clone().detach();

My model structure is this:

AlexNetImpl(
  (linear1): torch::nn::Linear(in_features=9216, out_features=4096, bias=true)
  (linear2): torch::nn::Linear(in_features=4096, out_features=4096, bias=true)
  (linear3): torch::nn::Linear(in_features=4096, out_features=21, bias=true)
  (dropout): torch::nn::Dropout(p=0.5, inplace=false)
  (conv1): torch::nn::Conv2d(3, 96, kernel_size=[11, 11], stride=[4, 4])
  (conv2): torch::nn::Conv2d(96, 256, kernel_size=[5, 5], stride=[1, 1], padding=[2, 2])
  (conv3): torch::nn::Conv2d(256, 384, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1])
  (conv4): torch::nn::Conv2d(384, 384, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1])
  (conv5): torch::nn::Conv2d(384, 256, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1])
  (lrnorm1): torch::nn::LocalResponseNorm(5, alpha=0.0001, beta=0.75, k=2)
)

and below is the output of that for-loop; to keep it short I print just the parameter sizes instead of the full tensor values.
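The only change to your loop is the print statement, roughly:

for (const auto& pair : model->named_parameters()) {
  std::cout << "   > " << pair.key() << " " << pair.value().sizes() << std::endl;
}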

Parameters of the model: AlexNetImpl
-----------------------------------------
   > linear1.weight [4096, 9216]
   > linear1.bias [4096]
   > linear2.weight [4096, 4096]
   > linear2.bias [4096]
   > linear3.weight [21, 4096]
   > linear3.bias [21]
   > conv1.weight [96, 3, 11, 11]
   > conv1.bias [96]
   > conv2.weight [256, 96, 5, 5]
   > conv2.bias [256]
   > conv3.weight [384, 256, 3, 3]
   > conv3.bias [384]
   > conv4.weight [384, 384, 3, 3]
   > conv4.bias [384]
   > conv5.weight [256, 384, 3, 3]
   > conv5.bias [256]

But I still don’t get the correct values after loading the model.
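This is roughly how I check them after the round trip (just a sketch; tensor_with_custom_weights is the tensor from my first post):

AlexNet model1;
torch::load(model1, "alexnet.pt");

// compare the loaded conv1 weights against the custom values I assigned
auto loaded = model1->named_parameters()["conv1.weight"];
std::cout << torch::allclose(loaded, tensor_with_custom_weights) << std::endl;  // prints 0 for me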
By the way, thanks for the help.

Hello again. I tried another model modification. I am using the torchvision AlexNet model:

AlexNetImpl(
  (features): torch::nn::Sequential(
    (0): torch::nn::Conv2d(3, 64, kernel_size=[11, 11], stride=[4, 4], padding=[2, 2])
    (1): torch::nn::Functional()
    (2): torch::nn::Functional()
    (3): torch::nn::Conv2d(64, 192, kernel_size=[5, 5], stride=[1, 1], padding=[2, 2])
    (4): torch::nn::Functional()
    (5): torch::nn::Functional()
    (6): torch::nn::Conv2d(192, 384, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1])
    (7): torch::nn::Functional()
    (8): torch::nn::Conv2d(384, 256, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1])
    (9): torch::nn::Functional()
    (10): torch::nn::Conv2d(256, 256, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1])
    (11): torch::nn::Functional()
    (12): torch::nn::Functional()
  )
  (classifier): torch::nn::Sequential(
    (0): torch::nn::Dropout(p=0.5, inplace=false)
    (1): torch::nn::Linear(in_features=9216, out_features=4096, bias=true)
    (2): torch::nn::Functional()
    (3): torch::nn::Dropout(p=0.5, inplace=false)
    (4): torch::nn::Linear(in_features=4096, out_features=4096, bias=true)
    (5): torch::nn::Functional()
    (6): torch::nn::Linear(in_features=4096, out_features=1000, bias=true)
  )
)

I load it with torch::load and modify the number of nodes in the last output layer like this:

AlexNet model_from_torchvision;
torch::load(model_from_torchvision, "alexnet.pt");

// replace the 1000-class output layer with a 12-class one
model_from_torchvision->named_modules()["classifier"]->unregister_module("6");
model_from_torchvision->named_modules()["classifier"]->register_module("6", torch::nn::Linear(4096, 12));
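To check the structure I simply stream the model (a one-line sketch):

std::cout << *model_from_torchvision << std::endl;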

The model seems to have been modified (12 nodes in the output layer):

AlexNetImpl(
  (features): torch::nn::Sequential(
    (0): torch::nn::Conv2d(3, 64, kernel_size=[11, 11], stride=[4, 4], padding=[2, 2])
    (1): torch::nn::Functional()
    (2): torch::nn::Functional()
    (3): torch::nn::Conv2d(64, 192, kernel_size=[5, 5], stride=[1, 1], padding=[2, 2])
    (4): torch::nn::Functional()
    (5): torch::nn::Functional()
    (6): torch::nn::Conv2d(192, 384, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1])
    (7): torch::nn::Functional()
    (8): torch::nn::Conv2d(384, 256, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1])
    (9): torch::nn::Functional()
    (10): torch::nn::Conv2d(256, 256, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1])
    (11): torch::nn::Functional()
    (12): torch::nn::Functional()
  )
  (classifier): torch::nn::Sequential(
    (0): torch::nn::Dropout(p=0.5, inplace=false)
    (1): torch::nn::Linear(in_features=9216, out_features=4096, bias=true)
    (2): torch::nn::Functional()
    (3): torch::nn::Dropout(p=0.5, inplace=false)
    (4): torch::nn::Linear(in_features=4096, out_features=4096, bias=true)
    (5): torch::nn::Functional()
    (6): torch::nn::Linear(in_features=4096, out_features=12, bias=true)
  )
)

Now I simply save the model and then load it into a new object.

torch::save(model_from_torchvision, "savedmodel.pt");

AlexNet model_tmp;
torch::load(model_tmp, "savedmodel.pt");

But the structure of model_tmp is not the modified one; it is the original one (with 1000 nodes in the output layer).

AlexNetImpl(
  (features): torch::nn::Sequential(
    (0): torch::nn::Conv2d(3, 64, kernel_size=[11, 11], stride=[4, 4], padding=[2, 2])
    (1): torch::nn::Functional()
    (2): torch::nn::Functional()
    (3): torch::nn::Conv2d(64, 192, kernel_size=[5, 5], stride=[1, 1], padding=[2, 2])
    (4): torch::nn::Functional()
    (5): torch::nn::Functional()
    (6): torch::nn::Conv2d(192, 384, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1])
    (7): torch::nn::Functional()
    (8): torch::nn::Conv2d(384, 256, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1])
    (9): torch::nn::Functional()
    (10): torch::nn::Conv2d(256, 256, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1])
    (11): torch::nn::Functional()
    (12): torch::nn::Functional()
  )
  (classifier): torch::nn::Sequential(
    (0): torch::nn::Dropout(p=0.5, inplace=false)
    (1): torch::nn::Linear(in_features=9216, out_features=4096, bias=true)
    (2): torch::nn::Functional()
    (3): torch::nn::Dropout(p=0.5, inplace=false)
    (4): torch::nn::Linear(in_features=4096, out_features=4096, bias=true)
    (5): torch::nn::Functional()
    (6): torch::nn::Linear(in_features=4096, out_features=1000, bias=true)
  )
)

Why could this be?
It seems that any change I make to the model from the outside is not persistent and gets lost when saving and reloading.
Thanks.

That’s interesting; it seems that maybe only the parameters were loaded? Although I would assume that this would raise an error, as you would get a shape mismatch in the last linear layer.
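If that is the case, a possible workaround (just a sketch, reusing your register_module approach and assuming torch::load only restores parameters and buffers into an already constructed module) would be to apply the same structural change to the fresh module before loading, so that the shapes line up:

AlexNet model_tmp;

// rebuild the modified 12-class head first ...
model_tmp->named_modules()["classifier"]->unregister_module("6");
model_tmp->named_modules()["classifier"]->register_module("6", torch::nn::Linear(4096, 12));

// ... and only then load the saved parameters into the matching structure
torch::load(model_tmp, "savedmodel.pt");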

Note that I don’t use the C++ frontend for training, so I’m not sure how the serialization works exactly, and I usually just refer to the tests to see proper usage.
However, I cannot find a comparable test for this fine-tuning use case.

I think @yf225 would know what’s going on. :slight_smile: