How to set the value of a model parameter

I want to modify the values of a model parameter, test the accuracy, and then restore the original values:

# save the original parameter values (assumes `model` is a network with a conv1 layer)
orig_params = []
for n, p in model.named_parameters():
    if n == 'conv1.weight':
        print('\nInitial values:\np: {}'.format(p.detach().cpu().numpy()[0, 0, :1]))
        print('conv1.weight.data: {}'.format(model.conv1.weight.data.cpu().numpy()[0, 0, :1]))
    orig_params.append(p.detach().clone())

# modify conv1 weights: add uniform noise in [-0.05, 0.05]
model.conv1.weight.data.add_(torch.empty_like(model.conv1.weight).uniform_(-0.05, 0.05))

# restore the original values:
for orig_p, (n, p) in zip(orig_params, model.named_parameters()):
    if n == 'conv1.weight':
        print('\nAfter modifying conv1.weight.data:\np: {}'.format(p.detach().cpu().numpy()[0, 0, :1]))
        print('conv1.weight.data: {}'.format(model.conv1.weight.data.cpu().numpy()[0, 0, :1]))
        print('orig_p: {}'.format(orig_p.cpu().numpy()[0, 0, :1]))

    p = orig_p.clone()

    if n == 'conv1.weight':
        print('\nAfter restoring p:\np: {}'.format(p.detach().cpu().numpy()[0, 0, :1]))
        print('conv1.weight.data: {}'.format(model.conv1.weight.data.cpu().numpy()[0, 0, :1]))
        print('orig_p: {}'.format(orig_p.cpu().numpy()[0, 0, :1]))

Note that to modify the weights of the conv1 layer I’m using model.conv1.weight.data, but to restore the original values I’m using the corresponding element yielded by model.named_parameters().

For some reason, this does not work:

Initial values:
p:                    [[-0.045 -0.048 -0.05  -0.089 -0.085]]
conv1.weight.data:    [[-0.045 -0.048 -0.05  -0.089 -0.085]]

After modifying conv1.weight.data:
p:                    [[-0.019 -0.014 -0.057 -0.056 -0.099]]
conv1.weight.data:    [[-0.019 -0.014 -0.057 -0.056 -0.099]]
orig_p:               [[-0.045 -0.048 -0.05  -0.089 -0.085]]

After restoring p:
p:                    [[-0.045 -0.048 -0.05  -0.089 -0.085]]
conv1.weight.data:    [[-0.019 -0.014 -0.057 -0.056 -0.099]]
orig_p:               [[-0.045 -0.048 -0.05  -0.089 -0.085]]

You can see that even though p and model.conv1.weight should point to the same memory (and, indeed, changing conv1.weight.data does change p.data), assigning to p does not change conv1.weight.data. I’d like to understand what’s going on here.


Hello,

.clone() is not a shared-memory operation; it returns a new tensor with its own storage.
In the first for loop you clone conv1.weight into orig_params. At that point conv1.weight and p refer to the same memory, while each entry of orig_params owns separate memory.
Modifying conv1.weight.data between the loops therefore changes the value of p as well, because they share the same memory.
In the second loop, before the assignment p and conv1.weight still share the same memory. But p = orig_p.clone() rebinds the name p to a brand-new tensor, so afterwards p and conv1.weight no longer use the same memory, and modifying p will not influence conv1.weight's value.
Additionally, you could add a p is model.conv1.weight check in each if-clause to verify whether they refer to the same object. I have checked that:

# the first if
p is model.conv1.weight -> True
# the second if
p is model.conv1.weight -> True
# the third if (after p = orig_p.clone())
p is model.conv1.weight -> False
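
The same behaviour can be reproduced without a model at all. Here is a minimal sketch (my own illustration, not code from the thread) of the difference between an in-place write to shared storage and a plain Python assignment that rebinds a name:

import torch

a = torch.zeros(3)
b = a              # b and a are the same tensor object, same storage
b.add_(1.0)        # in-place op: writes into the shared storage
print(a)           # tensor([1., 1., 1.]) -- a sees the change

b = torch.ones(3)  # plain assignment: rebinds the *name* b only
print(a)           # tensor([1., 1., 1.]) -- a is untouched
print(a is b)      # False -- b now refers to a different tensor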

Thanks, but that’s pretty much what I’m trying to understand: why does changing p, which is a model parameter and the same object as model.conv1.weight, not change model.conv1.weight?

Why is there a difference between assigning something to p and assigning something to model.conv1.weight?

Because p = orig_p.clone() in the second loop is a plain Python assignment: it doesn’t write into the tensor that p used to refer to, it rebinds the local name p to a new tensor (the clone of orig_p). After that, p and model.conv1.weight point to different memory, so modifying p doesn’t influence the value of model.conv1.weight.

But at the modification step between the two loops, add_() is an in-place operation, and p and model.conv1.weight still point to the same memory, so changing the value of model.conv1.weight changes the value of p as well.
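
To actually restore the saved values, the write has to go into the parameter’s existing storage instead of rebinding the loop variable. A sketch of one way to do it, under the assumptions of the original snippet (orig_params was filled in model.named_parameters() order); copy_ mutates the tensor in place, and the no_grad guard keeps autograd from tracking the restore:

import torch

with torch.no_grad():
    for orig_p, (n, p) in zip(orig_params, model.named_parameters()):
        p.copy_(orig_p)  # in-place: writes into the storage that model.conv1.weight also refers to

After this loop, model.conv1.weight.data shows the original values again, because copy_() fills the existing storage rather than creating a new tensor.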