Hi guys. Let’s say I have two one-layer networks, w1 and w2, with the same shape. I want the classification rule to be (w1 \odot w2)^T x, where \odot just means element-wise multiplication. My question is: how can I write such code so that PyTorch can still use autograd to update the weights? Thanks.
Hi Wasabi!
Just write the code in a straightforward way (using pytorch tensor operations) and autograd will work, as long as w1 and w2 have requires_grad = True.
Here’s an example script:
import torch
torch.__version__

l1 = torch.nn.Linear (3, 1)
l2 = torch.nn.Linear (3, 1)
w = l1.weight * l2.weight        # element-wise product of the two weight tensors
input = torch.randn (3)
loss = torch.matmul (w, input)   # (w1 \odot w2)^T x
loss
loss.backward()                  # gradients flow back to both weight tensors
l1.weight.grad
l2.weight.grad
Here is the output:
>>> import torch
>>> torch.__version__
'1.6.0'
>>>
>>> l1 = torch.nn.Linear (3, 1)
>>> l2 = torch.nn.Linear (3, 1)
>>> w = l1.weight * l2.weight
>>> input = torch.randn (3)
>>> loss = torch.matmul (w, input)
>>> loss
tensor([-0.2095], grad_fn=<MvBackward>)
>>> loss.backward()
>>> l1.weight.grad
tensor([[-0.1094, -0.3962, -0.3002]])
>>> l2.weight.grad
tensor([[-0.1291, 0.2055, -0.4207]])
(The weight property of a Linear has requires_grad = True by default, and * performs element-wise multiplication on tensors.)
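If you also want to actually update the weights, you can hand both layers’ parameters to an optimizer and call step() after backward(). Here is a minimal sketch of one training step; the choice of SGD, the learning rate, the dummy target, and mse_loss are just illustrative assumptions, not part of the example above:

import torch

l1 = torch.nn.Linear (3, 1)
l2 = torch.nn.Linear (3, 1)

# optimize the parameters of both layers so that each weight tensor gets updated
opt = torch.optim.SGD (list (l1.parameters()) + list (l2.parameters()), lr = 0.1)

input = torch.randn (3)
target = torch.tensor ([1.0])            # dummy target, just for illustration

opt.zero_grad()
w = l1.weight * l2.weight                # element-wise product of the two weight tensors
pred = torch.matmul (w, input)           # (w1 \odot w2)^T x  (the biases are unused here)
loss = torch.nn.functional.mse_loss (pred, target)
loss.backward()                          # fills l1.weight.grad and l2.weight.grad
opt.step()                               # applies the update to both weight tensors

Note that w has to be recomputed from l1.weight and l2.weight inside every training iteration, since it is a fresh tensor produced by the multiplication rather than a parameter itself.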
Best.
K. Frank