Hi guys. Let’s say I have two one-layer networks, w1 and w2, with the same shape. I want the classification rule to be (w1 \odot w2)^T x, where \odot just means element-wise multiplication. My question is: how can I write such code so that PyTorch can still use autograd to update the weights? Thanks.
Hi Wasabi!
Just write the code in a straightforward way (using pytorch tensor operations) and autograd will work, as long as w1 and w2 have requires_grad = True.
Here’s an example script:
import torch
torch.__version__

l1 = torch.nn.Linear (3, 1)
l2 = torch.nn.Linear (3, 1)
w = l1.weight * l2.weight        # element-wise product of the two weight tensors
input = torch.randn (3)
loss = torch.matmul (w, input)   # (w1 \odot w2)^T x
loss
loss.backward()                  # gradients flow back to both weight tensors
l1.weight.grad
l2.weight.grad
Here is the output:
>>> import torch
>>> torch.__version__
'1.6.0'
>>>
>>> l1 = torch.nn.Linear (3, 1)
>>> l2 = torch.nn.Linear (3, 1)
>>> w = l1.weight * l2.weight
>>> input = torch.randn (3)
>>> loss = torch.matmul (w, input)
>>> loss
tensor([-0.2095], grad_fn=<MvBackward>)
>>> loss.backward()
>>> l1.weight.grad
tensor([[-0.1094, -0.3962, -0.3002]])
>>> l2.weight.grad
tensor([[-0.1291, 0.2055, -0.4207]])
(The weight property of a Linear has requires_grad = True by default, and * performs element-wise multiplication on tensors.)
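If you also want to actually update the weights, you can hand both layers’ parameters to an optimizer and call step() after backward(). Here is a minimal sketch of one training step; the choice of SGD, the learning rate, the dummy target, and mse_loss are just illustrative assumptions, not part of the example above:

import torch

l1 = torch.nn.Linear (3, 1)
l2 = torch.nn.Linear (3, 1)

# optimize the parameters of both layers so that each weight tensor gets updated
opt = torch.optim.SGD (list (l1.parameters()) + list (l2.parameters()), lr = 0.1)

input = torch.randn (3)
target = torch.tensor ([1.0])            # dummy target, just for illustration

opt.zero_grad()
w = l1.weight * l2.weight                # element-wise product of the two weight tensors
pred = torch.matmul (w, input)           # (w1 \odot w2)^T x  (the biases are unused here)
loss = torch.nn.functional.mse_loss (pred, target)
loss.backward()                          # fills l1.weight.grad and l2.weight.grad
opt.step()                               # applies the update to both weight tensors

Note that w has to be recomputed from l1.weight and l2.weight inside every training iteration, since it is a fresh tensor produced by the multiplication rather than a parameter itself.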
Best.
K. Frank