# Create the tensors on the device directly; calling .to(device) on an
# nn.Parameter can return a non-leaf tensor whose .grad never populates.
w1 = nn.Parameter(torch.log(torch.tensor([1.], device=device)))
w2 = nn.Parameter(torch.log(torch.tensor([1.], device=device)))
y = w2 * x
>>> y
tensor([0.], grad_fn=<MulBackward0>)

Clearly, y depends only on w2, not w1. But how can I tell which parameters affect the output when the network is much more complicated?

I would suggest setting all the gradients to zero, then running backward on your output. If the output is a Tensor, you can call backward with a tensor of ones of the same size as the output. Then check the value of each parameter's gradient: the ones that are non-zero have an impact on the output.
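A minimal sketch of that procedure, using a hypothetical two-parameter module mirroring the question's setup (the module name `Toy` and the check in the loop are illustrative, not from the original post):

```python
import torch
import torch.nn as nn

class Toy(nn.Module):
    def __init__(self):
        super().__init__()
        self.w1 = nn.Parameter(torch.log(torch.tensor([1.0])))
        self.w2 = nn.Parameter(torch.log(torch.tensor([1.0])))

    def forward(self, x):
        # w1 is intentionally unused, as in the question.
        return self.w2 * x

model = Toy()
x = torch.ones(1)
y = model(x)

model.zero_grad()               # make sure no stale gradients remain
y.backward(torch.ones_like(y))  # backward with a tensor of ones matching the output

for name, p in model.named_parameters():
    # A parameter influences the output iff its gradient is populated and non-zero.
    used = p.grad is not None and p.grad.abs().sum().item() > 0
    print(f"{name}: {'affects' if used else 'does not affect'} the output")
```

Here `w2.grad` comes back non-zero while `w1.grad` stays unset, matching the fact that only w2 feeds into y. Note that `zero_grad()` may set gradients to `None` rather than zero in recent PyTorch versions, so the check should handle both cases, as above.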