Here is the situation I encountered:

import torch
from torch.autograd import Variable

alpha = Variable(torch.Tensor([4]), requires_grad=True)

f = lambda x: x * alpha**2

y = f(torch.Tensor([1,2,3]))

y

We get y = tensor([ 16., 32., 48.]). Now I take the derivative:

shape = torch.ones(y.size())  # y.size() = torch.Size([3]) because y has three elements

y.backward(shape)

Since we take the gradient w.r.t. alpha, we have df/dalpha = 2 * x * alpha.

Now when I call

alpha.grad

I’m expecting the result tensor([ 8., 16., 24.]) (that is, 2 * x * alpha for x = 1, 2, 3)

But instead I get

tensor([ 48.])
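Edit: after reading about how backward() works, I think this minimal reproduction shows what is happening. It is a sketch using the modern tensor API (Variable is deprecated in recent PyTorch); the values mirror the ones above.

```python
import torch

alpha = torch.tensor([4.0], requires_grad=True)
x = torch.tensor([1.0, 2.0, 3.0])

y = x * alpha ** 2            # y = [16., 32., 48.]
y.backward(torch.ones_like(y))

# backward() computes a vector-Jacobian product, and since alpha is a
# single scalar parameter, the three per-element gradients
# 2 * x * alpha = [8., 16., 24.] are summed into one number: 8 + 16 + 24.
print(alpha.grad)             # tensor([48.])
```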

I couldn’t figure out why the output looks like this; is there any way to get what I expect? Thanks!
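Edit: here are two things I tried that do produce per-element gradients. This is a sketch, not necessarily the best approach; option 2 assumes torch.autograd.functional.jacobian is available (PyTorch >= 1.5).

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0])

# Option 1: give alpha one entry per output, so each element of y has its
# own parameter and the gradients are no longer summed together.
alpha = torch.full((3,), 4.0, requires_grad=True)
y = x * alpha ** 2
y.backward(torch.ones_like(y))
print(alpha.grad)   # tensor([ 8., 16., 24.])

# Option 2: keep a single scalar alpha and request the full Jacobian
# dy/dalpha instead of a vector-Jacobian product.
alpha2 = torch.tensor(4.0, requires_grad=True)
jac = torch.autograd.functional.jacobian(lambda a: x * a ** 2, alpha2)
print(jac)          # tensor([ 8., 16., 24.])
```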