I can't understand why one of my parameters has a None grad

Hi, I've tried searching everywhere, but I can't seem to find a way to make this work. I have some code in which I have to optimize a few parameters, so I call backward() on the loss function I'm using, but I always get a None gradient for the last two parameters I'm trying to optimize. I suspect it could be due to the torch.tensor calls I use to assemble the tensors, but I don't know how to avoid them. Can anyone help me figure out why this happens and how to fix it? Thanks a lot in advance. Below I post my code so that you can see what I'm talking about:

in_color = torch.tensor([0.5, 0.5, 0.5])
angle = torch.tensor(0.763965)
angle.requires_grad_(True)
in_color.requires_grad_(True)
params.append(angle)
params.append(in_color)

def check_hue(myHue):
    if myHue < 0:
        myHue = myHue + 360 * abs(myHue // 360)
    elif myHue >= 360:
        myHue = myHue - 360 * (myHue // 360)
    return myHue

def HSLtoRGB(myColor):
    with torch.no_grad():
        myColor[0] = check_hue(myColor[0])

    C = (1 - abs(2 * myColor[2] - 1)) * myColor[1]
    X = C * (1 - abs((myColor[0].data / 60) % 2 - 1))
    m = myColor[2] - C / 2
   
    if 0 <= myColor[0] < 60:
        myRGB = torch.tensor([C, X, 0]) + m
    elif 60 <= myColor[0] < 120:
        myRGB = torch.tensor([X, C, 0]) + m
    elif 120 <= myColor[0] < 180:
        myRGB = torch.tensor([0, C, X]) + m
    elif 180 <= myColor[0] < 240:
        myRGB = torch.tensor([0, X, C]) + m
    elif 240 <= myColor[0] < 300:
        myRGB = torch.tensor([X, 0, C]) + m
    elif 300 <= myColor[0] < 360:
        myRGB = torch.tensor([C, 0, X]) + m

    return myRGB

def randomAngle(myColor, angle):
    #myHSL = RGBtoHSL(myColor)
    with torch.no_grad():
        myColor[0] = myColor[0] * 360
    percent = 0.1

    HSL1 = torch.tensor([myColor[0] - angle * 360, myColor[1], myColor[2]])
    HSL2 = torch.tensor([myColor[0] - angle * 180, myColor[1], myColor[2]])
    HSL3 = myColor
    HSL4 = torch.tensor([myColor[0] + angle * 180, myColor[1], myColor[2]])
    HSL5 = torch.tensor([myColor[0] + angle * 360, myColor[1], myColor[2]])

    with torch.no_grad():
        c1 = HSLtoRGB(HSL1)
        c2 = HSLtoRGB(HSL2)
        c3 = HSLtoRGB(HSL3)
        c4 = HSLtoRGB(HSL4)
        c5 = HSLtoRGB(HSL5)

    c1b = c1 * (1 - percent) + myColor * percent
    c2b = c2 * (1 - percent) + myColor * percent
    c3b = c3 * (1 - percent) + myColor * percent
    c4b = c4 * (1 - percent) + myColor * percent
    c5b = c5 * (1 - percent) + myColor * percent

    # create vector of colors
    colors = [c1b, c2b, c3b, c4b, c5b]

    return colors

params[2] = randomAngle(params[-1], params[-2])[x[j]]
params[3] = randomAngle(params[-1], params[-2])[y[j][i]]

This code shows the initialization of my parameters, then their "manipulation", and where I use them. The code is incomplete because it is too long to show in its entirety, but I'm not touching my parameters anywhere else. The rest of the code is the same for every parameter (I have many others in the params list, and every one of them gets a gradient different from None, except for in_color and angle).

Hi, calling torch.tensor will indeed prevent gradients from flowing through, because it creates a brand-new tensor that is detached from the computational graph. If you want to generate HSL from myColor, you could do something like:

HSL = myColor.clone()
HSL[0] -= angle * 360

Similarly, for myRGB:

myRGB = m.expand((3,)).clone()
myRGB[0] += C
myRGB[1] += X

Thanks for your answer, it helped me solve the problem of the None gradient for params[-1] (I removed the "with torch.no_grad():" line before the conversion from HSL to RGB, as that was necessary for the calculation of the gradient). This solved one of the issues, but not all of them: angle (which is params[-2]) still gets a None gradient. At this point I really don't know what the problem could be; the only thing I can think of is that for some reason it is not part of the computational graph, but I do not know why. angle is used in the generation of the different colors (HSL1, HSL2, HSL4, HSL5), which are randomly assigned to params[2] and params[3], and those are actively used in what the loss is evaluated on. Also, angle should be a leaf tensor, so it should not require retain_grad, I think.

Using .data (as in myColor[0].data inside HSLtoRGB) would also break the gradient.
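
For example (a hypothetical minimal sketch, not your code), the hue wrap could be done with torch.remainder, which is differentiable almost everywhere, so neither check_hue's no_grad block nor the .data access is needed:

import torch

def wrap_hue(hue):
    # Differentiable wrap of the hue into [0, 360); gradient is 1 almost everywhere
    return torch.remainder(hue, 360)

angle = torch.tensor(0.763965, requires_grad=True)
hue = 180.0 - angle * 360                        # still connected to angle
wrapped = wrap_hue(hue)
X_term = 1 - torch.abs((wrapped / 60) % 2 - 1)   # no .data needed
X_term.backward()
print(angle.grad)   # a finite number instead of None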

Also, angle should be a leaf tensor, so it should not require retain_grad, I think.

That is correct.

Yeah, you're right, thanks. I omitted to say that I also removed all the ".data" calls, but I still can't get the gradient for the angle parameter.