Backpropagation does not update weights in unsupervised learning

cherepashkin · July 20, 2020, 12:06pm

The NN receives images of circles with different radii as input; each circle is centered in the image. The task is to estimate the radius of the circle, then use an image generator to create an image with the estimated radius. Finally, the loss is calculated between the input image and the generated image.

I understand that this task can be solved analytically without an NN. I just want to work through this test task to understand how unsupervised learning works in practice.

The problem is that the weights are not updated during training, so the predicted radii and the loss stay constant. I checked grad_fn for the intermediate variables I and r, and it was not “None”.

There is also a supervised model, where the loss is calculated between the predicted radius and the original radius used to generate the image. In this model the loss decreases and the estimated radii change every epoch.

I also tried a convolutional NN and a DNN with more layers or larger layers, but the loss was constant again.

Please suggest where the problem might be.

import torch

imsize = 200  # image side length in pixels

# creates initial coordinate tensors for the image
def figure_init(imsize):
    X = torch.arange(0, imsize, 1, dtype=torch.float64, requires_grad=True)
    Y = torch.arange(0, imsize, 1, dtype=torch.float64, requires_grad=True)
    X, Y = torch.meshgrid(X, Y)
    return X, Y

def center_circle_gen(r0):
    X, Y = figure_init(imsize)
    # distance of every pixel from the image center
    r = torch.sqrt((X - 0.5*imsize)**2 + (Y - 0.5*imsize)**2)
    a = 1
    # use exponentially attenuating intensity to make the circle smooth
    I = 1/(1 + a*torch.exp(r/r0/imsize))
    return I

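class DNet(torch.nn.Module):
    def __init__(self, C_in, D_out):
        super(DNet, self).__init__()
        self.linear1 = torch.nn.Linear(200*200*C_in, 100)
        self.linear2 = torch.nn.Linear(100, 100)
        self.linear3 = torch.nn.Linear(100, 100)
        self.linear4 = torch.nn.Linear(100, D_out)

    def forward(self, x):  # not shown in the post; assumed minimal version
        x = torch.relu(self.linear1(x.flatten(1)))
        x = torch.relu(self.linear2(x))
        x = torch.relu(self.linear3(x))
        return self.linear4(x)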

model = DNet(1, 1) #1 color channel in input, 1 radius in output
criterion = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

for t in range(280):
    xtrain = xtrain.detach() #xtrain is an image with a single circle 
                             #in the center of the image 
    y_pred = model(xtrain) #y_pred contains estimated 
                           #radii of the circles
    x_pred = torch.zeros([b,1,imsize,imsize])
    for i in range(b):
        # Generating circles with image generator 
        #using estimated radii y_pred.
        x_pred[i,0,:,:] = center_circle_gen(y_pred[i][0].item())
    # Compare generated image with the input image, 
    # finding per-pixel distance between two images
    loss = criterion(x_pred, xtrain) 


    # Print loss every 10 epochs
    if t % 10 == 9:
        print(t, loss.item())

    # Zero gradients, perform a backward pass, and update the weights.
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

albanD · July 20, 2020, 4:48pm

Hi,

When you call .item() on the y_pred element, you convert the Tensor into a Python float, for which we cannot track gradients.
So no gradient can flow back all the way to your model.
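
To illustrate the point, here is a minimal sketch (not taken from the thread) of what .item() does to the graph. The names `w` and `scale` are hypothetical: `w` stands in for a model weight, and `scale` plays the role of an unrelated leaf that requires gradients, like X and Y in the posted code. That second leaf is presumably why grad_fn was not None in the original script even though nothing reached the model.

```python
import torch

w = torch.tensor(2.0, requires_grad=True)  # stands in for a model weight
y = 3.0 * w                                # y is part of the autograd graph

# Differentiable path: y stays a Tensor, so the loss is connected to w.
loss_tensor = (y - 1.0) ** 2
loss_tensor.backward()
grad_with_tensor = w.grad.clone()          # d/dw (3w - 1)^2 = 6 * (3w - 1)

# Broken path: .item() returns a plain Python float with no graph attached.
w.grad = None
y_float = y.item()
scale = torch.tensor(1.0, requires_grad=True)  # unrelated leaf, like X and Y
loss_float = (scale * y_float - 1.0) ** 2
loss_float.backward()                      # runs fine, but only scale gets a gradient
grad_with_item = w.grad                    # still None: nothing flowed back to w

print(grad_with_tensor, grad_with_item, loss_float.grad_fn is not None)
```

In the second path backward() raises no error and the loss has a grad_fn, yet `w` never receives a gradient, which matches the constant loss described in the question.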

You will need to make sure center_circle_gen takes a Tensor as input and generates the output in a differentiable manner if you want to be able to use it. Not sure how you can do that though.
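
One possible way to do this, as a sketch rather than a confirmed fix for the poster's setup: pass the predicted radius to the generator as a Tensor and build the image only from torch operations. The small `imsize`, the `Sequential` model, the `Softplus` output layer (to keep the radius positive) and the `true_r` values are assumptions made for this illustration; the intensity formula is the one from the post with a = 1.

```python
import torch

torch.manual_seed(0)
imsize = 32  # assumption: smaller than the 200 used in the post, to keep this quick

def center_circle_gen(r0, imsize):
    """Render a smooth centered circle; r0 is a 0-dim Tensor that stays in the graph."""
    coords = torch.arange(imsize, dtype=r0.dtype)
    X = coords.view(-1, 1)  # column of row coordinates
    Y = coords.view(1, -1)  # row of column coordinates (broadcasts against X)
    r = torch.sqrt((X - 0.5 * imsize) ** 2 + (Y - 0.5 * imsize) ** 2)
    return 1 / (1 + torch.exp(r / r0 / imsize))

# Hypothetical small model standing in for DNet.
model = torch.nn.Sequential(
    torch.nn.Flatten(),
    torch.nn.Linear(imsize * imsize, 16),
    torch.nn.ReLU(),
    torch.nn.Linear(16, 1),
    torch.nn.Softplus(),  # assumption: keeps the predicted radius positive
)

# Input images built from made-up radii; no gradients are needed for the data.
true_r = torch.tensor([0.5, 1.0, 1.5, 2.0])
with torch.no_grad():
    xtrain = torch.stack([center_circle_gen(r, imsize) for r in true_r]).unsqueeze(1)

y_pred = model(xtrain)  # shape (b, 1), still attached to the model parameters
# Pass the Tensor element itself, not y_pred[i][0].item()
x_pred = torch.stack(
    [center_circle_gen(y_pred[i, 0], imsize) for i in range(y_pred.shape[0])]
).unsqueeze(1)

loss = torch.nn.functional.mse_loss(x_pred, xtrain)
loss.backward()
grads_reach_model = all(p.grad is not None for p in model.parameters())
print(loss.item(), grads_reach_model)
```

The coordinate tensors do not need requires_grad=True here, since only the path from the loss through r0 back to the model parameters matters.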