I want to implement a projected gradient descent algorithm in PyTorch. I have a weight matrix C defined as:

C = Variable(torch.ones(BATCH_SIZE[0], BATCH_SIZE[1]).cuda(), requires_grad=True)

During training I do the following:

sum_loss.backward()

optimizer.step()

C = models.project(C)

Here, sum_loss is the total loss I want to minimize. After the optimizer.step() call, I use my custom project() function to project C back onto the constraint set. However, I get the following error:

RuntimeError: a leaf Variable that requires grad has been used in an in-place operation.

Is there any way to get around this problem so that I can implement projected gradient descent successfully?
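For reference, here is a minimal self-contained sketch that reproduces the error on CPU. Since my models.project() is not shown here, I substitute a hypothetical projection onto [0, 1] using the in-place clamp_ — the error occurs whenever the projection mutates the leaf tensor in place:

```python
import torch

# Hypothetical stand-in for models.project(): project onto the box [0, 1].
# clamp_ is IN-PLACE, which is what triggers the error on a leaf tensor.
def project(C):
    return C.clamp_(0.0, 1.0)

BATCH_SIZE = (4, 4)
C = torch.ones(BATCH_SIZE[0], BATCH_SIZE[1], requires_grad=True)
optimizer = torch.optim.SGD([C], lr=0.1)

loss = (C ** 2).sum()  # placeholder for sum_loss
loss.backward()
optimizer.step()

try:
    C = project(C)  # in-place op on a leaf that requires grad -> RuntimeError
except RuntimeError as e:
    print(e)
```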