Tensor after .clone().detach() continues to evolve

barthelemymp · July 23, 2019, 6:21pm

Hello everyone,

To train and visualize an optimization I’m working on, I use the following method:
- I define a model,
- save the values of 2 parameters,
- train the model,
- set requires_grad to False for all my parameters except the 2 I saved,
- then, for each optimization mode, reset the two parameters to their initial values and retrain the model.

Here is the code:

init = model.QC_optvar2.clone().detach()  # I save the init value
print('init', init)
# remaining hyperparameters
loss_fn = torch.nn.CrossEntropyLoss()
n_epoch = 1

#getting to an optimal point by training all the parameters
train_losses, valid_losses, valid_acc = train(model, train_dataset, test_dataset, loss_fn, n_epoch)


weights_list = []
loss_evol_list = []
shots = [0, 3, 256, 1024]
for sh in shots:
    print('shots', sh, 'init', init)
    print(model.QC_optvar2.data)
    model.QC_optvar2.data = init
    print(model.QC_optvar2.data)
    model.QC_optvar1.requires_grad_(False)
    model.QE_optvar.requires_grad_(False)
    print(sh, 'shots')
    weights, loss_evol = trainDSGD(model, train_dataset, test_dataset, loss_fn, shots = sh, n_epoch=1) 
    weights_list +=[weights]
    loss_evol_list += [loss_evol]

I thought that detach() would detach the tensor from the graph.
What happens is a bit confusing. The init tensor does not change during the first training run.
Then, inside the for loop (except on the first iteration), it starts taking the same value as the current model.QC_optvar2, whereas I wanted it to keep its initial value so that I could reinject it again and again. (The line model.QC_optvar2.data = init becomes useless because the two tensors are already the same.)
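One possible reading of this behaviour, as a minimal sketch rather than a diagnosis of the full training code: assigning to .data seems to make the parameter share memory with init, so any later in-place update to the parameter also changes init. The tensor param below is a hypothetical stand-in for model.QC_optvar2, and the add_ call stands in for an optimizer step.

```python
import torch

# Hypothetical stand-in for model.QC_optvar2
param = torch.nn.Parameter(torch.tensor([1.0, 2.0]))
init = param.clone().detach()   # an independent copy at this point

# Same kind of assignment as in the loop of the question
param.data = init

# After the assignment, both tensors point at the same memory
shares_storage = param.data_ptr() == init.data_ptr()

# An in-place update, similar to what an optimizer step performs
with torch.no_grad():
    param.add_(1.0)

print(shares_storage)  # True
print(init)            # tensor([2., 3.])
```

This would also fit the observation that the first loop iteration looks fine: init is only tied to the parameter once the assignment has run.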

I tried .clone(), .clone().detach(), .data, etc., and none of them work.
The only solution that works is to convert the tensor to NumPy and convert it back to a tensor when I reinject it.

I’m a bit puzzled by this and would like to better understand what is going on behind the scenes.

Thank you in advance

b

tom · July 23, 2019, 10:16pm

I would suggest leaving the code as is, except:

- Don’t use .data. I think it is easier to understand what’s going on if you don’t.
- I think you want with torch.no_grad(): model.QC_optvar2.copy_(init) instead of the assignment; otherwise you are integrating init into the (new) graph.
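The suggestion above can be sketched as follows. This is an illustration under assumptions, not the original training code: the Linear layer, the random data, and the SGD loop are hypothetical stand-ins for model, the datasets, and trainDSGD. The point is only that copy_ writes the values of init into the parameter's own memory, so init should keep its saved value across runs.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-in for the model in the question
model = torch.nn.Linear(2, 1)
init = model.weight.clone().detach()   # saved starting value
snapshot = init.clone()                # used only to check that init never changes

x = torch.randn(8, 2)
y = torch.randn(8, 1)

all_restored = []
for run in range(3):
    # Copy the values of init into the parameter without sharing memory
    with torch.no_grad():
        model.weight.copy_(init)
    all_restored.append(torch.equal(model.weight.detach(), init))

    # Short training run that updates the parameter in place
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    for _ in range(5):
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)
        loss.backward()
        opt.step()

    # init is compared against the untouched snapshot after each run
    print(run, torch.equal(init, snapshot))
```

Because the parameter keeps its own memory, the reset at the top of each iteration starts every run from the same values.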

Best regards

Thomas