To train and visualize some optimization methods I’m working on, I use the following procedure:
-I define a model,
-save the value of 2 parameters,
-train the model,
-set requires_grad to False for all parameters except the 2 I saved,
-then, for each optimization mode, reset the two parameters to their initial value and retrain the model.
Here is the code:
```python
init = model.QC_optvar2.clone().detach()  # I save the init value
print('init', init)

# remaining hyperparameters
loss_fn = torch.nn.CrossEntropyLoss()
n_epoch = 1

# getting to an optimal point by training all the parameters
train_losses, valid_losses, valid_acc = train(model, train_dataset, test_dataset, loss_fn, n_epoch)

weights_list = []
loss_evol_list = []
shots = [0, 3, 256, 1024]
for sh in shots:
    print('shots', sh, 'init', init)
    print(model.QC_optvar2.data)
    model.QC_optvar2.data = init
    print(model.QC_optvar2.data)
    model.QC_optvar1.requires_grad_(False)
    model.QE_optvar.requires_grad_(False)
    print(sh, 'shots')
    weights, loss_evol = trainDSGD(model, train_dataset, test_dataset, loss_fn, shots=sh, n_epoch=1)
    weights_list += [weights]
    loss_evol_list += [loss_evol]
```
I thought that .detach() was going to detach the tensor from the graph and give me an independent copy.
What is happening is a bit confusing. The init vector does not change during the first training run.
Then, inside the for loop (except on the first iteration), it starts taking the same value as the current
model.QC_optvar2, whereas I wanted it to keep its initial value so I could re-inject it again and again. (The line model.QC_optvar2.data = init becomes useless because the two tensors are already the same.)
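If it helps to clarify what I mean, here is a minimal sketch (with made-up tensor values, not my actual model) of the kind of coupling I seem to be seeing: after assigning a saved tensor to a parameter's .data, the two appear to share storage, so a later in-place update changes the saved tensor too.

```python
import torch

init = torch.tensor([1.0, 2.0])           # the "saved" initial value
p = torch.nn.Parameter(torch.zeros(2))    # stand-in for model.QC_optvar2

p.data = init                             # p.data now refers to the same storage as init
p.data -= 0.5                             # an in-place, optimizer-style update
print(init)                               # prints tensor([0.5000, 1.5000]) - init changed as well
```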
I tried .clone(), .clone().detach(), .data, etc., and it never works.
The only solution that works is to convert the tensor to numpy and back to a tensor when I re-inject it.
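(For comparison, in the toy sketch below, with hypothetical tensor values, cloning at re-injection time seems to behave like the numpy round trip, presumably because the parameter then never shares storage with the saved tensor.)

```python
import torch

init = torch.tensor([1.0, 2.0])           # the "saved" initial value
p = torch.nn.Parameter(torch.zeros(2))    # stand-in for model.QC_optvar2

p.data = init.clone()                     # re-inject a fresh copy, not the saved tensor itself
p.data -= 0.5                             # in-place update touches only the copy
print(init)                               # prints tensor([1., 2.]) - init is untouched
```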
I’m a bit puzzled by this and would like to understand better what is going on behind the scenes.
Thank you in advance.