Hello everyone,
In order to train and visualize some optimization methods I'm working on, I use the following procedure:
- I define a model,
- save the value of 2 parameters,
- train the model,
- set requires_grad to False for all my parameters except the 2 I have saved,
- then, for each optimization mode, I reset the two parameters to their initial values and retrain the model.
Here is the code:
import torch

init = model.QC_optvar2.clone().detach()   # I save the init value
print('init', init)

# remaining hyperparameters
loss_fn = torch.nn.CrossEntropyLoss()
n_epoch = 1

# getting to an optimal point by training all the parameters
train_losses, valid_losses, valid_acc = train(model, train_dataset, test_dataset, loss_fn, n_epoch)

weights_list = []
loss_evol_list = []
shots = [0, 3, 256, 1024]

for sh in shots:
    print('shots', sh, 'init', init)
    print(model.QC_optvar2.data)
    model.QC_optvar2.data = init            # reinject the saved initial value
    print(model.QC_optvar2.data)
    model.QC_optvar1.requires_grad_(False)  # freeze everything except QC_optvar2
    model.QE_optvar.requires_grad_(False)
    print(sh, 'shots')
    weights, loss_evol = trainDSGD(model, train_dataset, test_dataset, loss_fn, shots=sh, n_epoch=1)
    weights_list += [weights]
    loss_evol_list += [loss_evol]
I thought that .detach() was going to detach the tensor from the graph.
What is happening is a bit confusing. The init tensor does not change during the first training run.
Then, once I enter the for loop (except on the first iteration), it starts taking the same values as the current model.QC_optvar2, whereas I wanted it to keep its initial value so I can reinject it again and again. (The line model.QC_optvar2.data = init becomes useless because the two tensors are already identical.)
I tried .clone(), .clone().detach(), .data, etc., and none of them work.
The only workaround I have found is to convert the tensor to NumPy and convert it back to a tensor when I reinject it.
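Here is a minimal standalone snippet that seems to reproduce the behaviour (with a dummy parameter p instead of my actual model, and an in-place add standing in for the optimizer step):

import torch

# dummy parameter standing in for model.QC_optvar2
p = torch.nn.Parameter(torch.zeros(3))
init = p.clone().detach()       # save the initial value
p.data = init                   # reinject it, as in my loop
p.data.add_(1.0)                # in-place update, like an optimizer step would do
print(init)                     # tensor([1., 1., 1.]) -> init has changed too

q = torch.nn.Parameter(torch.zeros(3))
init_q = q.clone().detach()
q.data = init_q.clone()         # reinject a copy instead of the saved tensor itself
q.data.add_(1.0)
print(init_q)                   # tensor([0., 0., 0.]) -> init_q keeps its value

If I understand correctly, after model.QC_optvar2.data = init the parameter and init end up sharing the same storage, so the optimizer's in-place updates modify init as well; reinjecting init.clone() (or going through NumPy, which makes a copy) avoids that, but I'd like to confirm this is really what is happening.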
I'm a bit puzzled by this and would like to understand a bit better what is going on behind the scenes.
Thank you in advance
b