Summary Question
In what ways can a jit-scripted model be modified? The weights can be trained. Can constants also be changed?
Desired use-case
I am comparing methods for prediction-uncertainty estimation. One of the commonly used baselines is Monte Carlo Dropout. To evaluate the method fairly, its hyperparameter (the dropout probability `p`) would ideally be tuned on a hold-out set of data.
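For context, MC Dropout keeps dropout active at prediction time and aggregates several stochastic forward passes. A minimal sketch (the function name and sample count are illustrative, not part of my actual setup):

```python
import torch

def mc_dropout_predict(model, x, n_samples=20):
    """Run n_samples stochastic forward passes and summarize them."""
    with torch.no_grad():
        preds = torch.stack([model(x) for _ in range(n_samples)])
    # Predictive mean and a simple per-output uncertainty estimate
    return preds.mean(dim=0), preds.std(dim=0)
```

The quality of the resulting uncertainty estimate depends directly on `p`, which is why tuning it matters.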
I am using jit-scripted models as predictors, since this makes it easy to incorporate different model architectures without having to include their source code in the project.
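As a toy illustration of that workflow (the architecture here is made up, not the model in question), a module is scripted and saved once, and only the `.pt` file is shipped:

```python
import torch
import torch.nn as nn

# A stand-in architecture; any nn.Module would do.
net = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Dropout(p=0.1), nn.Linear(16, 1))

# Scripting bundles code and weights into a single archive.
torch.jit.script(net).save("model_gpu.pt")

# The consumer only needs this line, not the model's source:
model = torch.jit.load("model_gpu.pt")
```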
Enabling the dropout of a scripted model
Some modifications are exposed, e.g., the `training` attribute of the dropout layers can be set to `True` by running:
```python
def enable_dropout(model):
    # Recursively switch every Dropout submodule into training mode.
    for child_name, child in model.named_children():
        if child.original_name.startswith('Dropout'):
            child.train()
        else:
            enable_dropout(child)

model = torch.jit.load("model_gpu.pt")
enable_dropout(model)
```
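A quick sanity check on a throwaway scripted model built in memory (the architecture is arbitrary, and the helper is repeated so the snippet runs standalone) confirms that this makes forward passes stochastic again after `eval()`:

```python
import torch
import torch.nn as nn

def enable_dropout(model):
    # Same helper as above, repeated so this snippet is self-contained.
    for child_name, child in model.named_children():
        if child.original_name.startswith('Dropout'):
            child.train()
        else:
            enable_dropout(child)

m = torch.jit.script(nn.Sequential(nn.Linear(16, 256), nn.Dropout(p=0.5)))
m.eval()            # dropout is now a no-op
enable_dropout(m)   # re-enable only the Dropout submodule

x = torch.randn(1, 16)
# With dropout active, two passes over the same input differ.
stochastic = not torch.equal(m(x), m(x))
```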
However, changing the dropout probability, or replacing the dropout itself fails. See next section.
Detailed question: Changing the dropout probability
Similarly to setting the `training` attribute, one might try to modify the `p` attribute of all dropout layers in the network by running:
```python
def set_dropout(model, p):
    # Recursively overwrite the p attribute of every Dropout submodule.
    for child_name, child in model.named_children():
        if child.original_name.startswith('Dropout'):
            setattr(child, 'p', p)
        else:
            set_dropout(child, p)

model = torch.jit.load("model_gpu.pt")
set_dropout(model, 0.5)
```
This fails with `RuntimeError: Can't set constant 'p' which has value: X`.
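The failure appears consistent with how `nn.Dropout` is declared: `p` is listed in the module's `__constants__`, so TorchScript bakes it into the compiled code as a constant rather than keeping it as a mutable attribute. A small demonstration, independent of the model above:

```python
import torch
import torch.nn as nn

# p (and inplace) are declared as TorchScript constants on nn.Dropout,
# so the compiler inlines them instead of keeping them settable.
print(nn.Dropout.__constants__)

scripted = torch.jit.script(nn.Dropout(p=0.3))
try:
    scripted.p = 0.5
except RuntimeError as err:
    print(err)  # complains about setting the constant 'p'
```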
Alternatively, one might try replacing the entire Dropout module:
```python
def set_dropout(model, p):
    # Recursively replace every Dropout submodule with a freshly scripted one.
    for child_name, child in model.named_children():
        if child.original_name.startswith('Dropout'):
            setattr(model, child_name, torch.jit.script(nn.Dropout(p=p)))
        else:
            set_dropout(child, p)

model = torch.jit.load("model_gpu.pt")
set_dropout(model, 0.5)
```
This raises an error due to a mismatch in the Python compilation unit.
Can the desired change to the scripted model be achieved? What is the decision process behind which attributes of a scripted model are exposed for modification and which are not? Is there a philosophy behind this decision, or is it purely a technical constraint?
Thank you.