Hi everyone! I’m currently exploring the possibility of encoding a dynamic computational graph with PyTorch, and I’m a little confused about what is happening to my “dynamic model”.
As far as I understand, it’s possible to create models where, for instance, the number of layers and/or neurons per layer can change ([reference]) using Python control-flow operators like loops or conditional statements. However, I cannot figure out what is happening to the learnable parameters in such a dynamic graph.
Just to be clearer, consider this snippet.
Basically, at each forward pass (that is to say, for every batch) we randomly toss a “coin” that leads to a different architecture, namely one with 0, 1, 2 or 3 hidden layers.
```python
import random

import torch
import torch.nn.functional as F


class DynamicNet(torch.nn.Module):
    def __init__(self, D_in, H1, H2, D_out):
        super(DynamicNet, self).__init__()
        self.input_linear = torch.nn.Linear(D_in, H1)
        self.middle_linear1 = torch.nn.Linear(H1, H2)
        self.middle_linear2 = torch.nn.Linear(H2, H1)
        self.middle_linear3 = torch.nn.Linear(H1, H1)
        self.output_linear = torch.nn.Linear(H1, D_out)

    def forward(self, x):
        x = F.relu(self.input_linear(x))
        coin = random.randint(0, 3)
        if coin == 1:
            # one hidden layer; middle_linear3 keeps the width at H1
            # so the output layer still matches
            x = F.relu(self.middle_linear3(x))
        elif coin == 2:
            x = F.relu(self.middle_linear1(x))
            x = F.relu(self.middle_linear2(x))
        elif coin == 3:
            x = F.relu(self.middle_linear1(x))
            x = F.relu(self.middle_linear2(x))
            x = F.relu(self.middle_linear3(x))
        # coin == 0: no hidden layer; the output layer is applied on every path
        x = self.output_linear(x)
        return F.log_softmax(x, dim=1)
```
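To make my confusion concrete, here is a tiny self-contained experiment I tried (a stripped-down stand-in for the model above, with made-up layer names and sizes, and the coin passed in explicitly so the run is reproducible). After a single forward/backward pass, only the branch that was actually traversed seems to receive gradients:

```python
import torch


class CoinNet(torch.nn.Module):
    """Toy two-branch module: each forward pass uses only one branch."""

    def __init__(self):
        super(CoinNet, self).__init__()
        self.a = torch.nn.Linear(4, 4)
        self.b = torch.nn.Linear(4, 4)

    def forward(self, x, coin):
        # only one of the two layers participates in the graph
        return self.a(x) if coin == 0 else self.b(x)


net = CoinNet()
x = torch.randn(2, 4)
net(x, coin=0).sum().backward()

# only the layer used in this forward pass has a populated .grad
print(net.a.weight.grad is None)  # False: branch `a` was traversed
print(net.b.weight.grad is None)  # True: branch `b` was skipped
```

So it looks like the unused layers are simply left out of that iteration’s autograd graph, but I’d like confirmation that this is the intended behaviour.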
My doubts are the following:

Am I really exploiting PyTorch’s dynamic graph capability? From my perspective, I’m basically creating a tree-like structure where some probability is assigned to falling into one branch or another.

How are the weight matrices updated?

What will the final model that I eventually save for future use look like?

What are the answers to the previous three questions in this second case?
```python
import random

import torch
import torch.nn.functional as F


class DynamicNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        super(DynamicNet, self).__init__()
        self.input_linear = torch.nn.Linear(D_in, H)
        self.middle_linear = torch.nn.Linear(H, H)
        self.output_linear = torch.nn.Linear(H, D_out)

    def forward(self, x):
        x = F.relu(self.input_linear(x))
        coin = random.randint(0, 3)
        # reuse the same hidden layer 0-3 times (weight sharing)
        for _ in range(coin):
            x = F.relu(self.middle_linear(x))
        x = self.output_linear(x)
        return F.log_softmax(x, dim=1)
```
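Regarding the saved model, this is what I see when I inspect what `torch.save(model.state_dict(), ...)` would actually serialize (a sketch with arbitrary sizes): the `state_dict` always contains every layer registered in `__init__`, no matter how `forward()` branches, so I wonder whether “the final model” is simply all of these parameters, with the coin toss deciding at inference time which ones are used.

```python
import torch


class DynamicNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        super(DynamicNet, self).__init__()
        self.input_linear = torch.nn.Linear(D_in, H)
        self.middle_linear = torch.nn.Linear(H, H)
        self.output_linear = torch.nn.Linear(H, D_out)


model = DynamicNet(8, 16, 4)

# every registered layer shows up, regardless of the forward-pass branching
print(sorted(model.state_dict().keys()))
# ['input_linear.bias', 'input_linear.weight',
#  'middle_linear.bias', 'middle_linear.weight',
#  'output_linear.bias', 'output_linear.weight']
```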
Thanks a lot in advance for your answers!