I’m very new to PyTorch but I’m trying to implement a network which is a bit tricky.
Concretely speaking, the network has these four particularities:
First, it consists of several similar units, where each unit takes an input and produces two outputs: one fed to the next unit, and the other fed directly to the final loss function of the network. This means that I cannot use the standard nn.Sequential to define my network architecture.
Second, it has a variable number of units, so I want to construct the network in a for-loop somewhere.
Third, each unit consists of operations that are not available as ready-made nn modules (like, e.g., nn.ReLU), so I have to define the units myself as well.
Fourth, I need to initialize my parameters from some other functions, so the initial parameters of all the units are gathered in, e.g., a list and fed to the network in one place.
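To make that fourth point concrete, I mean something like this (init_from_elsewhere is just a made-up stand-in for my real initialization routine):

```python
import torch

def init_from_elsewhere(shape):
    # stand-in for whatever external routine actually produces my initial values
    return torch.randn(shape)

# one initial parameter tensor per unit, gathered in a single list
myParametersList = [init_from_elsewhere((2, 2)) for _ in range(3)]
```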
Suppose I want to use torch.optim; then I need a good way to declare my gradient-requiring variables to it. This is where things become problematic for me.
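My understanding (which may be wrong) is that torch.optim just iterates over whatever module.parameters() yields, so the variables have to be registered on the module somehow. A minimal case where this works:

```python
import torch
from torch import nn

class Tiny(nn.Module):
    def __init__(self):
        super(Tiny, self).__init__()
        # assigning an nn.Parameter as an attribute registers it automatically
        self.w = nn.Parameter(torch.randn(2, 2))

tiny = Tiny()
print(len(list(tiny.parameters())))   # 1, so the optimizer has something to work on
optimizer = torch.optim.Adam(tiny.parameters(), lr=1e-2)
```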
The way I see this network being implemented is to define one class for the units that inherits from nn.Module, and one class for the whole network that connects the units together.
My current code looks like this:
import torch
from torch.nn.parameter import Parameter
from torch.autograd import Variable
from torch import nn

#####
myParametersList = [torch.randn(2,2), torch.randn(2,2), torch.randn(2,2)]
input = Variable(torch.randn(2,2))

########
class myUnit(nn.Module):
    """
    Defines a generic unit of the network
    """
    def __init__(self, myParameter):
        super(myUnit, self).__init__()
        self.myParameter = Parameter(myParameter, requires_grad=True)

    def forward(self, input):
        """
        Whatever operation. Just an example.
        """
        output_1 = self.myParameter * input - 1
        output_2 = output_1 - output_1.mean()
        return output_1, output_2

#######
class myNetwork(nn.Module):
    """
    Uses myUnit class to build up the network.
    """
    def __init__(self, myParametersList, numUnits):
        super(myNetwork, self).__init__()
        self.myParametersList = myParametersList
        self.numUnits = numUnits
        assert numUnits == len(myParametersList)

    def forward(self, input):
        output_final = Variable(torch.zeros(2,2))
        for u in range(self.numUnits):
            myParameter = self.myParametersList[u]
            unitObj = myUnit(myParameter)
            output_1, output_2 = unitObj.forward(input)
            input = output_1.clone()   # to be fed to the next unit
            output_final.add_(output_2)
        return output_final

##################
# myModel = myUnit(myParametersList)
myModel = myNetwork(myParametersList, 3)
myModel.forward(input)   # I need this to create the list of my parameters.

optimizer = torch.optim.Adam(myModel.parameters(), lr=1e-2)
for t in range(50):
    output_final = myModel.forward(input)
    loss = (input - output_final).pow(2).mean()
    print(t, loss.data)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
But I get this error about the list of parameters:
ValueError: optimizer got an empty parameter list
However, when I build the model from a single myUnit alone, I don't get this error.
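For example, this runs fine for me (same myUnit as in my code above, so its Parameter is registered on the module):

```python
import torch
from torch import nn
from torch.nn.parameter import Parameter

class myUnit(nn.Module):
    # same unit as in my code above
    def __init__(self, myParameter):
        super(myUnit, self).__init__()
        self.myParameter = Parameter(myParameter, requires_grad=True)

    def forward(self, input):
        output_1 = self.myParameter * input - 1
        output_2 = output_1 - output_1.mean()
        return output_1, output_2

single = myUnit(torch.randn(2, 2))
# the Parameter assigned in __init__ shows up here, so no empty-list error
optimizer = torch.optim.Adam(single.parameters(), lr=1e-2)
```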
I am also aware of this post, but that didn’t help me.
Any thoughts? I'd appreciate any help a lot!