Hi all,
I’m very new to PyTorch but I’m trying to implement a network which is a bit tricky.
Concretely speaking, the network has these four particularities:
First, it consists of several similar units, such that each unit takes the input and provides two outputs: one to be fed to the next unit and the other fed directly to the final loss function of the network. This means that I cannot use the standard nn.Sequential to define my network architecture.
Second, it has a variable number of units, so I want to construct the network in a for-loop somewhere.
Third, each unit consists of operations that are not available as ready-made nn modules (such as nn.Linear or nn.ReLU), so I also have to define the units myself.
Fourth, I need to initialize my parameters from some other functions. So the initial parameters of all the units are gathered in, e.g., a list and fed to the network in one place.
Suppose I want to use torch.optim. Then I need a good way to declare my gradient-requiring variables to it, and this is where things become problematic for me.
The way I see this network being implemented is to define one class for the units that inherits from nn.Module and one class for the whole network that connects the units together.
My current code looks like this:
import torch
from torch.nn.parameter import Parameter
from torch.autograd import Variable
from torch import nn
#####
myParametersList = [torch.randn(2,2),
                    torch.randn(2,2),
                    torch.randn(2,2)]
input = Variable(torch.randn(2,2))
########
class myUnit(nn.Module):
    """
    Defines a generic unit of the network
    """
    def __init__(self,myParameter):
        super(myUnit, self).__init__()
        self.myParameter = Parameter(myParameter,requires_grad=True)
    def forward(self,input):
        """
        Whatever operation. Just an example.
        """
        output_1 = self.myParameter * input - 1
        output_2 = output_1 - output_1.mean()
        return output_1,output_2
#######
class myNetwork(nn.Module):
    """
    Uses myUnit class to build-up the network.
    """
    def __init__(self,myParametersList,numUnits):
        super(myNetwork, self).__init__()
        self.myParametersList = myParametersList
        self.numUnits = numUnits
        assert numUnits == len(myParametersList)
    def forward(self,input):
        output_final = Variable(torch.zeros(2,2))
        for u in range(self.numUnits):
            myParameter = self.myParametersList[u]
            unitObj = myUnit(myParameter)
            output_1, output_2 = unitObj.forward(input)
            input = output_1.clone() # to be fed to the next unit
            output_final.add_(output_2)
        return output_final
##################
#myModel = myUnit(myParametersList[0])
myModel = myNetwork(myParametersList,3)
myModel.forward(input) # I need this to create the list of my parameters.
optimizer = torch.optim.Adam(myModel.parameters(), lr=1e-2)
for t in range(50):
    output_final = myModel.forward(input)[0]
    loss = (input - output_final).pow(2).mean()
    print(t, loss.data[0])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
But I get this error about the list of parameters.
ValueError: optimizer got an empty parameter list
However, when I build the model from a single myUnit only, I don’t get this error anymore.
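To make the difference concrete, here is a small check of how many parameters each kind of module reports. It is only a sketch: Unit and PlainListNet are hypothetical stand-ins for myUnit and myNetwork, stripped down to their constructors. As far as I understand, nn.Module only registers nn.Parameter objects (and sub-modules) that are assigned as attributes, so a plain Python list of tensors stored on the module is not picked up.

```python
import torch
from torch import nn


class Unit(nn.Module):
    """Hypothetical stand-in for myUnit: holds one registered parameter."""

    def __init__(self, init_value):
        super().__init__()
        # Assigning an nn.Parameter as an attribute registers it on the module.
        self.weight = nn.Parameter(init_value.clone())


class PlainListNet(nn.Module):
    """Hypothetical stand-in for myNetwork: only stores a plain list."""

    def __init__(self, init_values):
        super().__init__()
        # A plain list of tensors is an ordinary attribute and is not registered.
        self.init_values = init_values


init_values = [torch.randn(2, 2) for _ in range(3)]

n_unit = len(list(Unit(init_values[0]).parameters()))
n_net = len(list(PlainListNet(init_values).parameters()))
print(n_unit, n_net)  # prints "1 0"
```

This would be consistent with the optimizer accepting the single unit but complaining about an empty parameter list for the whole network.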
I am also aware of this post, but that didn’t help me.
Any thoughts? I’d appreciate it a lot!
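For reference, here is a minimal sketch of one restructuring that might avoid the error. It assumes the units can be created once in __init__ (in a loop over the initial values) and held in an nn.ModuleList, so that their parameters are visible to the optimizer. The names Unit, Network and init_values are placeholders, the unit's operation is the same toy example as above, and the loss is only illustrative.

```python
import torch
from torch import nn


class Unit(nn.Module):
    """One unit: returns (output for the next unit, output for the loss)."""

    def __init__(self, init_value):
        super().__init__()
        self.weight = nn.Parameter(init_value.clone())

    def forward(self, x):
        out_next = self.weight * x - 1          # toy operation from the post
        out_loss = out_next - out_next.mean()
        return out_next, out_loss


class Network(nn.Module):
    """Chains a variable number of units built from a list of initial values."""

    def __init__(self, init_values):
        super().__init__()
        # nn.ModuleList registers every unit, and hence every unit's parameter.
        self.units = nn.ModuleList([Unit(v) for v in init_values])

    def forward(self, x):
        total = torch.zeros_like(x)
        for unit in self.units:
            x, out_loss = unit(x)               # x is fed to the next unit
            total = total + out_loss            # out-of-place accumulation
        return total


torch.manual_seed(0)
init_values = [torch.randn(2, 2) for _ in range(3)]
x = torch.randn(2, 2)

model = Network(init_values)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
before = [p.detach().clone() for p in model.parameters()]

for t in range(5):
    loss = (x - model(x)).pow(2).mean()         # illustrative loss only
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

num_params = len(list(model.parameters()))
```

In this version the units are no longer rebuilt inside forward, so the same parameters are reused and updated on every iteration, and no dummy forward pass is needed before constructing the optimizer.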