Creating a separate optimizer for a new class object

Hi, I have a generator class and a separate class that performs a simple math operation on one of the generator's intermediate outputs.

So what this looks like is:

First few generator layers
.
.
.
math operation
.
.
other generator layers

When I create the optimizer for the generator, the parameters of my math operation object are included among the generator's parameters.

I want to be able to create a separate optimizer for this math operation layer and call that during training. I tried placing an instance of the math object inside the generator definition, but that gave me an error.

Since the generator is wrapped in a Sequential container, I can’t assign any values inside the definition -

self.mathlayer1 = x + b

gives me an error.

So how can I define the generator in a way that will let me create a different optimizer for the novel layer?

If you could provide executable code which shows what you want, then it would be easier to figure out the issue. If you could make the code as simple as you can while still illustrating the question, that would help, too.

In the example code you could, for instance, have the math operation sandwiched between just two generator layers (I am not sure what these are, from your description …), if such a setup will also illustrate the issue.

Yes of course, my apologies!

My generator code is as follows:

class Generator(nn.Module):
    def __init__(self, ngpu, b_val):
        super(Generator, self).__init__()
        self.ngpu = ngpu
        self.main = nn.Sequential(
            # input is Z, going into a convolution
            nn.ConvTranspose2d( nz, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),
            # state size. (ngf*8) x 4 x 4
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
            # state size. (ngf*4) x 8 x 8
            nn.ConvTranspose2d( ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(True),
            # state size. (ngf*2) x 16 x 16
            nn.ConvTranspose2d( ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf),
            ShiftLayer(b_val),
            nn.ReLU(True),
            # state size. (ngf) x 32 x 32
            nn.ConvTranspose2d( ngf, nc, 4, 2, 1, bias=False),
            nn.Tanh()
            # state size. (nc) x 64 x 64
        )
    
    def forward(self, input):
        return self.main(input)

The layer called ShiftLayer is defined as:

class ShiftLayer(nn.Module):
    def __init__(self, b):
        super().__init__()
        self.b = nn.Parameter(torch.tensor(1.))
        
    def forward(self, x):
        return x + self.b

During training, I have an optimizer for the generator. Since b is a parameter defined within the generator, when I execute

optimizerG = optim.Adam(generator.parameters(), lr=lr, betas=(beta1, 0.999))

b is included in generator.parameters(). I want to create a separate optimizer just for b, so that I can do something like:

optimizerb = optim.SGD(ShiftLayer.parameters(), . . .)

How can I do this?

(Sorry about the long post, I am not sure how to make it more concise without getting too abstract)

@ptrblck sorry for the tag, would you have any idea?

Your ShiftLayer code just ignores the value of b passed in to its constructor. Is this what you want to do?

No, I want to create an optimizer for b only, while still using ShiftLayer in my generator.

As your code currently is:

class ShiftLayer(nn.Module):
    def __init__(self, b):
        super().__init__()
        self.b = nn.Parameter(torch.tensor(1.))
        
    def forward(self, x):
        return x + self.b

the value of b passed in to __init__ is not used at all. So it is as good as not passing in this value. Perhaps you meant to say

self.b = nn.Parameter(torch.tensor(b))

?

What are some valid values for nz, ngf, and nc? If you tell me these, I could try figuring out how to optimize for b.

nz = 100
ngf = 64
nc = 3

I am just trying to create a built-in optimizer for b using torch.optim. From the feedback in the previous posts, for b to be optimized I had to add self.b = nn.Parameter(torch.tensor(b)) to my definition of ShiftLayer().

As my code is right now, b is being passed as a parameter when I instantiate my generator and then call

optimizerG = optim.Adam(generator.parameters(), lr=lr, betas=(beta1, 0.999))

All I’m trying to do is, instead of optimizing b through the Generator’s optimizer, give b its own optimizer created with torch.optim.

Hope that makes sense.

See if passing b_param, retrieved as follows, to the optimizer works:

x = torch.randn(1,10) # Dummy value!
ngpu = True # Dummy value!
nz = 100
ngf = 64
nc = 3
myGen = Generator(True, x)
shiftLayer = myGen.main[11] # I got the 11 by looking at the output of print(myGen) 
_, b_param = next(shiftLayer.named_parameters())
print(b_param)
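
If that prints the parameter, then wrapping b_param in a list and handing it to the optimizer should be enough. Something along these lines (the learning rate is just a placeholder, and optim is torch.optim as in your earlier snippet):

# b_param is a single nn.Parameter, so pass it inside a list
optimizer_b = optim.SGD([b_param], lr=0.01)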

Seems like I am getting an error:

  File "mnist_film.py", line 299, in <module>
    _, b_param = next(shiftLayer.named_parameters())
StopIteration

StopIteration means there are no more things for next to return from its argument.

Are you running this line within a loop? This won’t work the second time, since shiftLayer has only one named parameter.

Edited to add: If you need to have this line within a loop, you can replace it with

_, b_param = list(shiftLayer.named_parameters())[0]

Nope, it is not inside a loop! I think for some reason shiftLayer.named_parameters() is not returning anything.

Edit: I printed all of the parameters of the generator object and it seems like b is not considered a parameter. Confused about what’s going on.

I don’t know why this would happen, as b is returned as a parameter in all of these implementations:

import torch
import torch.nn as nn

class ShiftLayer(nn.Module):
    def __init__(self, b):
        super().__init__()
        # registering b as an nn.Parameter makes it show up in parameters()
        self.b = nn.Parameter(torch.tensor(b))
        
    def forward(self, x):
        return x + self.b
    
shift = ShiftLayer(1.)
print(list(shift.parameters()))
print(dict(shift.named_parameters()))

model = nn.Sequential(
    nn.Linear(1, 1),
    ShiftLayer(2.)
)
print(list(model.parameters()))
print(dict(model.named_parameters()))



class Generator(nn.Module):
    def __init__(self, b_val):
        super(Generator, self).__init__()
        self.main = nn.Sequential(
            # input is Z, going into a convolution
            nn.ConvTranspose2d( nz, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),
            # state size. (ngf*8) x 4 x 4
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
            # state size. (ngf*4) x 8 x 8
            nn.ConvTranspose2d( ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(True),
            # state size. (ngf*2) x 16 x 16
            nn.ConvTranspose2d( ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf),
            ShiftLayer(b_val),
            nn.ReLU(True),
            # state size. (ngf) x 32 x 32
            nn.ConvTranspose2d( ngf, nc, 4, 2, 1, bias=False),
            nn.Tanh()
            # state size. (nc) x 64 x 64
        )
    
    def forward(self, input):
        return self.main(input)

nz = 100
ngf = 64
nc = 3
gen = Generator(10.)

for name, param in gen.named_parameters():
    if 'b' in name and 'bias' not in name:
        print(name, param)
> main.11.b Parameter containing:
  tensor(10., requires_grad=True)
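
Building on that name filter, one possible way (just a sketch; the learning rates and optimizer choices here are placeholders) to split the parameters so that b gets its own optimizer while everything else stays with the generator optimizer:

import torch.optim as optim

# 'main.11.b' is the only parameter whose name ends in '.b';
# the BatchNorm biases end in '.bias', so they stay in the other group
shift_params = [p for n, p in gen.named_parameters() if n.endswith('.b')]
other_params = [p for n, p in gen.named_parameters() if not n.endswith('.b')]

optimizerG = optim.Adam(other_params, lr=0.0002, betas=(0.5, 0.999))
optimizer_b = optim.SGD(shift_params, lr=0.01)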

That’s what I am thinking. I am going to double-check my code again; maybe I left something there by mistake like last time.

Hey @gphilip, it seems like I’ve fixed my issue. This is my final solution:

shift_layer = generator.main[11]
optimizer_b = optim.SGD(shift_layer.parameters(), lr = ...)

and in the training loop I just call optimizer_b.step() for every epoch.

So far this seems to be working! Thanks to you both for your help. Appreciate it!
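
One small thing to watch out for with this setup: generator.parameters() still contains the ShiftLayer’s b, so if optimizerG is built from it as before, optimizerG.step() will also update b. If the intention is for b to be trained only by optimizer_b, it may be cleaner to leave it out of the generator optimizer, for example (a sketch; the SGD learning rate is a placeholder):

shift_layer = generator.main[11]
shift_param_ids = {id(p) for p in shift_layer.parameters()}

# generator optimizer over everything except the ShiftLayer's parameter
optimizerG = optim.Adam(
    [p for p in generator.parameters() if id(p) not in shift_param_ids],
    lr=lr, betas=(beta1, 0.999))

# separate optimizer that only updates b
optimizer_b = optim.SGD(shift_layer.parameters(), lr=0.01)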
