Creating a separate optimizer for a new class object

Hi, I have a generator class and a separate class that performs a simple math operation on one of the generator's intermediate outputs.

So what this looks like is:

First few generator layers
.
.
.
math operation
.
.
other generator layers

When I create the optimizer for the generator, the parameters of my math operation object are included among the generator's parameters.

I want to be able to create a separate optimizer for this math operation layer and call that during training. I tried placing an instance of the math object inside the generator definition, but that gave me an error.

Since the generator is wrapped in a Sequential container, I can’t assign any values inside the definition -

self.mathlayer1 = x + b

gives me an error.

So how can I define the generator in a way that will let me create a different optimizer for the novel layer?

If you could provide executable code which shows what you want, then it would be easier to figure out the issue. If you could make the code as simple as you can while still illustrating the question, that would help, too.

In the example code you could, for instance, have the math operation sandwiched between just two generator layers (I am not sure what these are, from your description …), if such a setup will also illustrate the issue.

Yes of course, my apologies!

My generator code is as follows:

class Generator(nn.Module):
    def __init__(self, ngpu, b_val):
        super(Generator, self).__init__()
        self.ngpu = ngpu
        self.main = nn.Sequential(
            # input is Z, going into a convolution
            nn.ConvTranspose2d( nz, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),
            # state size. (ngf*8) x 4 x 4
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
            # state size. (ngf*4) x 8 x 8
            nn.ConvTranspose2d( ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(True),
            # state size. (ngf*2) x 16 x 16
            nn.ConvTranspose2d( ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf),
            ShiftLayer(b_val),
            nn.ReLU(True),
            # state size. (ngf) x 32 x 32
            nn.ConvTranspose2d( ngf, nc, 4, 2, 1, bias=False),
            nn.Tanh()
            # state size. (nc) x 64 x 64
        )
    
    def forward(self, input):
        return self.main(input)

The layer called ShiftLayer is defined as:

class ShiftLayer(nn.Module):
    def __init__(self, b):
        super().__init__()
        self.b = nn.Parameter(torch.tensor(1.))
        
    def forward(self, x):
        return x + self.b

During training, I have an optimizer for the generator. Since b is a parameter defined within the generator, when I execute

optimizerG = optim.Adam(generator.parameters(), lr=lr, betas=(beta1, 0.999))

b is included in generator.parameters(). I want to create a separate optimizer just for b, so that I can do something like:

optimizerb = optim.SGD(ShiftLayer.parameters(), . . .)

How can I do this?

(Sorry about the long post, I am not sure how to make it more concise without getting too abstract)

@ptrblck sorry for the tag, would you have any idea?

Your ShiftLayer code just ignores the value of b passed in to its constructor. Is this what you want to do?

No, I want to create an optimizer for b only, while still using ShiftLayer in my generator.

As your code currently is:

class ShiftLayer(nn.Module):
    def __init__(self, b):
        super().__init__()
        self.b = nn.Parameter(torch.tensor(1.))
        
    def forward(self, x):
        return x + self.b

the value of b passed in to __init__ is not used at all. So it is as good as not passing in this value. Perhaps you meant to say

self.b = nn.Parameter(torch.tensor(b))

?

What are some valid values for nz, ngf, and nc? If you tell me these, I could try figuring out how to optimize for b.

nz = 100
ngf = 64
nc = 3

I am just trying to create a built-in optimizer for b using torch.optim. From the feedback in the previous posts, for b to be optimized I had to add self.b = nn.Parameter(torch.tensor(b)) to my definition of ShiftLayer().

As my code is right now, b is being passed as a parameter when I instantiate my generator and then call

optimizerG = optim.Adam(generator.parameters(), lr=lr, betas=(beta1, 0.999))

All I’m trying to do is, instead of optimizing b through the Generator’s optimizer, give b its own optimizer created with torch.optim.

Hope that makes sense.

See if passing b_param, retrieved as follows, to the optimizer works:

x = torch.randn(1,10) # Dummy value!
ngpu = True # Dummy value!
nz = 100
ngf = 64
nc = 3
myGen = Generator(True, x)
shiftLayer = myGen.main[11] # I got the 11 by looking at the output of print(myGen) 
_, b_param = next(shiftLayer.named_parameters())
print(b_param)
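
If that prints the parameter, then wrapping b_param in a list and handing it to the optimizer should be enough. Something along these lines (the learning rate is just a placeholder, and optim is torch.optim as in your earlier snippet):

# b_param is a single nn.Parameter, so pass it inside a list
optimizer_b = optim.SGD([b_param], lr=0.01)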

Seems like I am getting an error:

  File "mnist_film.py", line 299, in <module>
    _, b_param = next(shiftLayer.named_parameters())
StopIteration

StopIteration means there are no more things for next to return from its argument.

Are you running this line within a loop? This won’t work the second time, since shiftLayer has only one named parameter.

Edited to add: If you need to have this line within a loop, you can replace it with

_, b_param = list(shiftLayer.named_parameters())[0]

Nope, it is not inside a loop! I think for some reason shiftLayer.named_parameters() is not returning anything.

Edit: I printed all of the parameters of the generator object and it seems like b is not considered a parameter. Confused about what’s going on.

I don’t know why this would happen, as b is returned as a parameter in all of these implementations:

import torch
import torch.nn as nn

class ShiftLayer(nn.Module):
    def __init__(self, b):
        super().__init__()
        # registering b as an nn.Parameter makes it show up in parameters()
        self.b = nn.Parameter(torch.tensor(b))
        
    def forward(self, x):
        return x + self.b
    
shift = ShiftLayer(1.)
print(list(shift.parameters()))
print(dict(shift.named_parameters()))

model = nn.Sequential(
    nn.Linear(1, 1),
    ShiftLayer(2.)
)
print(list(model.parameters()))
print(dict(model.named_parameters()))



class Generator(nn.Module):
    def __init__(self, b_val):
        super(Generator, self).__init__()
        self.main = nn.Sequential(
            # input is Z, going into a convolution
            nn.ConvTranspose2d( nz, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),
            # state size. (ngf*8) x 4 x 4
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
            # state size. (ngf*4) x 8 x 8
            nn.ConvTranspose2d( ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(True),
            # state size. (ngf*2) x 16 x 16
            nn.ConvTranspose2d( ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf),
            ShiftLayer(b_val),
            nn.ReLU(True),
            # state size. (ngf) x 32 x 32
            nn.ConvTranspose2d( ngf, nc, 4, 2, 1, bias=False),
            nn.Tanh()
            # state size. (nc) x 64 x 64
        )
    
    def forward(self, input):
        return self.main(input)

nz = 100
ngf = 64
nc = 3
gen = Generator(10.)

for name, param in gen.named_parameters():
    if 'b' in name and 'bias' not in name:
        print(name, param)
> main.11.b Parameter containing:
  tensor(10., requires_grad=True)
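
Building on that name filter, one possible way (just a sketch; the learning rates and optimizer choices here are placeholders) to split the parameters so that b gets its own optimizer while everything else stays with the generator optimizer:

import torch.optim as optim

# 'main.11.b' is the only parameter whose name ends in '.b';
# the BatchNorm biases end in '.bias', so they stay in the other group
shift_params = [p for n, p in gen.named_parameters() if n.endswith('.b')]
other_params = [p for n, p in gen.named_parameters() if not n.endswith('.b')]

optimizerG = optim.Adam(other_params, lr=0.0002, betas=(0.5, 0.999))
optimizer_b = optim.SGD(shift_params, lr=0.01)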

That’s what I am thinking. I am going to double-check my code again; maybe I left something there by mistake like last time.

Hey @gphilip, it seems like I’ve fixed my issue. This is my final solution:

shift_layer = generator.main[11]
optimizer_b = optim.SGD(shift_layer.parameters(), lr = ...)

and in the training loop I just call optimizer_b.step() for every epoch.

So far this seems to be working! Thanks to you both for your help. Appreciate it!
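
One small thing to watch out for with this setup: generator.parameters() still contains the ShiftLayer’s b, so if optimizerG is built from it as before, optimizerG.step() will also update b. If the intention is for b to be trained only by optimizer_b, it may be cleaner to leave it out of the generator optimizer, for example (a sketch; the SGD learning rate is a placeholder):

shift_layer = generator.main[11]
shift_param_ids = {id(p) for p in shift_layer.parameters()}

# generator optimizer over everything except the ShiftLayer's parameter
optimizerG = optim.Adam(
    [p for p in generator.parameters() if id(p) not in shift_param_ids],
    lr=lr, betas=(beta1, 0.999))

# separate optimizer that only updates b
optimizer_b = optim.SGD(shift_layer.parameters(), lr=0.01)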
