Access weights of a specific module in nn.Sequential()

Hi,

this should be a quick one, but I wasn’t able to figure it out myself.
When I use a pre-defined module in PyTorch, I can typically access its weights fairly easily.
However, how do I access them if I wrapped the module in nn.Sequential() first?
Please see toy example below.

class My_Model_1(nn.Module):
    def __init__(self,D_in,D_out):
        super(My_Model_1, self).__init__()
        self.layer = nn.Linear(D_in,D_out)
    def forward(self,x):
        out = self.layer(x)
        return out

class My_Model_2(nn.Module):
    def __init__(self,D_in,D_out):
        super(My_Model_2, self).__init__()
        self.layer = nn.Sequential(nn.Linear(D_in,D_out))
    def forward(self,x):
        out = self.layer(x)
        return out

model_1 = My_Model_1(10,10)
print(model_1.layer.weight)
model_2 = My_Model_2(10,10)
# How do I print the weights now?
# model_2.layer.0.weight doesn't work.

Many thanks.
Any help much appreciated.

model_2.layer[0].weight

Hi! Many thanks, this is what I was looking for. I was trying the wrong brackets.

When I do this, the error I get is "The model has no attribute ‘layer’".

@Vaijenath_Biradar
layer was defined in __init__:

self.layer = nn.Sequential(nn.Linear(D_in,D_out))

You have to use the variable name defined in your model.
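
For reference, a minimal sketch using the My_Model_2 definition from above: nn.Sequential supports integer indexing, so the wrapped Linear layer is reachable like this.

model_2 = My_Model_2(10, 10)

# Index into the Sequential container to reach the inner Linear layer
print(model_2.layer[0].weight)
print(model_2.layer[0].bias)

# The same tensor also shows up in the state_dict under the key "layer.0.weight"
print(model_2.state_dict()["layer.0.weight"])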


Hi,

Is there any way in PyTorch to get access to the layers of a model and the weights in each layer without typing the layer name? Something like model.layers in Keras, which is discussed in the following:

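In case it helps, a minimal sketch of one way to do this without hard-coding attribute names: named_parameters() and children() iterate over everything registered on the module (model here is a placeholder for any nn.Module).

# Iterate over all parameters together with their qualified names
for name, param in model.named_parameters():
    print(name, param.shape)

# Or iterate over the immediate submodules, roughly analogous to model.layers in Keras
for layer in model.children():
    print(layer)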

When I use this method with model.eval(), I get different values of the weights for the same example.
Does that mean that my model doesn't work correctly?

Hi ptrblck

Happy to find you here.
I am building 2 CNN layers with 3 FC layers and using dropout twice.
My neural network is defined as follows; do you see anything wrong with it? I appreciate your feedback.

import torch
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import TensorDataset, DataLoader
import torch.optim as optim
import torch.nn as nn
from torch.utils.data.dataset import random_split
from torch.nn import functional as F
import matplotlib.pyplot as plt
from torch.autograd import Variable

class ConvNetRedo1(nn.Module):
    def __init__(self, numf1, numf2, fz1, fz2, nn2, nn3):
        # numf1/numf2: number of filters in the first/second conv layer,
        # fz1/fz2: their kernel sizes, nn2/nn3: sizes of the hidden FC layers
        super(ConvNetRedo1, self).__init__()
        self.numf1 = numf1
        self.numf2 = numf2
        self.fz1 = fz1
        self.fz2 = fz2
        self.nn2 = nn2
        self.nn3 = nn3
        self.layer1 = nn.Sequential(
            nn.Conv3d(1, self.numf1, kernel_size=self.fz1, stride=1, padding=2),
            nn.ReLU(),
            nn.MaxPool3d(kernel_size=2, stride=2))
        self.layer2 = nn.Sequential(
            nn.Conv3d(self.numf1, self.numf2, kernel_size=self.fz2, stride=1, padding=2),
            nn.ReLU(),
            nn.MaxPool3d(kernel_size=2, stride=2))
        self.fc1 = nn.Linear(3072, self.nn2)   # fully connected layers
        self.fc2 = nn.Linear(self.nn2, self.nn3)
        self.fc3 = nn.Linear(self.nn3, 1)
        self.relu = nn.ReLU()                  # non-linear ReLU layer: max(0, x)
        self.sigmoid = nn.Sigmoid()
        self.drop_out1 = nn.Dropout(0.5)
        self.drop_out2 = nn.Dropout(0.5)
        self.Relu = nn.LeakyReLU(0.1, inplace=True)

    def forward(self, x):
        x = x.unsqueeze(1).float()
        out = self.layer1(x)
        out = self.layer2(out)
        out = out.view(out.size(0), -1)

        out = self.fc1(out)
        out = self.drop_out1(out)
        out = self.relu(out)

        out = self.fc2(out)
        out = self.drop_out2(out)
        out = self.relu(out)

        out = self.fc3(out)
        out = self.sigmoid(out)
        return out

I'm not sure if the number of features is correctly defined without knowing the input shape, but skimming through the model definition I cannot find any obvious mistakes.
Are you seeing any issues with the model?
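
As a quick sanity check for the in_features of fc1 (the 3072 above), you can pass a dummy batch through the conv layers and print the flattened size; all of the numbers below are just assumed example values.

# Hypothetical constructor arguments and input shape, purely for illustration
model = ConvNetRedo1(numf1=8, numf2=16, fz1=5, fz2=5, nn2=64, nn3=32)
x = torch.randn(2, 16, 16, 16)  # batch of 2 volumes of size 16x16x16

with torch.no_grad():
    out = model.layer1(x.unsqueeze(1).float())  # same preprocessing as in forward()
    out = model.layer2(out)
    flat = out.view(out.size(0), -1)
    print(flat.shape)  # the second dimension has to match fc1's in_features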

Hi
Many thanks for your reply.
The size of the features is fine. I just want to know about the design: is dropout used in the right place? Is it better to apply it after the ReLU?

It doesn't matter whether dropout is applied before or after the ReLU. I cannot see any obvious mistakes.

Many thanks for your reply. Sorry, I am transferring my model, data, and labels to the GPU, and I am not sure if I should transfer the criterion and optimizer to the GPU or not?

I used them in this way

criterion = nn.BCELoss()

optimizer = torch.optim.Adam(model.parameters(), lr=.03)

I appreciate your help

Do you know of any books or links that would be useful?

S

You could transfer the criterion to the GPU just to avoid possible issues, but it shouldn’t be necessary for nn.BCELoss.
One minor suggestion: I would remove the last sigmoid in your model and use nn.BCEWithLogitsLoss instead, as it will be numerically more stable.
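
As a small sketch of what that change looks like (model, inputs, and targets are just placeholders):

criterion = nn.BCEWithLogitsLoss()

logits = model(inputs)                      # raw output of fc3, no sigmoid applied
loss = criterion(logits, targets.float())   # targets in {0, 1}, same shape as logits

# For probabilities at evaluation time, apply the sigmoid explicitly:
probs = torch.sigmoid(logits)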

Check out “Deep Learning with PyTorch” by @lantiga, @elistevens, and @tom, which can be downloaded for free on the official website.
(It’s not the full book if I’m not mistaken, as it’s still work in progress :wink: )

So I should end the model at
out = self.fc3(out)
and use nn.BCEWithLogitsLoss, since it has the sigmoid built in?

def forward(self, x):
    x = x.unsqueeze(1).float()
    out = self.layer1(x)
    out = self.layer2(out)
    out = out.view(out.size(0), -1)

    out = self.fc1(out)
    out = self.drop_out1(out)
    out = self.relu(out)

    out = self.fc2(out)
    out = self.drop_out2(out)
    out = self.relu(out)

    out = self.fc3(out)
    return out

and

criterion = nn.BCEWithLogitsLoss()

So the input of this loss function should indeed be the output of the last linear layer?

Yes, that’s the correct usage.

Dear ptrblck

I am reading the book you suggested now, and I am a bit confused: in some cases during training I have seen optimizer.zero_grad() called before computing the model output, while in this book (lantiga) it is called after computing the output?!
Which one is correct?

t_p = model(t_un, *params)
loss = loss_fn(t_p, t_c)
optimizer.zero_grad()
loss.backward()
optimizer.step()

It depends on your coding style, but it should be called before loss.backward(), unless you explicitly want to accumulate the gradients (which is a valid use case, but not the usual workflow).
I personally try to add it right at the beginning of the training loop, since I remember it better that way, but I still forget it from time to time. :slight_smile:
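
As a minimal sketch of that style, reusing the variables from the book snippet above (n_epochs is just a placeholder):

for epoch in range(n_epochs):
    optimizer.zero_grad()          # clear the gradients right at the start of the iteration
    t_p = model(t_un, *params)
    loss = loss_fn(t_p, t_c)
    loss.backward()                # compute fresh gradients
    optimizer.step()               # update the parameters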

I really appreciate your help :slight_smile:

you are really helping me

Dear ptrblck

I used the approach we discussed, but the outputs are greater than 1 :slight_smile: !!