Net structure is confusing

vivivo · December 23, 2018, 3:27am

Hi, guys. I ran into a problem in PyTorch. My code looks like this:

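import torch.nn as nn
from torch.nn.utils import weight_norm

# Chomp1d is a custom layer; its implementation is not shown in this post.
class Net(nn.Module):
    def __init__(self, n_inputs, n_outputs, kernel_size, stride, dilation, padding, dropout=0.2):
        super().__init__()  # required before assigning submodules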
        self.conv1 = weight_norm(nn.Conv1d(n_inputs, n_outputs, kernel_size, stride=stride, padding=padding, dilation=dilation, bias=False))
        self.chomp1 = Chomp1d(padding)
        self.bn1 = nn.BatchNorm1d(n_outputs)
        self.relu1 = nn.ReLU()
        self.dropout1 = nn.Dropout(dropout)
        self.conv2 = weight_norm(nn.Conv1d(n_outputs, n_outputs, kernel_size, stride=stride, padding=padding, dilation=dilation, bias=False))
        self.chomp2 = Chomp1d(padding)
        self.bn2 = nn.BatchNorm1d(n_outputs)
        self.relu2 = nn.ReLU()
        self.dropout2 = nn.Dropout(dropout)    

        self.net = nn.Sequential(
            self.conv1, self.chomp1, self.bn1, self.relu1, self.dropout1,
            self.conv2, self.chomp2, self.bn2,
            self.relu2, self.dropout2)
        self.downsample = nn.Conv1d(n_inputs, n_outputs, 1) if n_inputs != n_outputs else None
        self.relu = nn.ReLU()
        self.init_weight()

    def init_weight(self):
        self.conv1.weight.data.normal_(0, 0.01)
        self.conv2.weight.data.normal_(0, 0.01)
        if self.downsample is not None:
            self.downsample.weight.data.normal_(0, 0.01)

    def forward(self, x):
        out = self.net(x)
        res = x if self.downsample is None else self.downsample(x)
        return self.relu(out+res)

Then I removed the Sequential and changed the forward function like this:

def forward(self, x):
    out = self.dropout1(self.relu1(self.bn1(self.chomp1(self.conv1(x)))))
    out = self.dropout2(self.relu2(self.bn2(self.chomp2(self.conv2(out)))))
    res = x if self.downsample is None else self.downsample(x)
    return self.relu(out + res)

Then something went wrong: the training accuracy dropped a lot, and the model size became very small. I just can’t figure it out. Thanks.

smth · December 23, 2018, 4:23am

Did your training accuracy drop a lot after you removed the Sequential?

I looked at your code and didn’t see anything obvious.
One thing to check is whether the number of parameters in your model is the same before and after the change. You can count them with:

num_params = 0
for p in model.parameters():
    num_params += p.numel()
print(num_params)
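
When the totals differ, a per-module breakdown can help locate the layer responsible. The following is a minimal sketch, not code from the thread: `TinyBlock` and `count_by_module` are hypothetical names, and the layer sizes are arbitrary. It also illustrates that `model.parameters()` yields each parameter once, so registering a layer both as an attribute and inside an `nn.Sequential` does not inflate the total.

```python
from collections import OrderedDict

import torch.nn as nn


def count_by_module(model: nn.Module) -> "OrderedDict[str, int]":
    """Return the parameter count of each direct child module (hypothetical helper)."""
    counts = OrderedDict()
    for name, child in model.named_children():
        counts[name] = sum(p.numel() for p in child.parameters())
    return counts


class TinyBlock(nn.Module):
    """Illustrative stand-in for the posted block; sizes are arbitrary assumptions."""

    def __init__(self, n_inputs=4, n_outputs=8, kernel_size=3):
        super().__init__()
        self.conv1 = nn.Conv1d(n_inputs, n_outputs, kernel_size, bias=False)
        self.bn1 = nn.BatchNorm1d(n_outputs)
        # The same layer objects are also registered inside the Sequential.
        self.net = nn.Sequential(self.conv1, self.bn1)


model = TinyBlock()
# parameters() skips duplicates, so shared layers are counted once.
total = sum(p.numel() for p in model.parameters())
per_module = count_by_module(model)

print(total)
for name, n in per_module.items():
    print(name, n)
```

Running the same breakdown on the model before and after a change shows which child module gained or lost parameters.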

vivivo · December 23, 2018, 4:38am

Thanks for your reply. I checked the model parameters: the total is 65025 before the change and 2736 after. And yes, my training accuracy dropped a lot and got stuck at 0.502 after I removed the Sequential.

ptrblck · December 23, 2018, 2:45pm

I tried to reproduce the error using some random values as model arguments, but couldn’t.
Also, I had to remove the chomp layers, since their implementation is missing. Could you post the code for these layers and the arguments you are using?
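
A reproduction attempt along these lines might look like the sketch below. It is an assumption-laden simplification rather than the code actually used in the thread: the chomp layers and `weight_norm` are left out, only one conv stage is kept, the padding is chosen so the sequence length is preserved, and the names `Block`, `forward_sequential`, and `forward_explicit` are hypothetical. It checks that the `nn.Sequential` path and the explicit call chain produce the same output.

```python
import torch
import torch.nn as nn


class Block(nn.Module):
    """Simplified residual block (no chomp layers, no weight_norm); sizes are arbitrary."""

    def __init__(self, n_inputs=4, n_outputs=8, kernel_size=3, dropout=0.2):
        super().__init__()
        padding = (kernel_size - 1) // 2  # keeps the sequence length unchanged
        self.conv1 = nn.Conv1d(n_inputs, n_outputs, kernel_size,
                               padding=padding, bias=False)
        self.bn1 = nn.BatchNorm1d(n_outputs)
        self.relu1 = nn.ReLU()
        self.dropout1 = nn.Dropout(dropout)
        self.net = nn.Sequential(self.conv1, self.bn1, self.relu1, self.dropout1)
        self.downsample = nn.Conv1d(n_inputs, n_outputs, 1) if n_inputs != n_outputs else None
        self.relu = nn.ReLU()

    def forward_sequential(self, x):
        res = x if self.downsample is None else self.downsample(x)
        return self.relu(self.net(x) + res)

    def forward_explicit(self, x):
        out = self.dropout1(self.relu1(self.bn1(self.conv1(x))))
        res = x if self.downsample is None else self.downsample(x)
        return self.relu(out + res)


torch.manual_seed(0)
block = Block().eval()  # eval() disables dropout and uses running batch-norm statistics
x = torch.randn(2, 4, 16)  # (batch, channels, length)
with torch.no_grad():
    same = torch.allclose(block.forward_sequential(x), block.forward_explicit(x))
print(same)
```

If the two paths agree, the drop in accuracy is more likely caused by the arguments passed to the model than by removing the Sequential.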

vivivo · December 24, 2018, 2:25am

Thank you. I found what was wrong. After smth’s reply, I carefully checked my model and code. It was just a hyperparameter problem: I had given the wrong nhid size, lol. Anyway, thanks for your help. This topic can be closed.
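
To illustrate why a wrong hidden size shows up so clearly in the parameter count, here is a small sketch. The helper `conv_stack_params` and the sizes used are hypothetical and are not the values from this thread. For two bias-free `Conv1d` layers, the count is `n_inputs * nhid * kernel_size + nhid * nhid * kernel_size`, so it grows roughly quadratically with `nhid`.

```python
import torch.nn as nn


def conv_stack_params(n_inputs: int, nhid: int, kernel_size: int) -> int:
    """Count the parameters of two stacked bias-free Conv1d layers (hypothetical helper)."""
    stack = nn.Sequential(
        nn.Conv1d(n_inputs, nhid, kernel_size, bias=False),
        nn.Conv1d(nhid, nhid, kernel_size, bias=False),
    )
    return sum(p.numel() for p in stack.parameters())


# Illustrative sizes only; they are assumptions, not the poster's settings.
small = conv_stack_params(n_inputs=1, nhid=8, kernel_size=3)
large = conv_stack_params(n_inputs=1, nhid=32, kernel_size=3)
print(small, large)  # 216 3168
```

A large gap between two parameter totals, as reported above, is therefore a good hint to compare the constructor arguments first.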