CNN outputs the same result even with different inputs

Hello,

I am a beginner with neural networks and I am working on a project where I want to process text with a CNN. The input is basically a one-hot encoded tensor, and the output is meant to be a single neuron with values ranging from -1 to 1. Just to illustrate what I am feeding the network: if T is the input tensor, a side-by-side composition of its slices T[k,:,:] after the convolutions is shown at this link:

In that picture, the height corresponds to the input channels and the width is the product of the number of samples and the number of features.

Everything looks fine until the data reaches the perceptron: the output is always a vector with identical entries. Even after changing the convolutional layers, the loss function and the optimization method over and over, I get the same result. The activation function I am using in the perceptron is a sigmoid. The code I wrote for the network architecture is as follows:

class model(nn.Module):
    def __init__(self,channels):
        super(model, self).__init__()
        self.convolutions = nn.Sequential(
            nn.Conv1d(channels, 500, kernel_size=7, stride=1, padding=1),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=3),
            nn.Conv1d(500, 400, kernel_size=7, stride=1, padding=1),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=3),
            nn.Conv1d(400, 300, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.Conv1d(300, 200, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.Conv1d(200, 100, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=3),
            )
        self.perceptron = nn.Sequential(
            nn.Linear(100,50),
            nn.Sigmoid(),
            nn.Linear(50,16),
            nn.Sigmoid(),
            nn.Linear(16,1),
            nn.Sigmoid()
        )
        self.transfer = lambda x,y: nn.Linear(x,y)
    def forward(self, x):
        x = self.convolutions(x)
        x = x.view(x.size(0), -1)
        l = self.transfer(x.shape[1],self.perceptron[0].in_features)
        x = self.perceptron(l(x))
        return x

With all this said, I would like to know whether I am messing something up, either in the theory or in the implementation. Thanks in advance.

In the forward method:

l = self.transfer(x.shape[1], self.perceptron[0].in_features)
x = self.perceptron(l(x))

what you are doing is creating a new layer at each forward pass, so this layer l cannot learn: it is discarded and randomly reinitialized at every step.

The thing is, there is no variable-size nn.Linear layer!
Such a layer has a fixed number of learned parameters, and it would make no sense to remove some of them, or add new ones, at each step during training.
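
As a minimal sketch of the usual pattern (assuming a fixed flattened size, here called flat_features just for illustration), the Linear layer is created once in __init__ so its weights persist across forward passes and get updated by the optimizer:

import torch.nn as nn

class FixedHead(nn.Module):
    def __init__(self, flat_features):   # flat_features is a placeholder name
        super(FixedHead, self).__init__()
        # Created once here, so the same weights are reused and trained at every step
        self.transfer = nn.Linear(flat_features, 100)
        self.perceptron = nn.Sequential(
            nn.Linear(100, 50),
            nn.Sigmoid(),
            nn.Linear(50, 1),
            nn.Sigmoid()
        )
    def forward(self, x):
        x = x.view(x.size(0), -1)
        x = self.transfer(x)              # same layer on every call, so it can learn
        return self.perceptron(x)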

Is your input tensor x of variable size?

Thank you, I really didn’t know about that…

The input tensor x can vary from dataset to dataset, but its dimensions stay constant within a single run. Because of this, I was able to remove the l layer from the class entirely and instead set the input dimension of the first linear layer of the perceptron to the product of the last kernel size of the convolutional block and the number of features. It seems to work, but the network keeps getting stuck around the same values.

I also tried running the same code on a different dataset and I still get the same results.

The code now is:

class model(nn.Module):
    def __init__(self,c_channels,char_size):
        super(model, self).__init__()
        self.last_kernel = 3
        self.p_channels = self.last_kernel*char_size
        self.convolutions = nn.Sequential(
            nn.Conv1d(c_channels, 500, kernel_size=7, stride=1, padding=1),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=3),
            nn.Conv1d(500, 400, kernel_size=7, stride=1, padding=1),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=3),
            nn.Conv1d(400, 300, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.Conv1d(300, 200, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.Conv1d(200, 100, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=self.last_kernel),
            )
        self.perceptron = nn.Sequential(
            nn.Linear(self.p_channels,50),
            nn.Sigmoid(),
            nn.Linear(50,16),
            nn.Sigmoid(),
            nn.Linear(16,1),
            nn.Sigmoid()
        )
        #self.transfer = lambda x,y: nn.Linear(x,y)
    def forward(self, x):
        x = self.convolutions(x)
        x = x.view(x.size(0), -1)
        print(x.shape)
        #l = self.transfer(x.shape[1],self.perceptron[0].in_features)
        #x = self.perceptron(l(x))
        x = self.perceptron(x)
        return x
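
To make sure p_channels really matches what the convolutions produce, a quick check like this can be run first (the sizes below are just placeholders, not my actual data):

import torch

# made-up sizes only for the check; replace with the real ones
c_channels, char_size, length = 70, 256, 1014
net = model(c_channels, char_size)

with torch.no_grad():
    dummy = torch.zeros(1, c_channels, length)
    flat = net.convolutions(dummy).view(1, -1)

print(flat.shape[1], net.p_channels)  # these two numbers should match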

Have you fixed your problem? I am facing the same problem now; could you give me some advice?