Dimension problems in 1dcnn

Sohaib_Mian · November 30, 2018, 3:52pm

Hi all,

I am trying to create a 1D convolutional neural network for an NLP problem. I have implemented the same 1D CNN successfully with Keras; however, I just can't seem to do it with PyTorch.

Why does Conv1d require a 3D input? How can I give it a 2D input?
MaxPool1d does not reduce the input from 3D to 2D; it only decreases the size of one dimension. Why?
What is the difference between torch.nn and torch.nn.functional?
When I use the functional API, an error asks for tensor parameters (the inputs are tensors).

I am attaching my code below; any help will be appreciated:

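import pandas as pd
import torch
import torch.nn.functional as F
from torch.autograd import Variable
from sklearn.model_selection import train_test_split

# X, Y and RANDOM_STATE are defined earlier (not shown)
Xtrain, Xtest, Ytrain, Ytest = train_test_split(X, Y, test_size=0.1, random_state=RANDOM_STATE)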

df=pd.DataFrame(Xtrain)
de=pd.DataFrame(Ytrain)
dataset_x=df.values
dataset_y=de.values

x_data=torch.from_numpy(dataset_x)
y_data=torch.from_numpy(dataset_y)

print(x_data.shape)
print(y_data.shape)
print(x_data)
print(y_data)

torch.Size([90, 16])
torch.Size([90, 10])
tensor([[ 0,  0,  0,  ..., 27, 25, 35],
        [ 0,  0,  0,  ..., 20,  5,  7],
        [ 0,  0,  0,  ...,  7,  3,  4],
        ...,
        [ 0,  0,  0,  ..., 73, 41, 24],
        [ 0,  0,  0,  ...,  9, 41, 24],
        [ 0,  0,  0,  ..., 81, 48, 26]], dtype=torch.int32)
tensor([[0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 1., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 1., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 1., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 1., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
        [1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
        [0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 1., 0., 0., 0.],
        [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
        [1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 1., 0., 0., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
        [0., 0., 0., 1., 0., 0., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 1., 0., 0.],
        [1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 1., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 1., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 1., 0., 0., 0.],
        [0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.]])

class Model(torch.nn.Module):
    def __init__(self, vocab_size, embed_size):
        super(Model, self).__init__()
        self.embeddings = torch.nn.Embedding(vocab_size, embed_size)
        self.drop_out1 = torch.nn.Dropout(0.05)
        self.conv1 = torch.nn.Conv1d(16, 90, 2)
        self.drop_out2 = torch.nn.Dropout(0.05)
        self.conv2 = torch.nn.Conv1d(90, 20, 2)
        self.maxpool1 = torch.nn.MaxPool1d(2)
        self.l1 = torch.nn.Linear(49, 14)
        self.l2 = torch.nn.Linear(14, 12)
        self.l3 = torch.nn.Linear(12, 10)

    def forward(self, x):
        x = self.embeddings(x)
        print(x.shape)
        x = self.drop_out1(x)
        x = F.relu(self.conv1(x))
        print(x.shape)
        x = self.drop_out2(x)
        x = F.relu(self.conv2(x))
        print(x.shape)
        x = self.maxpool1(x)
        print(x.shape)
        x = F.relu(self.l1(x))
        x = F.relu(self.l2(x))

        return F.sigmoid(self.l3(x))


model = Model(VOCAB_SIZE, EMBED_SIZE)

criterion = torch.nn.BCELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

for epoch in range(1):
    train = Variable(x_data).long()
    labels = Variable(y_data)

    optimizer.zero_grad()
    y_pred = model(train)
    print(y_pred.shape)
    print(y_pred[1])
    loss = criterion(y_pred, labels)

    loss.backward()
    optimizer.step()

    print("loss: ", loss)

ptrblck · November 30, 2018, 4:42pm

nn.Conv1d expects a 3-dimensional input of shape [batch_size, channels, length].
Each kernel has a fixed kernel size and is shifted along the temporal dimension (length in this example).
For each window, all input channels are used. This is similar to the vanilla Conv2d operation, where each kernel is moved across the spatial dimensions (height and width) and uses all input channels in the default setup.
If you only have a 2-dimensional input, I assume you are using a single channel.
In that case, you can add the channel dimension and use your conv layer directly:

x_data = x_data.unsqueeze(1)

Currently, your first conv layer expects an input with 16 channels.
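To make the shape requirement concrete, here is a minimal sketch. All sizes and layer widths are made up for illustration and are not taken from the question. It shows the single-channel case via unsqueeze(1) and, as an assumption about a common NLP arrangement, an embedding output permuted so that the embedding dimension acts as the channel dimension.

```python
import torch
import torch.nn as nn

# Hypothetical sizes, chosen only for illustration.
batch_size, seq_len, vocab_size, embed_size = 90, 16, 100, 32

# Case 1: a 2-D float input is treated as a single channel.
x = torch.randn(batch_size, seq_len)          # [90, 16]
x = x.unsqueeze(1)                            # [90, 1, 16]
conv_single = nn.Conv1d(in_channels=1, out_channels=8, kernel_size=2)
out_single = conv_single(x)
print(out_single.shape)                       # torch.Size([90, 8, 15])

# Case 2 (assumption: token indices are fed through an embedding layer).
tokens = torch.randint(0, vocab_size, (batch_size, seq_len))
embedded = nn.Embedding(vocab_size, embed_size)(tokens)   # [90, 16, 32]
embedded = embedded.permute(0, 2, 1)                      # [90, 32, 16]
conv_embed = nn.Conv1d(in_channels=embed_size, out_channels=8, kernel_size=2)
out_embed = conv_embed(embedded)
print(out_embed.shape)                        # torch.Size([90, 8, 15])
```

With stride 1 and no padding, the output length is length - kernel_size + 1, which is why 16 becomes 15 in both cases.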

Max pooling is not supposed to reduce the number of dimensions. It slides a window over the input and returns the maximum value inside each window. Here is a small example:

import torch
import torch.nn as nn

x = torch.arange(10).float().unsqueeze(0).unsqueeze(1)  # Add batch and channel dims
pool = nn.MaxPool1d(kernel_size=2, stride=2)
output = pool(x)
print('Input: ', x)
> Input:  tensor([[[0., 1., 2., 3., 4., 5., 6., 7., 8., 9.]]])
print('Output: ', output)
> Output:  tensor([[[1., 3., 5., 7., 9.]]])

As you can see, we are using a kernel size of 2 and a stride of 2, which means the kernel windows are not overlapping.
In each window, the maximal value will be pooled.
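If a 2-dimensional tensor is needed after pooling (for example, before a linear layer), the reduction has to be done explicitly. The sketch below shows two common options, flattening and global max pooling. The sizes and the final linear layer are hypothetical, and which option fits depends on the model.

```python
import torch
import torch.nn as nn

# Hypothetical activations of shape [batch_size, channels, length].
x = torch.randn(90, 20, 98)

# MaxPool1d only shortens the length dimension; stride defaults to kernel_size.
pool = nn.MaxPool1d(kernel_size=2)
pooled = pool(x)
print(pooled.shape)      # torch.Size([90, 20, 49])

# Option 1: flatten channels and length into one feature dimension.
flat = pooled.flatten(start_dim=1)
print(flat.shape)        # torch.Size([90, 980])

# Option 2: global max pooling, i.e. take the max over the whole length.
global_max = x.max(dim=2)[0]
print(global_max.shape)  # torch.Size([90, 20])

# Either 2-D result can be passed to a linear layer with matching in_features.
fc = nn.Linear(flat.shape[1], 10)
logits = fc(flat)
print(logits.shape)      # torch.Size([90, 10])
```

Note that nn.Linear acts only on the last dimension, so applying it to a 3-dimensional tensor keeps the channel dimension in the output.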

You can use layers defined as classes, such as nn.Conv1d, which hold all of their parameters as members, or alternatively use the functional API (torch.nn.functional), which is stateless, so you have to create and store all parameters yourself. Sometimes you need the flexibility the functional API provides, but for most use cases the modules should work just fine.
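A minimal sketch of the difference, with made-up sizes: the module creates and owns its weight and bias, while the functional call takes them as tensor arguments.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(4, 1, 16)  # [batch_size, channels, length], hypothetical sizes

# Module API: the layer creates and stores its own weight and bias.
conv = nn.Conv1d(in_channels=1, out_channels=8, kernel_size=2)
out_module = conv(x)

# Functional API: weight and bias are passed in as tensors on every call.
# The weight has shape [out_channels, in_channels, kernel_size].
weight = torch.randn(8, 1, 2, requires_grad=True)
bias = torch.zeros(8, requires_grad=True)
out_functional = F.conv1d(x, weight, bias)
print(out_functional.shape)  # torch.Size([4, 8, 15])

# Reusing the module's parameters gives the same result as calling the module.
same = torch.allclose(out_module, F.conv1d(x, conv.weight, conv.bias))
print(same)  # True
```

Passing plain integers such as channel counts to F.conv1d, the way one would to the nn.Conv1d constructor, is a likely cause of an error asking for tensor arguments, although the thread does not show the failing call.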

As a small side note: Variables have been deprecated since 0.4.0. If you are using a newer PyTorch version, you can simply remove the Variable wrappers. The install instructions are on the PyTorch website.
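For illustration, a single training step without Variable might look like the sketch below. The data is random and the one-layer model is only a stand-in (both are assumptions, not the model from the question); the point is that plain tensors go straight into the model and the loss.

```python
import torch
import torch.nn as nn

# Random stand-in data with the shapes printed in the question.
x_data = torch.randn(90, 16)
y_data = torch.zeros(90, 10)

# Stand-in model (hypothetical), not the CNN from the question.
model = nn.Sequential(nn.Linear(16, 10), nn.Sigmoid())
criterion = nn.BCELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

# No Variable wrappers: tensors are used directly.
optimizer.zero_grad()
y_pred = model(x_data)
loss = criterion(y_pred, y_data)
loss.backward()
optimizer.step()

print("loss:", loss.item())
```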