Converting Tensorflow code to Pytorch help

Tupac · January 8, 2020, 6:50pm

I have this simple tensorflow code block, what is the equivalent in pytorch? I am stuck trying to code it. I have encountered multiple Runtime errors, due to the dimensions.
This is the tensorflow code:

conv1 = tf.nn.conv1d(x,f1,stride=1,padding="VALID")
conv1 = tf.nn.bias_add(conv1, b1)
conv1 = tf.nn.sigmoid(conv1)

p1 = tf.layers.average_pooling1d(conv1, pool_size=2, strides=2, padding='VALID')

conv2 = tf.nn.conv1d(p1,f2,stride=1,padding="VALID")
conv2 = tf.nn.bias_add(conv2, b2)
conv2 = tf.nn.sigmoid(conv2)
p2 = tf.layers.average_pooling1d(conv2, pool_size=2, strides=2, padding='VALID')

nn = tf.layers.Flatten()(p2)

fc1 = tf.add(tf.matmul(nn, n1), b3)
fc1 = tf.nn.sigmoid(fc1)
out = tf.add(tf.matmul(fc1, n2), b4)
out = tf.nn.softmax(out)

How can I implement the same in pytorch? The code below is what I tried, but the dimensions are messed up, I think due to the channel inputs.

class TwoLayerNet(torch.nn.Module):
    def __init__(self):
        super(TwoLayerNet,self).__init__()
        self.conv1 = nn.Sequential(
            nn.Conv2d(1,3, 3*8, kernel_size=5, stride=1),  
            nn.Sigmoid(),
            nn.AvgPool1d(kernel_size=2, stride=0))
        self.conv2 = nn.Sequential(
            nn.Conv1d(3*8, 12, kernel_size=5, stride=1),
            nn.Sigmoid(),
            nn.AvgPool1d(kernel_size=2, stride = 0))
        #self.drop_out = nn.Dropout()

        self.fc1 = nn.Linear(708, 732) 
        self.fc2 = nn.Linear(732, 4)

    def forward(self, x):
        out = self.conv1(x)
        out = self.conv2(out)
        out = out.reshape(out.size(0), -1)
        out = self.drop_out(out)
        out = self.fc1(out)
        out = self.fc2(out)
        return out

albanD · January 8, 2020, 7:50pm

Hi,

You seem to be mixing conv1d and conv2d. Also your conv2d definition has too many arguments I think.

Tupac · January 8, 2020, 8:05pm

Should it only be conv1d? Because I used all conv1d, but there was an error since the input was 4D

albanD · January 8, 2020, 8:07pm

If your input is of size [708, 256, 3], then you do want conv1d, just like in the TensorFlow code.
Also, the stride arguments for the pooling are not the same…

Tupac · January 8, 2020, 8:09pm

Could you please guide me on editing the code I posted? I’m really lost…

albanD · January 8, 2020, 8:11pm

The tensorflow code has strides=2 while the pytorch version has nn.AvgPool1d(kernel_size=2, stride=0). So change it to stride=2?
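
To illustrate the stride point, here is a minimal sketch. The tensor sizes are made up for the example and are not taken from your data. With kernel_size=2 and stride=2, nn.AvgPool1d halves the width, which matches pool_size=2, strides=2, padding='VALID' on the TensorFlow side:

```python
import torch
import torch.nn as nn

# Illustrative input in PyTorch layout [batch, channels, width]; sizes are assumptions.
x = torch.randn(4, 3, 252)

# Non-overlapping windows of 2, so the width is halved.
pool = nn.AvgPool1d(kernel_size=2, stride=2)
pooled = pool(x)
print(pooled.shape)  # torch.Size([4, 3, 126])

# Each output value is the mean of two neighbouring inputs.
manual = (x[:, :, 0::2] + x[:, :, 1::2]) / 2
print(torch.allclose(pooled, manual))
```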

Tupac · January 8, 2020, 8:14pm

Ok thanks, but I’m still confused about the dimensions for the conv1d: what should the input and output numbers be? And for the linear part too. I really appreciate your help on this!

albanD · January 8, 2020, 8:17pm

I think your main problem is that the TensorFlow input has size [708, 256, 3], laid out as [batch, width, channels], while PyTorch expects [batch, channels, width].
So you need to be careful to give an input of size [708, 3, 256] to the PyTorch model. You should check the docs for these functions to make sure you’re doing the right thing.
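
As a rough sketch of the layout change (the random tensor is a stand-in for your data, and out_channels=24 is only a guess based on the 3*8 in your first attempt), swapping the last two axes with permute gives nn.Conv1d what it expects:

```python
import torch
import torch.nn as nn

# Stand-in for data stored the TensorFlow way: [batch, width, channels].
x_tf_layout = torch.randn(708, 256, 3)

# Swap the last two axes to get [batch, channels, width] for nn.Conv1d.
x_torch_layout = x_tf_layout.permute(0, 2, 1).contiguous()
print(x_torch_layout.shape)  # torch.Size([708, 3, 256])

# in_channels must equal the size of dim 1 (3 here).
# out_channels=24 is an assumed filter count, not known from the thread.
conv = nn.Conv1d(in_channels=3, out_channels=24, kernel_size=5, stride=1)
conv_out = conv(x_torch_layout)
print(conv_out.shape)  # torch.Size([708, 24, 252])
```

Note that permute reorders the axes, whereas view or reshape would only reinterpret the same memory and mix values from different channels.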

Tupac · January 8, 2020, 8:24pm

Thank you so much. I have updated my code, I have a new error:

class CNN(torch.nn.Module):
    def __init__(self):
        super(CNN,self).__init__()
        self.conv1 = nn.Sequential(
            nn.Conv1d(708, 252, kernel_size=5, stride=1),  
            nn.Sigmoid(),
            nn.AvgPool1d(kernel_size=2, stride=2))
        self.conv2 = nn.Sequential(
            nn.Conv1d(252, 12, kernel_size=5, stride=1),
            nn.Sigmoid(),
            nn.AvgPool1d(kernel_size=2, stride=2))

        self.fc1 = nn.Linear(61, 12) 
        self.fc2 = nn.Linear(12, 4)

    def forward(self, x):
        out = self.conv1(x)
        out = self.conv2(out)
        out = self.fc1(out)
        out = self.fc2(out)
        return out

Error:
---> 24 loss_train = criterion(output_train, ytrain)
ValueError: Expected input batch_size (3) to match target batch_size (708).

These are the shapes. I am comparing them for the Cross Entropy Loss:
ytrain_shape torch.Size([708, 4])
Outputtrain shape torch.Size([3, 12, 4])

albanD · January 8, 2020, 8:32pm

You are missing the operation that corresponds to nn = tf.layers.Flatten()(p2).
You want to add out = out.view(out.size(0), -1) in your forward, between the last convolution and the first fully connected layer.
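
The flatten keeps the batch dimension and merges channels and width, so the in_features of the first nn.Linear has to equal channels * width after the last pooling layer. Here is a small sketch of that arithmetic, assuming an input width of 256 and the 12 output channels from your second conv (conv_out_len is a helper name made up for this example):

```python
def conv_out_len(length, kernel_size, stride=1):
    # Output width of an unpadded ("VALID") 1-D convolution or pooling layer.
    return (length - kernel_size) // stride + 1

width = 256
width = conv_out_len(width, kernel_size=5)            # conv1 -> 252
width = conv_out_len(width, kernel_size=2, stride=2)  # pool1 -> 126
width = conv_out_len(width, kernel_size=5)            # conv2 -> 122
width = conv_out_len(width, kernel_size=2, stride=2)  # pool2 -> 61

channels = 12  # out_channels of the second conv in the posted code
flat_features = channels * width
print(flat_features)  # 732
```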

Tupac · January 8, 2020, 9:12pm

Ok, I have updated the block as below:

class CNN(torch.nn.Module):
    def __init__(self):
        super(CNN,self).__init__()
        self.conv1 = nn.Sequential(
            nn.Conv1d(708, 252, kernel_size=5, stride=1),  
            nn.Sigmoid(),
            nn.AvgPool1d(kernel_size=2, stride=2))
        self.conv2 = nn.Sequential(
            nn.Conv1d(252, 12, kernel_size=5, stride=1),
            nn.Sigmoid(),
            nn.AvgPool1d(kernel_size=2, stride=2))

        self.fc1 = nn.Linear(732, 708) 
        self.fc2 = nn.Linear(708, 4)

    def forward(self, x):
        out = self.conv1(x)
        out = self.conv2(out)
        out = out.view(out.size(0), -1)
        out = self.fc1(out)
        out = self.fc2(out)
        return out

Now I get this error for the loss function:
ValueError: Expected input batch_size (3) to match target batch_size (708).

These are the dimensions, how do I ensure y_hat is (708,4)
ytrain_shape: torch.Size([708, 4])
Y_hat shape: torch.Size([3, 4])

albanD · January 8, 2020, 10:05pm

Could you give the full code sample please? I’m not sure what Y_hat is supposed to be here?
It looks like your input does not have the right dimensions: the first conv1d should have 3 input channels, not 708. You want batch x channels x width for the input.

Tupac · January 8, 2020, 10:37pm

Here is the code (starting from x_train and y_train)

shape of x_train: torch.Size([3, 708, 256])
shape of y_train: torch.Size([708, 4])

class CNN(torch.nn.Module):
    def __init__(self):
        super(CNN,self).__init__()
        self.conv1 = nn.Sequential(
            nn.Conv1d(708, 252, kernel_size=5, stride=1),  
            nn.Sigmoid(),
            nn.AvgPool1d(kernel_size=2, stride=2))
        self.conv2 = nn.Sequential(
            nn.Conv1d(252, 12, kernel_size=5, stride=1),
            nn.Sigmoid(),
            nn.AvgPool1d(kernel_size=2, stride=2))

        self.fc1 = nn.Linear(732, 708) 
        self.fc2 = nn.Linear(708, 4)

    def forward(self, x):
        out = self.conv1(x)
        out = self.conv2(out)
        out = out.view(out.size(0), -1)
        out = self.fc1(out)
        out = self.fc2(out)
        return out

# defining the model
model = CNN()
model = model.float()
# defining the optimizer
optimizer = torch.optim.Adam(model.parameters(), lr=0.07)
# defining the loss function
criterion = CrossEntropyLoss()

# checking if GPU is available
if torch.cuda.is_available():
    model = model.cuda()
    criterion = criterion.cuda()
    
print(model)

def train(epoch):
    model.train()
    tr_loss = 0
    # getting the training set
    x_train, ytr= Variable(X_train), Variable(y_train)
    ytrain = ytr.to(dtype=torch.int64)
    # getting the validation set
    #x_val, y_val = Variable(val_x), Variable(val_y)
    # converting the data into GPU format
    
    print('ytrain_shape',ytrain.shape)

    # clearing the Gradients of the model parameters
    optimizer.zero_grad()
    
    # prediction for training and validation set
    output_train = model(x_train.float())
    #output_train = model(x_train)
    #output_train=output_train.squeeze(1)
    
    print('Y_hat shape',output_train.shape)


    # computing the training and validation loss
    loss_train = criterion(output_train, ytrain)
    

    train_losses.append(loss_train)


    # computing the updated weights of all the model parameters
    loss_train.backward()
    optimizer.step()
    tr_loss = loss_train.item()
    
    return train_losses

# defining the number of epochs
n_epochs = 5
# empty list to store training losses
train_losses = []
# empty list to store validation losses
val_losses = []
# training the model
for epoch in range(n_epochs):
    train(epoch)

albanD · January 8, 2020, 10:46pm

As mentioned above, the shape of x_train is wrong. It should be [708, 3, 256].
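
If x_train really is stored as [3, 708, 256], a permute that moves the sample axis to the front should be enough. A minimal sketch, using a random tensor as a stand-in for your data and assuming 708 is the number of samples:

```python
import torch

# Stand-in with the shape reported above: [channels, samples, width].
x_train = torch.randn(3, 708, 256)

# Move the sample axis to the front: [samples, channels, width].
x_train_fixed = x_train.permute(1, 0, 2).contiguous()
print(x_train_fixed.shape)  # torch.Size([708, 3, 256])

# Sample 10, channel 2 holds the same 256 values before and after the permute.
print(torch.equal(x_train_fixed[10, 2], x_train[2, 10]))
```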

Tupac · January 8, 2020, 10:51pm

But then I will get this error:
RuntimeError: Given groups=1, weight of size 252 708 5, expected input[708, 3, 256] to have 708 channels, but got 3 channels instead

albanD · January 8, 2020, 10:54pm

Your first conv parameters are not correct, the input has 3 channels, not 708.
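
Putting the pieces together, a hedged sketch of what the model could look like. The filter counts 24 and 12 are guesses taken from your earlier attempts, and hidden=64 is made up, since the shapes of f1, f2, n1 and n2 are not shown in the thread:

```python
import torch
import torch.nn as nn

class CNN(nn.Module):
    def __init__(self, hidden=64, num_classes=4):
        super().__init__()
        # 3 input channels; 24 and 12 output channels are assumed filter counts.
        self.conv1 = nn.Sequential(
            nn.Conv1d(3, 24, kernel_size=5, stride=1),
            nn.Sigmoid(),
            nn.AvgPool1d(kernel_size=2, stride=2))
        self.conv2 = nn.Sequential(
            nn.Conv1d(24, 12, kernel_size=5, stride=1),
            nn.Sigmoid(),
            nn.AvgPool1d(kernel_size=2, stride=2))
        # 12 channels * 61 positions = 732 features for a width-256 input.
        self.fc1 = nn.Linear(12 * 61, hidden)
        self.fc2 = nn.Linear(hidden, num_classes)

    def forward(self, x):
        out = self.conv1(x)
        out = self.conv2(out)
        out = out.view(out.size(0), -1)     # counterpart of tf.layers.Flatten
        out = torch.sigmoid(self.fc1(out))  # sigmoid after fc1, as in the TF code
        return self.fc2(out)                # raw logits, no softmax here

model = CNN()
x = torch.randn(8, 3, 256)  # [batch, channels, width]
logits = model(x)
print(logits.shape)  # torch.Size([8, 4])
```

The final tf.nn.softmax is left out on purpose: nn.CrossEntropyLoss applies log-softmax internally, so the model should return raw logits during training.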

Tupac · January 8, 2020, 10:58pm

See here, the thing is that in TensorFlow, 708 is not the batch size. I have a 3-D input (3 channels of 708 rows and 256 columns).

Anyway I have changed as you suggested, but I get this error now

RuntimeError: multi-target not supported at C:\w\1\s\tmp_conda_3.7_104508\conda\conda-bld\pytorch_1572950778684\work\aten\src\THNN/generic/ClassNLLCriterion.c:22
at

---> 25 loss_train = criterion(output_train, ytrain)

albanD · January 9, 2020, 2:41pm

The tensorflow conv1d takes as input batch x width x channel no? So your Tensorflow code is wrong?
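
On the "multi-target not supported" error: in that PyTorch version, nn.CrossEntropyLoss expects an int64 target of class indices with shape [batch], not a [batch, 4] matrix. Assuming y_train is one-hot encoded (its [708, 4] shape suggests so, but that is a guess), argmax over dim 1 converts it. A small sketch with made-up data:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical stand-ins: 6 samples, 4 classes.
logits = torch.randn(6, 4)
labels = torch.tensor([0, 3, 1, 2, 2, 0])
y_one_hot = F.one_hot(labels, num_classes=4)  # shape [6, 4], like y_train

# Convert one-hot rows to class indices of shape [6].
target = y_one_hot.argmax(dim=1)

criterion = nn.CrossEntropyLoss()
loss = criterion(logits, target)
print(target.shape, loss.item())
```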

Tupac · January 9, 2020, 6:45pm

OK, I got it. I changed it as you suggested. Thanks!