Inferring shape via flatten operator

domluna · January 22, 2017, 9:41pm

Is there a flatten-like operator to calculate the shape of a layer output. An example would be transitioning from a conv layer to linear layer. In all the examples I’ve seen thus far this seems to be manually calculated, ex:

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv2_drop = nn.Dropout2d()
        self.fc1 = nn.Linear(320, 50)
        self.fc2 = nn.Linear(50, 10)

    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        x = x.view(-1, 320)
        x = F.relu(self.fc1(x))
        x = F.dropout(x, training=self.training)
        x = F.relu(self.fc2(x))
        return F.log_softmax(x)

What would be idiomatic torch to calculate 320?

fmassa · January 22, 2017, 10:00pm

You can use -1 in view so that the remaining dimension is automatically calculated, but instead of using it in the first dimension, which you know and is the batch size, you use it the the other one.
For example

bs = 5
x = torch.rand(bs, 3, 224, 224)
x = x.view(x.size(0), -1)

domluna · January 23, 2017, 1:25am

Thanks, that works for the forward method but I’m more concerned with the network definition, self.fc1 = nn.Linear(320, 50). How do I calculate the 320 there using torch?

fmassa · January 23, 2017, 9:41am

Hum, I’m afraid you can’t calculate that in __init__ without prior knowledge of the input shape.
You could imagine passing an input shape as argument to the __init__. In this situation, you can infer the shape by performing a forward pass over the convolutional blocks. Something like

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable

class Net(nn.Module):
    def __init__(self, input_shape=(1, 28, 28)):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv2_drop = nn.Dropout2d()

        n_size = self._get_conv_output(input_shape)
        
        self.fc1 = nn.Linear(n_size, 50)
        self.fc2 = nn.Linear(50, 10)

    # generate input sample and forward to get shape
    def _get_conv_output(self, shape):
        bs = 1
        input = Variable(torch.rand(bs, *shape))
        output_feat = self._forward_features(input)
        n_size = output_feat.data.view(bs, -1).size(1)
        return n_size

    def _forward_features(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        return x

    def forward(self, x):
        x = self._forward_features(x)
        x = x.view(x.size(0), -1)
        x = F.relu(self.fc1(x))
        x = F.dropout(x, training=self.training)
        x = F.relu(self.fc2(x))
        return F.log_softmax(x)

ypxie · February 2, 2017, 3:40am

you can try functional api which is truly dynamic.

ncullen93 · March 2, 2017, 4:14pm

Hm yeah I did something similar. You definitely have to know the input size. The only other option is to write a general function which calculates the shape using conv rules without having to actually run the graph…

class Network(nn.Module):
    
    def __init__(self, input_size=(3,32,32)):
        super(Network, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3,32,3),
            nn.ReLU(),
            nn.Conv2d(32,32,3),
            nn.ReLU(),
            nn.MaxPool2d((3,3))
        )
        self.flat_fts = self.get_flat_fts(input_size, self.features)
        self.classifier = nn.Sequential(
            nn.Linear(self.flat_fts, 100),
            nn.Dropout(p=0.2),
            nn.ReLU(),
            nn.Linear(100,10),
            nn.LogSoftmax()
        )
    
    def get_flat_fts(self, in_size, fts):
        f = fts(Variable(torch.ones(1,*in_size)))
        return int(np.prod(f.size()[1:]))
    
    def forward(self, x):
        fts = self.features(x)
        flat_fts = fts.view(-1, self.flat_fts)
        out = self.classifier(flat_fts)
        return out

fmassa · March 2, 2017, 4:21pm

@ncullen93 when you do fts(Variable(torch.ones(1,*in_size))), you are actually performing a forward pass on your network.

ncullen93 · March 2, 2017, 4:33pm

@fmassa right… That’s what you do in your example as well, I think? I’m saying it would be nice to not have to do that.

fmassa · March 2, 2017, 4:45pm

Yes, that’s what I do in my example.

Note that you can add a global operation (like global max/average pooling) just before your view layer, so that you know precisely the number of inputs that the linear layer will receive (as you can see in the resnet model definition, where the kernel size for the pooling can be computed on the fly using the functional interface).

We could eventually add another method to each Function that, given an input shape and a set of parameters, returns an output shape, but is it really worth it?

@apaszke what do you think?

apaszke · March 2, 2017, 8:10pm

@fmassa we’ll need that for lazy execution, but I don’t know what the API’s going to look like yet.

apaszke · March 2, 2017, 8:11pm

Actually no, we don’t need that. I think it’s too much work and maintenance for very small gains. You probably want to do that extra forward once.

abhigenie92 · July 10, 2017, 4:13am

This should be a feature, seems weird to do a forward pass.

xueliang_liu · March 23, 2018, 12:44am

Keras and Lasagne can do it very easily.

parichehr · June 3, 2018, 2:45pm

Can you please tell me what is bs here?

lkhphuc · June 8, 2018, 10:04am

bs is for batch_size, which is then below refered to as x.size(0).

Xingdong_Zuo · June 14, 2018, 8:32am

@apaszke If I got it correctly, the additional forward pass does not have any effect on the gradient computation in later training phase right ? And will the memory be release automatically (since we do not do backward pass, the node will not be diminished ) ?

mlearnx · February 8, 2019, 10:12am

guys any update about this feature?

I also wonder to know the answer to Xingdong_Zuo question.

Thanks!

iacob · July 7, 2021, 11:11am

As of 1.8, PyTorch now has LazyLinear which infers the input dimension:

A torch.nn.Linear module where in_features is inferred.