I want to ask: should I pass the hidden layer size or the input layer size to the view function (x.view(-1, 64))?
And can I just choose the hidden and input layer sizes randomly? I have 3064 images with 512x512 dimensions.
I am asking about this:
self.fc1 = nn.Linear(262144, 64)
self.fc2 = nn.Linear(64, 3)
# self.fc3 = nn.Linear(84, 10)

def forward(self, x):
    x = self.pool(F.relu(self.conv1(x)))
    x = self.pool(F.relu(self.conv2(x)))
    x = x.view(-1, 64)
Usually you use a line like x = x.view(-1, 64) to create a new view of the tensor x so that, for example, you can feed the output of a conv layer into a linear layer.
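To illustrate the flattening, here is a minimal standalone sketch (the shapes are made up purely for illustration):

import torch

conv_out = torch.randn(8, 64, 4, 4)         # e.g. [batch_size, channels, height, width]
flat = conv_out.view(conv_out.size(0), -1)  # flatten everything except the batch dimension
print(flat.shape)                           # torch.Size([8, 1024]), since 64*4*4 = 1024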
What exactly would you like to know?
Currently the code snippet seems to be missing some important parts.
Do you have trouble implementing some architecture?
Currently the number of input features to fc1 is set to 512*512=262144, which won't work in this architecture.
Note that you are using pooling layers, which decrease the spatial size by a factor of 2 in your setup.
Also, you are not using any padding in the conv layers, which will reduce the spatial size by 2 pixels with a kernel size of 3.
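As a quick sanity check of the shape math (assuming kernel size 3, stride 1, no padding for the convs, and 2x2 max pooling, as in the model below):

size = 512
for _ in range(2):    # two conv + pool stages
    size = size - 2   # conv, kernel 3, no padding: 512 -> 510, then 255 -> 253
    size = size // 2  # 2x2 max pool: 510 -> 255, then 253 -> 126
print(size)           # 126, so fc1 needs 64*126*126 input features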
Here is the working and annotated model:
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 32, 3)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(32, 64, 3)
        self.fc1 = nn.Linear(64*126*126, 1500)
        self.fc2 = nn.Linear(1500, 3)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))  # output size: [batch_size, 32, 255, 255]
        x = self.pool(F.relu(self.conv2(x)))  # output size: [batch_size, 64, 126, 126]
        x = x.view(x.size(0), -1)             # output size: [batch_size, 64*126*126]
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x

x = torch.randn(1, 3, 512, 512)
model = Net()
output = model(x)
The output sizes are beautifully explained in Stanford’s CS231n.
In my architecture each conv layer will reduce the spatial size by two pixels in both dimensions, while the pooling layers will divide them by two.
For an input size of [batch_size, 3, 512, 512] you will end up with:
conv1 -> [batch_size, 32, 510, 510]
pool -> [batch_size, 32, 255, 255]
conv2 -> [batch_size, 64, 253, 253]
pool -> [batch_size, 64, 126, 126]
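If you want to verify these shapes yourself, a throwaway sketch like this prints each intermediate size (the relu is omitted since it doesn't change the shape):

import torch
import torch.nn as nn

conv1 = nn.Conv2d(3, 32, 3)
pool = nn.MaxPool2d(2, 2)
conv2 = nn.Conv2d(32, 64, 3)

x = torch.randn(1, 3, 512, 512)
x = conv1(x)
print(x.shape)  # torch.Size([1, 32, 510, 510])
x = pool(x)
print(x.shape)  # torch.Size([1, 32, 255, 255])
x = conv2(x)
print(x.shape)  # torch.Size([1, 64, 253, 253])
x = pool(x)
print(x.shape)  # torch.Size([1, 64, 126, 126])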
Sure, the -1 just means that PyTorch should automatically fill in this dimension so that the number of elements stays the same.
You could of course specify it manually; with the feature size from the model above, that would be:
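x = x.view(x.size(0), 64*126*126)  # equivalent to x.view(x.size(0), -1) for this model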
No, each passed value defines the desired size of that dimension in the new view of the tensor.
Have a look at this post for some additional information. It's more focused on the contiguous() call, but it might also give you a better idea of the view operation.
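As a minimal sketch of why contiguous() matters for view (a transpose changes the strides, so the memory layout is no longer contiguous):

import torch

x = torch.randn(2, 3)
y = x.t()                   # transpose returns a non-contiguous view of x
print(y.is_contiguous())    # False
# y.view(6) would raise a RuntimeError here, since view needs compatible (contiguous) memory
z = y.contiguous().view(6)  # contiguous() first copies the data into a contiguous layout
print(z.shape)              # torch.Size([6])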