In the following sample class from Udacity’s PyTorch course, an additional dimension must be added to the incoming kernel weights, and the course does not explain why. I’ve highlighted this in the multi-line comment (docstring) in __init__:
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class Net(nn.Module):
        """
        Network containing a 4 filter convolutional layer and 2x2 maxpool layer.
        """
        def __init__(self, weights):
            """
            weights: the kernel values as a tensor (n_kernels, 1, k_height, k_width).
            """
            super(Net, self).__init__()
            # Get height and width of kernel (the last two dimensions)
            k_height, k_width = weights.shape[2:]
            # define a 4 feature convolutional layer (1 input channel, 4 output channels)
            self.conv = nn.Conv2d(1, 4, kernel_size=(k_height, k_width), bias=False)
            # replace the randomly initialised kernels with the supplied ones
            self.conv.weight = torch.nn.Parameter(weights)
            # define a (2x2) pooling layer
            self.pool = nn.MaxPool2d(2, 2)

        def forward(self, x):
            conv_x = self.conv(x)
            relu_x = F.relu(conv_x)
            pool_x = self.pool(relu_x)
            # return the output of every stage
            return conv_x, relu_x, pool_x

This is also illustrated in the class notebook with the following code:
    import numpy as np
    import torch

    filter_vals = np.array([[-1, -1, 1, 1]] * 4)
    filter_1 = filter_vals
    filter_2 = -filter_1
    filter_3 = filter_1.T
    filter_4 = -filter_3
    filters = np.array([filter_1, filter_2, filter_3, filter_4])
    # unsqueeze(1) inserts a new dimension of size 1 at index 1
    weights = torch.from_numpy(filters).unsqueeze(1).type(torch.FloatTensor)

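To make the shape change concrete, here is a small sketch that rebuilds the same filters and prints the shape before and after `unsqueeze(1)`, next to the shape of the weight that `nn.Conv2d(1, 4, ...)` creates on its own. The names `filters_3d` and `conv` are mine for illustration and do not come from the course notebook.

```python
import numpy as np
import torch
import torch.nn as nn

# Same four 4x4 filters as in the notebook code
filter_vals = np.array([[-1, -1, 1, 1]] * 4)
filters = np.array([filter_vals, -filter_vals, filter_vals.T, -filter_vals.T])

# filters_3d (hypothetical name): the tensor before the extra dimension is added
filters_3d = torch.from_numpy(filters).type(torch.FloatTensor)
# weights: the tensor the course passes to Net, with a size-1 dimension at index 1
weights = filters_3d.unsqueeze(1)

# conv (hypothetical name): the same layer Net builds, left at its default weights
conv = nn.Conv2d(1, 4, kernel_size=(4, 4), bias=False)

print(filters_3d.shape)   # torch.Size([4, 4, 4])
print(weights.shape)      # torch.Size([4, 1, 4, 4])
print(conv.weight.shape)  # torch.Size([4, 1, 4, 4])
```

So the (4, 1, 4, 4) shape at least matches what the layer stores for its own weight, with the size-1 dimension sitting where the layer's single input channel would go.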
I’m just wondering why the class wasn’t simply designed to take a kernel tensor of shape (4, 4, 4). Why did the course authors change it to (4, 1, 4, 4)?