For instance, suppose I have a network like this:
```python
class SomeClassifier(nn.Module):
    def __init__(self):
        super(SomeClassifier, self).__init__()
        # all three conv blocks maintain the original image size.
        self.cnn1 = nn.Conv2d()
        self.cnn2 = nn.Conv2d()
        self.cnn3 = nn.Conv2d()
        self.cnn4 = nn.Conv2d()
        self.fc = nn.Linear()

    def forward(self, img):
        img_rep = self.cnn1(img)
        x = self.cnn2(img_rep)
        y = self.cnn3(img_rep)
        z = self.cnn4(img_rep)
        # perform custom operations on x, y, and z and merge them into a single tensor called p.
        p = torch.cat([x, y, z], dim=1)  # or average the 3 tensors to get a tensor with the same shape as x, y, and z
        p = torch.flatten(p, start_dim=1)
        p = self.fc(p)
        return p
```
The point being: if I have an input image,
- I pass it through 3 different conv filters.
- I perform some custom operations on the output of each of the 3 conv filters.
- Finally, I merge the resulting 3 feature maps into a single tensor. The shape of this new tensor is the same as that of img_rep.
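To make the merge concrete, here is a minimal sketch with made-up shapes (batch 2, 8 channels, 32×32). Note that concatenating along the channel dimension triples the channel count, while averaging keeps the shape of x, y, and z:

```python
import torch

# Three feature maps of identical shape: (batch, channels, H, W).
x = torch.randn(2, 8, 32, 32)
y = torch.randn(2, 8, 32, 32)
z = torch.randn(2, 8, 32, 32)

# Option 1: concatenate along the channel dimension (channels triple).
cat = torch.cat([x, y, z], dim=1)          # shape (2, 24, 32, 32)

# Option 2: average the three maps (shape stays the same as x, y, z).
avg = torch.stack([x, y, z]).mean(dim=0)   # shape (2, 8, 32, 32)
```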
This is clearly a block of operations I can perform again and again. I can't use nn.Sequential for this because I have multiple branches and outputs here; I need a function. So, can I do the following?
```python
class SomeClassifier(nn.Module):
    def __init__(self):
        super(SomeClassifier, self).__init__()
        # all three conv blocks maintain the original image size.
        self.cnn1 = nn.Conv2d()
        self.fc = nn.Linear()

    def forward(self, img):
        img_rep = self.cnn1(img)
        p = self.custom_module(img_rep, in_channel, out_channel)
        p = torch.flatten(p, start_dim=1)
        p = self.fc(p)
        return p

    def custom_module(self, input_img_rep, input_channels_to_this_block, output_channels_from_this_block):
        cnn2 = nn.Conv2d()
        cnn3 = nn.Conv2d()
        cnn4 = nn.Conv2d()
        x = cnn2(input_img_rep)
        y = cnn3(input_img_rep)
        z = cnn4(input_img_rep)
        # perform custom operations on x, y, and z and merge them into a single tensor.
        o = torch.cat([x, y, z], dim=1)  # or average the 3 tensors to get a tensor with the same shape as x, y, and z
        return o
```
When I do this, I run into issues with the GPU. Basically, the weights of the layers created inside custom_module stay on the CPU while the weights of the rest of the network are on the GPU. This is because I'm not initializing this custom_block method in __init__(). So, let's say I do that:
```python
class SomeClassifier(nn.Module):
    def __init__(self):
        super(SomeClassifier, self).__init__()
        # all three conv blocks maintain the original image size.
        self.cnn1 = nn.Conv2d()
        self.custom_block = self.custom_module(...)
        self.fc = nn.Linear()

    def forward(self, img):
        img_rep = self.cnn1(img)
        p = self.custom_block(...)
        p = torch.flatten(p, start_dim=1)
        p = self.fc(p)
        return p

    def custom_module(self, input_img_rep, input_channels_to_this_block, output_channels_from_this_block):
        cnn2 = nn.Conv2d()
        cnn3 = nn.Conv2d()
        cnn4 = nn.Conv2d()
        x = cnn2(input_img_rep)
        y = cnn3(input_img_rep)
        z = cnn4(input_img_rep)
        # perform custom operations on x, y, and z and merge them into a single tensor.
        o = torch.cat([x, y, z], dim=1)  # or average the 3 tensors to get a tensor with the same shape as x, y, and z
        return o
```
The problem with this is that my custom_block requires an input image representation, so I can't really initialize it in __init__(); I have to call it directly in the forward() method, which leads to not all of the weights being moved to the GPU.
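I can reproduce the registration problem with a toy module (the channel counts here are made up):

```python
import torch.nn as nn

class Broken(nn.Module):
    def __init__(self):
        super().__init__()
        self.cnn1 = nn.Conv2d(3, 8, 3, padding=1)  # assigned in __init__, so registered

    def forward(self, x):
        # Created inside forward, so NOT registered:
        # .parameters(), .to(device), and the optimizer never see it.
        cnn2 = nn.Conv2d(8, 8, 3, padding=1)
        return cnn2(self.cnn1(x))

m = Broken()
# Only cnn1's weight and bias show up as registered parameters:
print(len(list(m.parameters())))  # 2
```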
So, how do I go about using this custom_block()
in PyTorch?
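Is the right approach to make the block its own nn.Module subclass, so its layers are created once in __init__() and get registered (and moved to the GPU) when the block is assigned as an attribute? Something like this sketch, where all channel counts, the input size (32×32), and the number of classes (10) are placeholders I made up:

```python
import torch
import torch.nn as nn

class CustomBlock(nn.Module):
    """Three parallel convs whose outputs are merged back into one tensor."""
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.cnn2 = nn.Conv2d(in_channels, out_channels, 3, padding=1)
        self.cnn3 = nn.Conv2d(in_channels, out_channels, 3, padding=1)
        self.cnn4 = nn.Conv2d(in_channels, out_channels, 3, padding=1)

    def forward(self, img_rep):
        x = self.cnn2(img_rep)
        y = self.cnn3(img_rep)
        z = self.cnn4(img_rep)
        # Custom operations would go here; averaging keeps the shape of x, y, z.
        return torch.stack([x, y, z]).mean(dim=0)

class SomeClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.cnn1 = nn.Conv2d(3, 8, 3, padding=1)
        self.block = CustomBlock(8, 8)        # registered, so .to(device) moves it
        self.fc = nn.Linear(8 * 32 * 32, 10)

    def forward(self, img):
        img_rep = self.cnn1(img)
        p = self.block(img_rep)
        p = torch.flatten(p, start_dim=1)
        return self.fc(p)

model = SomeClassifier()
out = model(torch.randn(2, 3, 32, 32))
print(out.shape)  # torch.Size([2, 10])
```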