PyTorch weight sharing siamese

amsalehoof · January 29, 2021, 8:08pm

Hello there!
I want to have a simple Siamese CNN which uses a “Shared weight” feature extractor. I have implemented the network like this:

from collections import OrderedDict

import torch
import torch.nn as nn

class CnnSiamese(nn.Module):
	def __init__(self):
		super(CnnSiamese, self).__init__()
		self.feature_extractor = nn.Sequential(OrderedDict([
			('conv1', nn.Conv2d(in_channels=3, out_channels=10, kernel_size=3, padding=1)),
			('maxpool1', nn.MaxPool2d(2)),
			('relu1', nn.ReLU()),
			('conv2', nn.Conv2d(in_channels=10, out_channels=20, kernel_size=3, padding=1)),
			('maxpool2', nn.MaxPool2d(2)),
			('relu2', nn.ReLU()),
			('conv3', nn.Conv2d(in_channels=20, out_channels=30, kernel_size=3, padding=1)),
			('dropout', nn.Dropout2d()),
			('maxpool3', nn.MaxPool2d(2)),
			('relu3', nn.ReLU()),
		]))
	
		# Global average pooling and classifier head, defined inside __init__
		self.gap = nn.AdaptiveAvgPool2d((1, 1))
		self.fc = nn.Linear(30, 2)
	
	def forward(self, x1, x2):
		x1 = self.feature_extractor(x1)
		x2 = self.feature_extractor(x2)
		x = torch.abs(x1 - x2)
		
		x = self.gap(x)
		x = x.view(x.shape[0], -1)
		x = self.fc(x)
		return x

The problem is this: when I test the model with a single image and a copy of it as the two inputs, I expect self.feature_extractor(x1) and self.feature_extractor(x2) to return the same output, but I get different tensors.

It seems that whenever I call self.feature_extractor on an image, a new instance is created, so if I call it twice on the same image, I get different tensors.

How can I solve this problem?

Thanks in advance.

ruotianluo · January 30, 2021, 3:44am

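That is because the feature extractor contains a dropout layer (nn.Dropout2d), which is stochastic in training mode: each forward pass samples a new random mask. No new instance is created and the weights are still shared between the two calls. Set the model to eval() mode and you should get identical tensors.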
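
A minimal sketch of this behaviour, assuming a reduced stand-in for the feature extractor (one conv block instead of the three in the question, and a made-up 32x32 random input). It calls the same module twice on the same tensor, first in training mode and then in eval mode:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical, shortened version of the feature extractor from the question.
extractor = nn.Sequential(
    nn.Conv2d(in_channels=3, out_channels=10, kernel_size=3, padding=1),
    nn.Dropout2d(),   # default p=0.5; drops whole channels, but only in training mode
    nn.MaxPool2d(2),
    nn.ReLU(),
)

x = torch.randn(1, 3, 32, 32)  # assumed input size, for illustration only

# Training mode (the default): each call samples its own dropout mask,
# so the two outputs will usually differ.
extractor.train()
train_a, train_b = extractor(x), extractor(x)
same_in_train = torch.equal(train_a, train_b)

# Eval mode: dropout is a no-op, so the same input gives the same output.
extractor.eval()
with torch.no_grad():
    eval_a, eval_b = extractor(x), extractor(x)
same_in_eval = torch.equal(eval_a, eval_b)

print("identical in train mode:", same_in_train)
print("identical in eval mode: ", same_in_eval)
```

For the CnnSiamese model above, the equivalent would be calling model.eval() before testing and model.train() again before resuming training.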