Hi,
I ran into two issues with PyTorch 0.1.11 when running the following EncoderCNN.
- With param.requires_grad = False, PyTorch 0.1.10 uses only a small amount of GPU memory compared to 0.1.11; 0.1.11 uses almost the same amount of GPU memory regardless of the requires_grad value.
- Even with param.requires_grad = True, PyTorch 0.1.10 uses about half the GPU memory of 0.1.11. When I dug into the code, I found that the forward pass of the second batch doubles the GPU usage in 0.1.11, while in 0.1.10 the second batch adds only a negligible amount of memory. (See the measurement sketch after this list for how I read these numbers.)
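For reference, this is how I measure per-batch GPU usage. It is a minimal sketch: it shells out to nvidia-smi because, as far as I know, 0.1.x exposes no torch.cuda memory introspection API, and it assumes a single visible GPU.

import subprocess

def gpu_mem_used_mb():
    # Ask the driver directly so the number is comparable across
    # PyTorch versions; returns used memory on GPU 0 in MB.
    out = subprocess.check_output(
        ['nvidia-smi', '--query-gpu=memory.used',
         '--format=csv,noheader,nounits'])
    return int(out.decode().strip().split('\n')[0])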
import torch.nn as nn
import torchvision.models as models

class EncoderCNN(nn.Module):
    def __init__(self, embed_size):
        """Load the pretrained ResNet-152 and replace the top fc layer."""
        super(EncoderCNN, self).__init__()
        self.resnet = models.resnet152(pretrained=True)
        for param in self.resnet.parameters():
            param.requires_grad = False
            # param.requires_grad = True
        self.resnet.fc = nn.Linear(self.resnet.fc.in_features, embed_size)
        self.bn = nn.BatchNorm1d(embed_size, momentum=0.01)
        self.init_weights()

    def init_weights(self):
        """Initialize the weights."""
        self.resnet.fc.weight.data.normal_(0.0, 0.02)
        self.resnet.fc.bias.data.fill_(0)

    def forward(self, images):
        """Extract the image feature vectors."""
        features = self.resnet(images)
        features = self.bn(features)
        return features
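To reproduce the second issue, I push two batches through the encoder and read the memory after each forward pass. This is only a sketch: embed_size=256 and the batch shape are arbitrary, gpu_mem_used_mb is the helper above, and the commented-out volatile=True line is just a guess at a workaround, since volatile Variables skip graph construction in 0.1.x.

import torch
from torch.autograd import Variable

encoder = EncoderCNN(embed_size=256).cuda()

for step in range(2):
    images = Variable(torch.randn(8, 3, 224, 224).cuda())
    # Possible workaround in 0.1.x: disable graph construction entirely.
    # images = Variable(torch.randn(8, 3, 224, 224).cuda(), volatile=True)
    features = encoder(images)
    print('after batch %d: %d MB used' % (step, gpu_mem_used_mb()))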