For every layer of my ResNet, I would like to assign the average value of the conv weights to each conv, every time after they are updated by the optimizer:
```python
class ResNet(nn.Module):
    def __init__(self, block, num_blocks, num_classes=10):
        ...

    def _make_layer(self, block, planes, num_blocks, stride):
        ...

    def forward(self, x):
        ...

    def setAvg(self):
        for layer in [self.layer1, self.layer2, self.layer3, self.layer4]:
            weight_conv_average = torch.mean(
                torch.stack([torch.Tensor(p) for n, p in layer.named_parameters()
                             if 'conv' in n and (p.size() == p.size())]), 0)
            for n, p in layer.named_parameters():
                if 'conv' in n and (p.size() == p.size()):
                    # print(n, ' ', p.size())
                    p.data.copy_(weight_conv_average)
```
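To make clear what I mean by averaging, here is a minimal CPU-only sketch of the intended operation on a toy module (the module and conv names here are made up for illustration, not from my actual model):

```python
import torch
import torch.nn as nn

# Toy container with two same-shaped convs (names hypothetical).
layer = nn.Sequential()
layer.add_module('conv1', nn.Conv2d(4, 4, 3, padding=1, bias=False))
layer.add_module('conv2', nn.Conv2d(4, 4, 3, padding=1, bias=False))

with torch.no_grad():
    # Collect the conv weights by name, stack them, and take the mean.
    convs = [p for n, p in layer.named_parameters() if 'conv' in n]
    avg = torch.mean(torch.stack(convs), 0)
    # Copy the mean back into every conv weight.
    for p in convs:
        p.copy_(avg)

# After this, both conv weights hold the same averaged values.
```

This version works for me on CPU; the crash only appears in the full training setup described below.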
In the main part of the training script, I use:
```python
rawnet = ResNet18()
net = rawnet.to(device)
if device == 'cuda':
    net = torch.nn.DataParallel(net)
    cudnn.benchmark = True
...

def train(epoch):
    ...
    loss.backward()
    with torch.no_grad():
        rawnet.setAvg()
    optimizer.step()
```
However, this crashes with `Segmentation fault (core dumped)` in my Linux CUDA environment. I have also tried invoking `setAvg` from inside `forward`, but that failed as well.
The reason I keep a separate reference to `rawnet` is that after wrapping, `net` is a `DataParallel` object, which has no attribute `setAvg`.
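As I understand it, `nn.DataParallel` exposes the wrapped model through its `.module` attribute, which is another way to reach a custom method; a minimal sketch (with a hypothetical toy model standing in for my ResNet):

```python
import torch
import torch.nn as nn

class Toy(nn.Module):  # hypothetical stand-in for the real ResNet
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(2, 2)

    def setAvg(self):  # stand-in for the real averaging method
        pass

net = nn.DataParallel(Toy())
# The DataParallel wrapper itself does not expose setAvg ...
# ... but the original model is reachable via the .module attribute:
net.module.setAvg()
```

Calling `net.module.setAvg()` touches the same parameter tensors as `rawnet.setAvg()`, since `DataParallel` holds a reference to the original module rather than a copy.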