I am a beginner in deep learning. I am trying to run a GAN (UNet-based) model, implemented in PyTorch, on my data, but when I run the code I get the following error:
**KeyError: 'generatorX'**
I am using the pretrained model/checkpoint (a file with a .pth extension already provided by the authors).
I cannot figure out what 'generatorX' is. When I print the model I can see the weights.
Any help will be highly appreciated.
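Edit: for anyone hitting the same thing, a .pth checkpoint is often a dict of sub-state-dicts, one per network, and the KeyError means the expected key is missing. Here is a minimal sketch of how to inspect it (the file path and the dummy model are made up for illustration, not the authors' code):

```python
import torch
import torch.nn as nn

# Hypothetical sketch: checkpoints for multi-network GANs are often saved as
# {'generatorX': {...}, 'discriminator': {...}, ...}. If the loading code does
# checkpoint['generatorX'] and that key is absent, you get exactly this KeyError.
net = nn.Linear(4, 2)
torch.save({"generatorX": net.state_dict()}, "/tmp/ckpt.pth")

checkpoint = torch.load("/tmp/ckpt.pth", map_location="cpu")
print(list(checkpoint.keys()))  # inspect the top-level keys first

net2 = nn.Linear(4, 2)
net2.load_state_dict(checkpoint["generatorX"])  # load the matching sub-dict
```

Printing the top-level keys shows whether the checkpoint is a flat state_dict or a dict of sub-dicts, which tells you which key (if any) to index before calling load_state_dict.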
@Diego Thank you very much for the timely response.
When I make the change as you suggested,
generatorX.load_state_dict(checkpoint, strict=False)
I get this error, although I used images with the same dimensions as those used by the authors:
RuntimeError: Error(s) in loading state_dict for Generator:
size mismatch for conv8.2.weight: copying a param with shape torch.Size([16, 48, 5, 5]) from checkpoint, the shape in current model is torch.Size([16, 48, 1, 1]).
size mismatch for conv9.2.weight: copying a param with shape torch.Size([3, 16, 5, 5]) from checkpoint, the shape in current model is torch.Size([3, 16, 1, 1]).
So do I need to train from scratch, or can I still use this pretrained model?
You are probably defining the wrong model for generatorX since the weights you are trying to load don’t match the model. Is there a different model to try? Or a different set of weights?
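If you still want to reuse the weights that do match, one common workaround is to filter the checkpoint by shape before loading. A minimal sketch, with a stand-in model (the real Generator and checkpoint come from the authors' repo; the simulated mismatch mirrors the conv8/conv9 errors above):

```python
import torch
import torch.nn as nn

# Stand-in two-layer model; not the authors' Generator.
model = nn.Sequential(nn.Conv2d(3, 16, 5), nn.Conv2d(16, 3, 1))

checkpoint = dict(model.state_dict())
checkpoint["0.weight"] = torch.zeros(16, 3, 1, 1)  # simulate one size mismatch

model_state = model.state_dict()
# Keep only tensors whose name AND shape match the current model;
# mismatched ones (like conv8.2.weight in the thread) are skipped.
filtered = {k: v for k, v in checkpoint.items()
            if k in model_state and v.shape == model_state[k].shape}
model_state.update(filtered)
model.load_state_dict(model_state)
print(f"loaded {len(filtered)} of {len(model_state)} tensors")
```

Note the skipped layers keep their random initialization, so some fine-tuning is usually still needed even when most weights load.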
Thank you very much for the reply. I ran another file where the model is trained from scratch.
I am getting the following error: 'Caught RuntimeError in replica 0 on device 0'
in the line
fakeEnhanced = generatorX(realInput)
and then the following error: 'Sizes of tensors must match except in dimension 2. Got 64 and 33 (The offending index is 0)'
where the generator's forward method is defined inside the model.py file, in the line:
x6 = self.conv7(torch.cat([x5, x53_temp], dim=1))
I searched for ways to fix these errors. Some people say there may be an issue with the DataParallel module, since the original code was written to run on multiple GPUs.
Do you have an idea whether that could be the case?
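From what I found while searching, 'Caught RuntimeError in replica 0 on device 0' is just nn.DataParallel re-raising an exception from one replica; the real error is the one printed after it. Dropping the wrapper can give a cleaner traceback. A sketch of what I mean (toy model, not the authors' code):

```python
import torch
import torch.nn as nn

# nn.DataParallel wraps the original module and re-raises errors from
# replicas with the 'Caught RuntimeError in replica ...' prefix.
# For debugging, run the underlying module directly on one device.
model = nn.Linear(4, 2)
wrapped = nn.DataParallel(model)

single = wrapped.module          # the original, unwrapped model
out = single(torch.randn(3, 4))  # forward pass without DataParallel
print(out.shape)
```

Running the unwrapped module on a single device reproduces the underlying error (here, the torch.cat size mismatch) without the replica wrapper obscuring the traceback.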
It seems x5 and x53_temp do not have the same dimensions. Oddly, it says the batch dimension is not matching. Can you print the model definition, please?
Thanks for this; although it is informative, I cannot discern the shapes of x5 and x53_temp from it, sorry. I need the class definition (mainly the forward and init functions) of the generator, maybe…
@user_123454321 Thank you very much for the fast response. The error actually occurs in the forward function of Generator.
One thing I noticed is that the images forward receives are 516x516 (height and width) when I print them, even though I explicitly resized the images to 512x512, which is the size the network accepts. This is the apparent cause of the error. I am trying to check where in the DataLoader this dimension mismatch occurs.
I will post here if there is still tensor size mismatch in the later layers of network.
Thanks a lot for your suggestions.
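Update: the 516x516 input matters because a UNet-style encoder/decoder only restores the input size when the side length survives the repeated halving and doubling. A toy reproduction (the pooling depth is illustrative, not the authors' exact network):

```python
import torch
import torch.nn as nn

# Toy reproduction of the skip-connection mismatch: three 2x downsamplings
# followed by three 2x upsamplings only restore the input size when the
# side length is divisible by 8.
down = nn.MaxPool2d(2)
up = nn.Upsample(scale_factor=2)

ok = torch.randn(1, 8, 512, 512)
restored = up(up(up(down(down(down(ok))))))
print(restored.shape[-1])          # 512 -> 64 -> 512, cat with the skip works

bad = torch.randn(1, 8, 516, 516)
d = down(down(down(bad)))          # 516 -> 258 -> 129 -> 64 (129 floors to 64)
u = up(up(up(d)))                  # 64 -> 512, no longer 516
print(bad.shape[-1], u.shape[-1])  # torch.cat([bad, u], dim=1) would fail
```

So fixing the DataLoader to actually emit 512x512 (e.g. checking for an extra padding transform) should make the torch.cat in conv7 line up again.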