What here is defining the channels in the initialization?

I’ve been trying to adjust this code to switch the model from VQGAN f=16 to f=8, but something seems wrong with the size of the latent code and it fails to initialize.

Here’s the code where the error occurs:

class Pars(torch.nn.Module):
    def __init__(self):
        super(Pars, self).__init__()

        self.normu = torch.nn.Parameter(o_i2.cuda().clone().view(batch_size, 1024, sideX//16 * sideY//16))
        self.ignore = torch.empty(0,).long().cuda()
        self.keep = torch.empty(0,).long().cuda()
        self.keep_indices = torch.empty(0,).long().cuda()

    def forward(self):
        mask = torch.ones(self.normu.shape, requires_grad=False).cuda()
        mask[:, :, self.ignore] = 1
        normu = self.normu * mask
        normu.scatter_(2, self.ignore.unsqueeze(0).unsqueeze(0).expand(-1, 256, -1), self.keep.detach())

        return normu.clip(-6, 6).view(1, -1, sideX//16, sideX//16)

When I run Pars().cuda(), the output shape is [1, 1024, 16, 16], but I think it should be [1, 256, 16, 16], since I get the error:

RuntimeError: Given groups=1, weight of size [256, 256, 1, 1], expected input[1, 1024, 16, 16] to have 256 channels, but got 1024 channels instead

I can’t figure out where that channel count of 1024 is being defined.

The answer was to reshape for the f=8 latent grid in the final line:

    return normu.clip(-6, 6).view(1, -1, sideX//8, sideX//8)

The 1024 is never declared anywhere explicitly — `.view(1, -1, H, W)` infers the channel dimension from the total element count, so shrinking the spatial grid to 16×16 inflated the inferred channels to 1024, while the f=8 grid (32×32) yields the expected 256.
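To see how `.view` produces the two different channel counts, here is a minimal sketch. It assumes a 256×256 image (so `sideX = 256`) and a dummy tensor with the same element count as `normu` above; the names are illustrative, not from the original notebook:

```python
import torch

# Hypothetical sizes matching the question: sideX = sideY = 256,
# and normu holds 1 * 1024 * 256 = 262144 elements in total.
sideX = 256
normu = torch.zeros(1, 1024, 256)

# f=16 reshape: -1 is inferred as 262144 / (16*16) = 1024 channels
z16 = normu.view(1, -1, sideX // 16, sideX // 16)
print(z16.shape)  # torch.Size([1, 1024, 16, 16])

# f=8 reshape: -1 is inferred as 262144 / (32*32) = 256 channels
z8 = normu.view(1, -1, sideX // 8, sideX // 8)
print(z8.shape)  # torch.Size([1, 256, 32, 32])
```

In other words, the 1024-channel input the f=8 decoder complained about was just the leftover f=16 reshape, not anything defined in `Pars.__init__`.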