What here is defining the channels in the initialization?

I’ve been trying to adjust this code to switch the model from VQGAN f=16 to f=8, but something seems wrong with the size of the latent code and it fails to initialize.

Here’s the code where the error occurs:

class Pars(torch.nn.Module):
    def __init__(self):
        super(Pars, self).__init__()

        self.normu = torch.nn.Parameter(o_i2.cuda().clone().view(batch_size, 1024, sideX//16 * sideY//16))
        self.ignore = torch.empty(0,).long().cuda()
        self.keep = torch.empty(0,).long().cuda()
        self.keep_indices = torch.empty(0,).long().cuda()

    def forward(self):
        mask = torch.ones(self.normu.shape, requires_grad=False).cuda()
        mask[:, :, self.ignore] = 1
        normu = self.normu * mask
        normu.scatter_(2, self.ignore.unsqueeze(0).unsqueeze(0).expand(-1, 256, -1), self.keep.detach())

        return normu.clip(-6, 6).view(1, -1, sideX//16, sideX//16)

When I run Pars().cuda(), the output shape is [1, 1024, 16, 16], but I think it should be [1, 256, 16, 16], since I get the error:

RuntimeError: Given groups=1, weight of size [256, 256, 1, 1], expected input[1, 1024, 16, 16] to have 256 channels, but got 1024 channels instead

I can’t figure out where that channel count of 1024 is being defined.

The answer was to reshape for the f=8 latent grid in the final line:

    return normu.clip(-6, 6).view(1, -1, sideX//8, sideX//8)

The 1024 is never declared anywhere explicitly — `.view(1, -1, H, W)` infers the channel dimension from the total element count, so shrinking the spatial grid to 16×16 inflated the inferred channels to 1024, while the f=8 grid (32×32) yields the expected 256.
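To see how `.view` produces the two different channel counts, here is a minimal sketch. It assumes a 256×256 image (so `sideX = 256`) and a dummy tensor with the same element count as `normu` above; the names are illustrative, not from the original notebook:

```python
import torch

# Hypothetical sizes matching the question: sideX = sideY = 256,
# and normu holds 1 * 1024 * 256 = 262144 elements in total.
sideX = 256
normu = torch.zeros(1, 1024, 256)

# f=16 reshape: -1 is inferred as 262144 / (16*16) = 1024 channels
z16 = normu.view(1, -1, sideX // 16, sideX // 16)
print(z16.shape)  # torch.Size([1, 1024, 16, 16])

# f=8 reshape: -1 is inferred as 262144 / (32*32) = 256 channels
z8 = normu.view(1, -1, sideX // 8, sideX // 8)
print(z8.shape)  # torch.Size([1, 256, 32, 32])
```

In other words, the 1024-channel input the f=8 decoder complained about was just the leftover f=16 reshape, not anything defined in `Pars.__init__`.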