Load an ONNX model with ConvTranspose2d and groups > 1

A pretrained model has the following layer:

nn.ConvTranspose2d(64, 64, kernel_size=2, 
      stride=2, padding=0, output_padding=0, groups=64, bias=False)

The model was converted to ONNX.
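For reference, a minimal export sketch (the wrapper module, input size, and file name are placeholders, not the actual model):

import torch
import torch.nn as nn

# Hypothetical single-layer wrapper just to reproduce the export; the real
# pretrained model contains this layer among others.
class Upsample(nn.Module):
    def __init__(self):
        super().__init__()
        self.up = nn.ConvTranspose2d(64, 64, kernel_size=2, stride=2,
                                     padding=0, output_padding=0,
                                     groups=64, bias=False)

    def forward(self, x):
        return self.up(x)

dummy = torch.randn(1, 64, 15, 20)  # matches the idim reported below
torch.onnx.export(Upsample(), dummy, "model.onnx", opset_version=11)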

During compilation with the command:

./build/bin/model-compiler -g -model model.onnx -emit-bundle ./bundle -backend=CPU

an assertion fails:

  assert(filterDims.n % group == 0 && filterDims.h == kdim.height &&
         filterDims.w == kdim.width && filterDims.c == idim.c / group &&
         "Invalid filter dims");

In my case:

idim = (1, 64, 15, 20)
filterDims = (64, 1, 2, 2)
group = 64

so filterDims.n % group != 0 and filterDims.c != idim.c / group.
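The filter shape is exactly what PyTorch (and the ONNX ConvTranspose spec) use for grouped transposed convolutions, i.e. (in_channels, out_channels // groups, kH, kW); a quick check of the layer above:

import torch.nn as nn

layer = nn.ConvTranspose2d(64, 64, kernel_size=2, stride=2, padding=0,
                           output_padding=0, groups=64, bias=False)
# (in_channels, out_channels // groups, kH, kW) -> (64, 1, 2, 2)
print(tuple(layer.weight.shape))

How Glow transposes this weight into its own NHWC filter layout before the assert runs is not shown here.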

What is wrong with the layer parameters, and what exactly does the assertion check?

What does model-compiler do, and when is it used?
Do you know how kdim is defined and what it stands for?
It seems to be another placeholder for the kernel whose shape is compared against the weight tensor of the transposed convolution?

I’m testing the Glow compiler; model-compiler creates a model bundle and can also profile the model.
kdim means “kernel dim”.

Sorry, I should have posted the whole method:

static void assertConvTransposeDims(NodeValue input, NodeValue filter,
                                    NodeValue bias,
                                    llvm::ArrayRef<unsigned_t> kernels,
                                    llvm::ArrayRef<unsigned_t> strides,
                                    llvm::ArrayRef<unsigned_t> pads,
                                    unsigned_t group) {
  ShapeNHWC idim = ShapeNHWC(input.dims());
  (void)idim;
  ShapeHW kdim(kernels);
  (void)kdim;
  assert(idim.c % group == 0 && "channels number must be divisible by groups");

  // NOTE: here the N in NHWC is abnormal because it is the number of filters
  // (and therefore the number of output channels of the conv) and not the
  // batch size. The rest of the dimensions are representative of the input
  // dimensions to the convolution.
  ShapeNHWC filterDims(filter.dims());
  (void)filterDims;

  assert(filterDims.n % group == 0 && filterDims.h == kdim.height &&
         filterDims.w == kdim.width && filterDims.c == idim.c / group &&
         "Invalid filter dims");

  assert(bias.getType()->size() == filterDims.n && "Invalid bias size");
}

It looks like ConvTranspose2d must have out_channels=in_channels*group to pass the assertion.
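For reference, here is a small Python mirror of the three filter comparisons in assertConvTransposeDims; how the exported weight is mapped onto Glow's NHWC fields after import is an assumption, so the example call is only illustrative:

def conv_transpose_filter_checks(filter_nhwc, kernels, in_channels, group):
    # Mirrors the C++ assert above: filter_nhwc is (N, H, W, C) as Glow
    # interprets filter.dims(), kernels is (kH, kW).
    n, h, w, c = filter_nhwc
    kh, kw = kernels
    return {
        "n % group == 0": n % group == 0,
        "h == kH": h == kh,
        "w == kW": w == kw,
        "c == idim.c / group": c == in_channels // group,
    }

# Hypothetical mapping of the reported filterDims onto NHWC fields.
print(conv_transpose_filter_checks((64, 1, 2, 2), (2, 2), 64, 64))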

in_channels and out_channels should both be divisible by groups.
If your code is working fine in PyTorch, I’m a bit unsure whether Glow adds some other checks to transposed convolutions.

Did you narrow it down to an nn.ConvTranspose2d layer that creates the issue in the export?
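For the divisibility rule above, PyTorch already enforces it at construction time, so any layer that builds at all satisfies it; a quick sanity check (sketch):

import torch.nn as nn

# 64 is divisible by 64, so the original layer constructs fine.
nn.ConvTranspose2d(64, 64, kernel_size=2, stride=2, groups=64, bias=False)

# PyTorch raises a ValueError when in_channels or out_channels is not
# divisible by groups, e.g. out_channels=48 with groups=64:
try:
    nn.ConvTranspose2d(64, 48, kernel_size=2, stride=2, groups=64, bias=False)
except ValueError as e:
    print(e)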


Moreover, the same check works for Conv2d layers.

I managed to convert the model with group=1. Obviously, I have to retrain the model with the new layer.

Anyway, I still wonder what’s wrong with the original case from my initial post.
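The replacement I ended up with is simply the non-grouped variant below; note that it has 64 times as many weights ((64, 64, 2, 2) instead of (64, 1, 2, 2)), which is why retraining is required:

import torch.nn as nn

# groups=1 variant that converted successfully; same 2x upsampling,
# but a full (non-depthwise) transposed convolution.
up = nn.ConvTranspose2d(64, 64, kernel_size=2, stride=2, padding=0,
                        output_padding=0, groups=1, bias=False)
print(tuple(up.weight.shape))  # (64, 64, 2, 2)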

That’s what I’m wondering. If your model works fine in your PyTorch code, I would assume that it should also pass all Glow checks.
Are you seeing this error for all settings other than groups=1?

I’ve tested with group=2 and group=32; it fails on the same assertion, assertConvTransposeDims.
If I comment out the assertion, it fails somewhere later.

Should I create a GitHub issue?


Yes, please create an issue, as I’m not familiar enough with Glow’s internals.

GitHub issue