I am using a pre-trained PyTorch model (NCHW format), but my acceleration platform requires the model in NHWC format.
Is there an easy way to convert a PyTorch model to NHWC format?
I have permuted the weights by fetching them from PyTorch's state_dict() method, like so:
params = {}
for key, value in model.state_dict().items():
    if 'conv' in key:
        params[key] = value.permute(0, 2, 3, 1)
But I am unable to repopulate the model with the permuted dictionary; model.load_state_dict(params) gives:
size mismatch for stage4.2.branches.2.3.conv1.weight: copying a param with shape torch.Size([128, 3,
3, 128]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]).
To resolve this, how can I define the layers of the new model in NHWC format in PyTorch?
x = torch.randn(1, 128, 128, 3)
# your order
x.permute(0,2,3,1).shape # torch.Size([1, 128, 3, 128])
# correct order
x = x.permute(0, 3, 1, 2)
x.shape # torch.Size([1, 3, 128, 128])
And the error corresponds to this issue.
I am still not sure whether it would work correctly after the channel change, because of the forward method. For instance, concatenation, squeezing, and any other operation that takes a dim argument, if present in the forward function, may cause issues. You may need to override the forward function to account for the channel changes.
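As a hypothetical illustration of the dim-argument problem: a channel concatenation uses dim=1 under NCHW, but after permuting everything to NHWC the same forward code would need dim=3.

```python
import torch

a = torch.randn(1, 3, 4, 4)  # NCHW feature maps
b = torch.randn(1, 5, 4, 4)

# NCHW: channels live at dim 1
out = torch.cat([a, b], dim=1)
print(out.shape)  # torch.Size([1, 8, 4, 4])

# After permuting to NHWC, channels move to dim 3,
# so the concatenation axis must change too.
a_nhwc = a.permute(0, 2, 3, 1)
b_nhwc = b.permute(0, 2, 3, 1)
out_nhwc = torch.cat([a_nhwc, b_nhwc], dim=3)
print(out_nhwc.shape)  # torch.Size([1, 4, 4, 8])
```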
What was the issue with this tutorial? I think it is ok as you can send all layers to channel last or channel first mode. I am not sure I am missing something here.
It appears memory_format = torch.channels_last does not convert the layers/input to NHWC dimension order; it does something different. The PyTorch Channels Last Memory Format page also doesn't mention NHWC anywhere.
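For what it's worth, a small sketch of what channels_last actually does: the tensor's shape stays in NCHW dimension order, and only the strides (the physical memory layout) become NHWC-like.

```python
import torch

x = torch.randn(1, 3, 128, 128)  # NCHW dimension order
y = x.to(memory_format=torch.channels_last)

print(y.shape)     # torch.Size([1, 3, 128, 128]) -- dims are unchanged
print(x.stride())  # (49152, 16384, 128, 1) -- contiguous NCHW strides
print(y.stride())  # (49152, 1, 384, 3)    -- NHWC-style memory layout
```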
I hope I have put my requirement correctly. Thanks.
Can you please clarify what you mean by acceleration platform in this case? PyTorch operators (and modules) require CV tensors to be in a specific indexing order, NCHW. To use accelerated NHWC kernels we preserve the dimension order but lay the tensor out in memory differently.
The acceleration platform is a custom processor. I am currently using the PyTorch model, converting it to ONNX/Keras, and then porting it to the processor for inference. But I face this challenge of NCHW vs NHWC.
If possible, can you please shed some light on the possibility of updating the model definition after permuting the layers, as mentioned in my original comment?
All PyTorch operators are written to take NCHW as the dimension order. There is no way to change this (you can only change the memory format, i.e. how the tensor is laid out in memory).
If you really want to change the order of dimensions, you would need to permute each model parameter manually. Take into account that your model will no longer work in PyTorch.
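A minimal sketch of that manual permutation, using a toy Conv2d as a stand-in for the real pre-trained network. The resulting dict is only useful for export to the external platform; load_state_dict will reject the permuted shapes.

```python
import torch
import torch.nn as nn

# Toy stand-in for the pre-trained network.
model = nn.Sequential(nn.Conv2d(3, 16, kernel_size=5))

nhwc_state = {}
for key, value in model.state_dict().items():
    if value.dim() == 4:
        # Conv weights: (O, I, H, W) -> (O, H, W, I)
        nhwc_state[key] = value.permute(0, 2, 3, 1).contiguous()
    else:
        nhwc_state[key] = value  # biases etc. are unchanged

print(nhwc_state['0.weight'].shape)  # torch.Size([16, 5, 5, 3])
```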