As a result, when I run the model in Caffe2, it always gives me the same output that it produced for the image I used to export the network.
My questions are:

The PyTorch docs state that torch.cat is supported in ONNX export. If so, why is the ONNX export returning a constant array instead of concatenated tensors?

Is there some special/secret ingredient that I am missing here?

Any/all help is appreciated, as I have been stuck on this for quite a while.

@Rizhao_Cai I don't remember the exact code/bug, since it has been a few months. But basically, somewhere in the code I was detaching the tensor from the graph that PyTorch uses to compute gradients (autograd).
Example:

As a result of .detach(), PyTorch removes that tensor from the backprop calculation, and the PyTorch-to-ONNX export treats such a tensor as a constant: it just remembers the values instead of the mathematical operation that produced them. See the documentation of the detach() method.
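One way to check for this kind of problem without going through ONNX is to trace the module with one input and then run the trace on a different one. Below is a minimal sketch, assuming the export relies on the same kind of tracing as torch.jit.trace (as far as I know that holds for the tracing-based torch.onnx.export path). CleanCat, LeakyCat and trace_follows_input are made-up names for illustration. The tolist() round trip is only a stand-in for whatever took the tensor out of the graph in my code, since whether .detach() itself gets baked in as a constant may depend on the PyTorch version.

```python
import warnings

import torch
import torch.nn as nn


class CleanCat(nn.Module):
    """Every operand of torch.cat stays inside the traced graph."""

    def forward(self, x):
        return torch.cat([x, x * 2], dim=1)


class LeakyCat(nn.Module):
    """One operand leaves the graph, so the tracer only sees its values."""

    def forward(self, x):
        # Round trip through a Python list: the tracer cannot follow this,
        # so `frozen` is recorded as a constant taken from the example input.
        frozen = torch.tensor(x.tolist())
        return torch.cat([x, frozen * 2], dim=1)


def trace_follows_input(model, example, other):
    """Trace with `example`, then compare traced vs. eager output on `other`."""
    with warnings.catch_warnings():
        # The tracer warns about the list round trip; silence it for the demo.
        warnings.simplefilter("ignore")
        traced = torch.jit.trace(model, example, check_trace=False)
    return torch.allclose(traced(other), model(other))


example = torch.ones(1, 3)       # plays the role of the image used for export
other = torch.full((1, 3), 5.0)  # a different input seen at inference time

print("clean module follows new input:", trace_follows_input(CleanCat(), example, other))
print("leaky module follows new input:", trace_follows_input(LeakyCat(), example, other))
```

If the second check fails for your own model, some tensor was probably frozen at export time, which would match the symptom of always getting the output for the export image.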

The .data operation is a bit different. Since each tensor is an instance of the Tensor class, it has a .data attribute that holds the values of the tensor. I am not sure how .data behaves during PyTorch-to-ONNX export. You can run a small experiment to test it out.
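As a starting point for such an experiment, here is a small sketch of what .detach() and .data look like from the autograd side. It says nothing about ONNX export. It only shows that both return a tensor that shares memory with the original but carries no gradient history. The variable names are arbitrary.

```python
import torch

w = torch.ones(3, requires_grad=True)
y = w * 2            # y is part of the autograd graph

d = y.detach()       # detached view of y
r = y.data           # raw values of y

# y remembers the operation that produced it; d and r do not.
print(y.requires_grad, y.grad_fn is not None)
print(d.requires_grad, d.grad_fn is None)
print(r.requires_grad, r.grad_fn is None)

# Both share the same underlying memory as y (no copy is made).
print(d.data_ptr() == y.data_ptr(), r.data_ptr() == y.data_ptr())
```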