Tensor raw memory organization


I am developing models that are exported through ONNX and run with TensorRT. However, I am not getting the same results (not even similar ones) for the same images across the two executors (native PyTorch and TensorRT). My first hunch was data loading and memory organization.

I want to compare the memory fed into executors and be sure that different executors work over the same data.

The questions:
How is a tensor organized in raw memory?
Where is "up" in an image loaded by `torchvision.io.read_image` — is the origin in the upper-left corner or the lower-left corner?
How do I print out the raw memory of a tensor? One thing I tried was `print(X.data_ptr().to_bytes(batch*channels*height*width*4, sys.byteorder))` (4 bytes per float); however, I got a long run of `\x00\x00\x00\x00\x00\x00\x00\x00…`, which seems incorrect.
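A note on why that prints zeros: `data_ptr()` returns the memory *address* as a Python int, and `int.to_bytes()` serializes that address value (zero-padded to the requested length), not the buffer it points to. A sketch of one safe way to dump the actual bytes, using a small illustrative tensor:

```python
import torch

# Small stand-in tensor for the model input (hypothetical shape).
X = torch.arange(2 * 3 * 2 * 2, dtype=torch.float32).reshape(2, 3, 2, 2)

# data_ptr() is just the address; to_bytes() on it encodes the address
# integer itself, padded with \x00 -- it never reads the tensor's memory.
# To get the real buffer contents, round-trip through numpy:
raw = X.contiguous().cpu().numpy().tobytes()

print(len(raw))        # batch * channels * height * width * 4 bytes = 96
print(raw[:16].hex())  # the first four float32 values, in native byte order
```

`contiguous()` matters: for a sliced or permuted tensor, the underlying buffer does not match the logical element order, and `tobytes()` on the numpy view gives you the logical (row-major) order you actually want to compare.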

In TensorRT the memory is organized as follows:

  • batch offset → w*h*c*i, where i is the batch index
  • channel offset → batch offset + w*h*i, where i is the channel index
  • row offset → channel offset + w*i, where i is the row index
  • data element (column element) → row offset + i, where i is the column index

This assumes the pointer is of the matching data type; w is the width, h is the height, c is the number of channels, and the origin of the image is the upper-left corner.
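The offset scheme above is exactly row-major (C-contiguous) NCHW indexing, which is also PyTorch's default. A minimal sketch that checks the formula against a contiguous PyTorch tensor (shape values are illustrative):

```python
import torch

# Verify that the TensorRT linear-offset formula matches PyTorch's
# default contiguous NCHW layout.
n, c, h, w = 2, 3, 4, 5
X = torch.arange(n * c * h * w, dtype=torch.float32).reshape(n, c, h, w)
flat = X.flatten()  # the raw linear order of a contiguous tensor

for i_n in range(n):
    for i_c in range(c):
        for i_h in range(h):
            for i_w in range(w):
                # batch offset + channel offset + row offset + column index
                offset = i_n * (c * h * w) + i_c * (h * w) + i_h * w + i_w
                assert flat[offset] == X[i_n, i_c, i_h, i_w]

print("contiguous NCHW matches the linear-offset formula")
```

So if both sides really see the same bytes, the layouts agree; a mismatch is more likely to come from preprocessing (normalization, RGB/BGR order, resizing) than from the memory organization itself.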

By default PyTorch uses the channels-first (NCHW) layout, i.e. [batch_size, channels, height, width], stored contiguously in row-major order — the same linear layout you describe for TensorRT. `torchvision.io.read_image` also returns the image with the origin at the upper-left corner (row 0 is the top row).
I'm not familiar with your use case, but wouldn't the torch_tensorrt module work?
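You can confirm the layout directly from the strides: for a contiguous NCHW tensor they are (C*H*W, H*W, W, 1), i.e. neighboring columns are adjacent in memory. A quick check (the 480×640 shape is just an example):

```python
import torch

# Strides of a contiguous NCHW tensor reveal the row-major layout:
# moving one column = 1 element, one row = W elements, one channel = H*W, etc.
X = torch.zeros(1, 3, 480, 640)
print(X.stride())          # (921600, 307200, 640, 1)
print(X.is_contiguous())   # True
```

If `is_contiguous()` is False (e.g. after a `permute`), call `.contiguous()` before exporting or comparing raw memory.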