As models grow more complex, I increasingly feel the need for classes that group multiple inputs. For instance, in DETR-based models adapted to 3D object detection, the queries might comprise:
- content as (N, B, D) tensor
- 3D positional encodings as (N, B, D) tensor
- 2D positional encodings, where the 3D positions have been projected into some camera and then encoded, as (N, B, D) tensor
To that end, I designed a dataclass that holds these three attributes and provides some utilities (deep copies, PyTorch-like indexing, …). This also enables substantially more informative type hints. However, I cannot export such models to ONNX: for ONNX export, each module can take only Tensors, Tuples, Lists, and Dicts as input, and Dicts are even advised against. Would it be possible to combine the best of both worlds: (i) clean code for complex models; and (ii) the ability to export to ONNX?
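To make the setup concrete, here is a minimal sketch of the kind of dataclass I mean. The names (`Queries`, `pos_3d`, `pos_2d`, `flatten`, `unflatten`) are hypothetical and chosen only for illustration; my actual class differs in its details. The `flatten`/`unflatten` pair is just one idea I could imagine for crossing the export boundary with plain tuples of tensors, not something I know to be the recommended approach.

```python
from __future__ import annotations

from dataclasses import dataclass, fields
from typing import TYPE_CHECKING, Any, Tuple

if TYPE_CHECKING:
    # Only needed for the type hints; annotations are not evaluated at runtime.
    import torch


@dataclass
class Queries:
    """Hypothetical container for the three query tensors, each of shape (N, B, D)."""

    content: torch.Tensor
    pos_3d: torch.Tensor  # 3D positional encodings
    pos_2d: torch.Tensor  # encodings of the 3D positions projected into a camera

    def __getitem__(self, index: Any) -> Queries:
        # PyTorch-like indexing: apply the same index to every field.
        return Queries(*(getattr(self, f.name)[index] for f in fields(self)))

    def flatten(self) -> Tuple[torch.Tensor, ...]:
        # Plain tuple of the fields in declaration order.
        return tuple(getattr(self, f.name) for f in fields(self))

    @classmethod
    def unflatten(cls, tensors: Tuple[torch.Tensor, ...]) -> Queries:
        # Inverse of flatten(): rebuild the dataclass from a plain tuple.
        return cls(*tensors)
```

With something like this, I could imagine a thin export-only wrapper module whose `forward` takes the three tensors as separate arguments, calls `Queries.unflatten`, and delegates to the real model. I have not verified whether that works cleanly with ONNX export, and it feels like boilerplate I would rather avoid.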
Any advice would be much appreciated!