TypeError - RandomIoUCrop() requires input sample to contain tensor or PIL images and bounding boxes. Sample can also contain masks

Hi,
I am getting started with torchvision to train and evaluate object detection models. I am running into issues and need some help.
I am setting up the basic steps to evaluate a pre-trained model on the COCO 2017 dataset. Here is my Colab notebook.

I am using this example from PyTorch to set up the data loader for the COCO dataset:
https://pytorch.org/vision/main/auto_examples/transforms/plot_transforms_e2e.html#sphx-glr-auto-examples-transforms-plot-transforms-e2e-py

After the data loader steps, I added steps to train and evaluate the model using `engine.py`.

Any help resolving this issue would be appreciated, as would suggestions for alternative approaches.

Thanks,

Amit