Loading training data into a Dataset for Mask R-CNN

I want to train my own torchvision.models.detection.maskrcnn_resnet50_fpn model on my own data. How should I organize the data and load it into a Dataset / DataLoader?
The example in https://pytorch.org/docs/stable/torchvision/models.html#mask-r-cnn is a toy example with random data.
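To make the question concrete, here is a minimal sketch of what I think the model wants from a Dataset, based on the input format described in the docs: each item is an image tensor plus a target dict with "boxes", "labels" and "masks". The class name ToyMaskDataset, the helper collate_fn, and the synthetic rectangle are my own placeholders standing in for real file loading, so please treat this as an assumption rather than a known-good approach.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ToyMaskDataset(Dataset):
    """Hypothetical dataset: synthetic images with one rectangular object each."""

    def __init__(self, num_samples=4, height=64, width=64):
        self.num_samples = num_samples
        self.height = height
        self.width = width

    def __len__(self):
        return self.num_samples

    def __getitem__(self, idx):
        # Stand-in for reading an image file: float tensor [3, H, W] in [0, 1].
        image = torch.rand(3, self.height, self.width)

        # Stand-in for reading a mask file: one binary mask per object, [N, H, W].
        masks = torch.zeros((1, self.height, self.width), dtype=torch.uint8)
        masks[0, 10:30, 20:50] = 1

        # Box in (x1, y1, x2, y2) order, derived from the mask's nonzero pixels.
        ys, xs = torch.where(masks[0] > 0)
        boxes = torch.stack([xs.min(), ys.min(), xs.max(), ys.max()])
        boxes = boxes.unsqueeze(0).float()

        target = {
            "boxes": boxes,                                  # FloatTensor [N, 4]
            "labels": torch.ones((1,), dtype=torch.int64),   # class ids, 0 = background
            "masks": masks,                                  # UInt8Tensor [N, H, W]
        }
        return image, target


def collate_fn(batch):
    # Images may differ in size, so keep them as tuples instead of stacking.
    return tuple(zip(*batch))


loader = DataLoader(ToyMaskDataset(), batch_size=2, collate_fn=collate_fn)
images, targets = next(iter(loader))
```

If I read the docs correctly, in training mode the model would then be called as model(list(images), list(targets)) and return a dict of losses. What I am unsure about is how to replace the synthetic parts with my real images and masks.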

Thanks