Object counting using custom images with Pascal VOC-style annotations

I have a set of PNG images, and I have created a Pascal VOC XML annotation file for each image. I’m looking for a reference on how to create a data loader and train a model to do object counting.
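For context, here is a minimal sketch of the data-reading half of what I have in mind, using only the Python standard library. It assumes the usual Pascal VOC layout (one `<object>` element per instance, each with a `<name>` child) and that each XML file shares its file stem with its PNG. The function names `count_objects` and `build_index` and the two directory arguments are hypothetical, not taken from any library.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path


def count_objects(xml_source):
    """Return a Counter mapping class name -> number of <object> entries.

    xml_source may be a file path or an open file-like object.
    Assumes the standard Pascal VOC structure: <annotation><object><name>.
    """
    root = ET.parse(xml_source).getroot()
    return Counter(
        obj.findtext("name", default="unknown") for obj in root.iter("object")
    )


def build_index(image_dir, annotation_dir):
    """Pair each PNG with the counts from its same-named XML file.

    Assumption: 'img001.png' is annotated by 'img001.xml'.
    Images without an annotation file are skipped.
    """
    samples = []
    for png in sorted(Path(image_dir).glob("*.png")):
        xml_path = Path(annotation_dir) / (png.stem + ".xml")
        if xml_path.exists():
            samples.append((png, count_objects(xml_path)))
    return samples
```

Each entry of the resulting list is an (image path, per-class count) pair, which is the kind of target I would expect a counting data loader to return for each sample. What I am missing is how to wrap this in a proper data loader and which model and loss are suitable for training on it.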