DiffusionInst training with the PubLayNet dataset

I am trying to run instance segmentation training on the PubLayNet dataset with this GitHub codebase: https://github.com/chenhaoxing/DiffusionInst (the code for the paper "DiffusionInst: Diffusion Model for Instance Segmentation", ICASSP'24).
It implements a diffusion model on top of Detectron2 for instance segmentation and was originally trained on the COCO dataset. However, when I train it on PubLayNet, it does not segment well: the loss decreases to about 2.5 after 250,000 iterations, but the trained model outputs the whole image as a single segment instead of segmenting the objects individually. Before feeding in the data I also checked the PubLayNet documents and bounding boxes, and they seem to fit well. What could be the possible issue here?
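For reference, this is roughly how I checked the ground truth, by rendering a few samples through Detectron2's own data loading (the dataset name "publaynet_train" and the paths below are placeholders for my actual setup):

```python
import random
import cv2
from detectron2.data.datasets import register_coco_instances
from detectron2.data import DatasetCatalog, MetadataCatalog
from detectron2.utils.visualizer import Visualizer

# PubLayNet ships COCO-format annotations, so the standard registration works.
register_coco_instances(
    "publaynet_train", {},
    "publaynet/train.json",  # placeholder annotation path
    "publaynet/train",       # placeholder image root
)

dataset_dicts = DatasetCatalog.get("publaynet_train")
metadata = MetadataCatalog.get("publaynet_train")

for d in random.sample(dataset_dicts, 3):
    img = cv2.imread(d["file_name"])
    # Visualizer expects RGB; cv2 reads BGR, hence the channel flip.
    vis = Visualizer(img[:, :, ::-1], metadata=metadata, scale=0.5)
    out = vis.draw_dataset_dict(d)  # draws GT boxes, masks, and class labels
    cv2.imwrite(f"gt_{d['image_id']}.jpg", out.get_image()[:, :, ::-1])
```

One thing I am not sure about: if `draw_dataset_dict` were drawing only boxes and no masks here, would that mean the `segmentation` field is not being loaded, and could that explain the single-segment output?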

Another question: training the model takes at least 3 days for me, so I can only see whether it has trained properly after 3 days. How do I debug it without spending so much time training on the whole dataset? Any pointers would be helpful.
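For example, would something along these lines be a reasonable way to get a fast signal, by overfitting a small fixed subset before committing to a full run? (The subset name and sizes are just placeholders, and DiffusionInst's training script may wire the config differently.)

```python
from detectron2.data import DatasetCatalog, MetadataCatalog

# Register a tiny debug subset that reuses the first 50 images of the
# already-registered full training set.
full = DatasetCatalog.get("publaynet_train")  # triggers the COCO json load
DatasetCatalog.register("publaynet_tiny", lambda: full[:50])
MetadataCatalog.get("publaynet_tiny").set(
    thing_classes=MetadataCatalog.get("publaynet_train").thing_classes
)

# Then point the training config at it and shrink the schedule, e.g.:
#   cfg.DATASETS.TRAIN = ("publaynet_tiny",)
#   cfg.DATASETS.TEST = ("publaynet_tiny",)
#   cfg.SOLVER.MAX_ITER = 500
#   cfg.TEST.EVAL_PERIOD = 100
# and make sure the model's NUM_CLASSES matches PubLayNet's 5 categories
# (text, title, list, table, figure) in whichever config key DiffusionInst uses.
```

My thinking is that if the model cannot produce per-object masks even after overfitting 50 images, the problem is in the config or the data rather than the amount of training.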