Strange output when implementing DAB-DETR

I am using PyTorch to implement a simplified version of DAB-DETR. When I train on 10 images with a batch size of 2, the results in training mode look reasonable. In eval mode, however, the model sometimes outputs bounding boxes that belong to other images. To investigate, I ran another test in which I changed the order of the input images, and the problem still appears. I really can't figure out what is going on.
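To make the reordering test concrete, here is a rough, framework-agnostic sketch of the check, not the actual code from my repository. `predict` is a hypothetical callable that takes a list of images and returns one list of numbers per image (for example, flattened box coordinates). With PyTorch it would wrap the model after calling `model.eval()` and run under `torch.no_grad()`. The two toy models are made up purely to show what a passing and a failing result look like.

```python
from typing import Callable, List, Sequence

Sample = List[float]


def batch_order_mismatches(
    predict: Callable[[List[Sample]], List[Sample]],
    batch: Sequence[Sample],
    perm: Sequence[int],
    tol: float = 1e-5,
) -> List[int]:
    """Return original indices of samples whose output changes when the batch is reordered."""
    base = predict(list(batch))
    shuffled = predict([batch[i] for i in perm])
    mismatches = []
    for new_pos, old_pos in enumerate(perm):
        a, b = base[old_pos], shuffled[new_pos]
        # A per-sample model must give the same output wherever the sample sits.
        if len(a) != len(b) or any(abs(x - y) > tol for x, y in zip(a, b)):
            mismatches.append(old_pos)
    return sorted(mismatches)


def independent_model(batch: List[Sample]) -> List[Sample]:
    # Toy stand-in: each output depends only on its own input.
    return [[2.0 * x for x in sample] for sample in batch]


def leaky_model(batch: List[Sample]) -> List[Sample]:
    # Toy stand-in with a deliberate bug: every output depends on the first sample.
    offset = batch[0][0]
    return [[x + offset for x in sample] for sample in batch]


if __name__ == "__main__":
    batch = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    perm = [2, 0, 1]
    print(batch_order_mismatches(independent_model, batch, perm))
    print(batch_order_mismatches(leaky_model, batch, perm))
```

If the returned list is non-empty in eval mode, the output for an image depends on the other images in the batch or on its position, which would point to information leaking across the batch dimension (for example through a reshape, masking, or indexing step) rather than to a training problem.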

The code is too long to post here, so here is my GitHub link:

https://github.com/Zhong-Zi-Zeng/Simple-Version-of-DAB-DETR.git

The results look like this: