Same DNN model produces different output on Raspberry Pi and Windows laptop

Hello Everyone,
I am running a YOLOv5 model on a Raspberry Pi 4. It has torch 1.8.0a0+37c1f4a, installed from a prebuilt wheel I found online. The same model also runs on my laptop, which has torch 1.8.0+cpu.
I am using YOLO to detect objects in a video. When I pass the same video frame to the model on the Raspberry Pi and on my laptop, the outputs are different. I have checked the input values of the frame, and they are the same on both platforms. I have also checked the weights of some layers of the pretrained YOLO model, and they are the same on both platforms. I cannot understand why the outputs differ, and the difference is significant.


I have drawn a red contour around the code where the difference happens. I am using the same code on both devices. The code comes from the following GitHub repo.

Thanks for your help.

Can you provide a small example of how the outputs differ so we can get a better idea of what the issue is? It may also be worthwhile to inspect the intermediate outputs of the model (and the “augmentations” depending on what is in opt.augment) to see where the differences first appear.
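One way to produce such an example is to save the output tensor on each device and compare the two files on one machine. The snippet below is only a rough sketch: `summarize_difference` is a hypothetical helper, the file names in the comments are made up, and the tolerances are arbitrary starting points. It uses random stand-in data so that it runs on its own.

```python
import torch


def summarize_difference(a, b, rtol=1e-4, atol=1e-5):
    """Report how two same-shaped tensors differ (hypothetical helper)."""
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {tuple(a.shape)} vs {tuple(b.shape)}")
    # Compare in float64 so the comparison itself adds no rounding noise.
    a = a.detach().cpu().double()
    b = b.detach().cpu().double()
    abs_diff = (a - b).abs()
    mismatched = ~torch.isclose(a, b, rtol=rtol, atol=atol)
    return {
        "max_abs_diff": abs_diff.max().item(),
        "mean_abs_diff": abs_diff.mean().item(),
        "num_mismatched": int(mismatched.sum().item()),
        "allclose": bool(torch.allclose(a, b, rtol=rtol, atol=atol)),
    }


# On each device one could run, for example (file names are assumptions):
#   torch.save(pred.cpu(), "pred_laptop.pt")   # on the laptop
#   torch.save(pred.cpu(), "pred_pi.pt")       # on the Raspberry Pi
# and then load both files on one machine with torch.load(...).

# Stand-in data with the shape mentioned in this thread.
torch.manual_seed(0)
laptop_out = torch.rand(1, 15120, 7)
pi_out = laptop_out.clone()
pi_out[0, 10, 4] += 0.5  # simulate one value that disagrees

report = summarize_difference(laptop_out, pi_out)
print(report)
```

Small differences (around 1e-6 to 1e-4 for float32) are normal between builds and CPU architectures; much larger values point to a real discrepancy.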


[Image: output tensor on the laptop.]


[Image: output tensor on the Raspberry Pi.] The input is the same video file, and opt.augment is false in both cases.

I’m not familiar with the shape dimensions here; what do 15120 and 7 correspond to?

Have you tried checking the intermediate outputs of the model (e.g., after each layer) to see where the differences start?
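A possible way to do this is with forward hooks, sketched below. This is not taken from the YOLOv5 repo: `capture_layer_outputs` and `first_divergence` are hypothetical helpers, the toy `nn.Sequential` model stands in for the real network, and the tolerances are guesses. On real hardware you would run `capture_layer_outputs` on each device, save the result with `torch.save`, and compare the two files on one machine.

```python
import copy

import torch
import torch.nn as nn


def capture_layer_outputs(model, x):
    """Run one forward pass and record each leaf module's tensor output in order."""
    records, handles = [], []

    def make_hook(name):
        def hook(module, inputs, output):
            # Non-tensor outputs (tuples, lists) are skipped in this sketch.
            if isinstance(output, torch.Tensor):
                # clone() protects the record from later in-place ops.
                records.append((name, output.detach().cpu().clone()))
        return hook

    for name, module in model.named_modules():
        if len(list(module.children())) == 0:  # leaf modules only
            handles.append(module.register_forward_hook(make_hook(name)))
    try:
        with torch.no_grad():
            model(x)
    finally:
        for handle in handles:
            handle.remove()
    return records


def first_divergence(records_a, records_b, rtol=1e-4, atol=1e-5):
    """Return the name of the first layer whose outputs disagree, or None."""
    for (name_a, out_a), (name_b, out_b) in zip(records_a, records_b):
        if name_a != name_b or out_a.shape != out_b.shape:
            return name_a
        if not torch.allclose(out_a, out_b, rtol=rtol, atol=atol):
            return name_a
    return None


# Toy stand-in: two copies of a small model, one with a perturbed last layer.
torch.manual_seed(0)
model_a = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(), nn.Conv2d(8, 4, 3)).eval()
model_b = copy.deepcopy(model_a)
with torch.no_grad():
    model_b[2].weight.add_(0.1)  # simulate a device-specific discrepancy

frame = torch.rand(1, 3, 16, 16)
rec_a = capture_layer_outputs(model_a, frame)
rec_b = capture_layer_outputs(model_b, frame)
print(first_divergence(rec_a, rec_b))
```

The first layer reported is where to look more closely, for example at that operator's implementation in the custom-built wheel.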

The network makes 15120 predictions for each image. Every prediction has 7 parameters: 4 for the bounding box, one for object confidence, and two for class confidence. It detects objects of only two classes in every image. I have not checked the output at each layer yet. I think I have to do that.
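For reference, here is a small sketch of that layout. The helper names are hypothetical, and the 384x640 input size, the strides (8, 16, 32) and the 3 anchors per grid cell are assumptions that happen to give exactly 15120 predictions; the actual input size is not stated in this thread.

```python
import torch

NUM_CLASSES = 2  # two classes, as described above


def split_predictions(pred):
    """Split a (batch, num_predictions, 5 + num_classes) tensor into its parts."""
    boxes = pred[..., :4]        # 4 bounding-box parameters
    objectness = pred[..., 4]    # object confidence
    class_conf = pred[..., 5:]   # one confidence per class
    # Combined score per class: object confidence times class confidence.
    scores = objectness.unsqueeze(-1) * class_conf
    return boxes, objectness, class_conf, scores


def count_predictions(height, width, strides=(8, 16, 32), anchors_per_cell=3):
    """Number of predictions for an input size (assumed strides and anchors)."""
    return sum((height // s) * (width // s) * anchors_per_cell for s in strides)


# Stand-in tensor with the shape from this thread.
pred = torch.rand(1, 15120, 5 + NUM_CLASSES)
boxes, objectness, class_conf, scores = split_predictions(pred)
print(boxes.shape, objectness.shape, class_conf.shape, scores.shape)

# 3 * (48*80 + 24*40 + 12*20) = 15120 for an assumed 384x640 input.
print(count_predictions(384, 640))
```

When comparing the two devices, it may help to look at the box columns and the confidence columns separately, since they can disagree by different amounts.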