I implemented YOLOv3 using libtorch and trained it on a Tesla V100 on a server, repeating training until recall was high enough. However, after converting the model to a CPU version and saving it, I ran the same code on my laptop and found that recall was extremely low. So I set up a comparison experiment: using exactly the same model, code, and training set on both machines, the recall reported on the server was still very different from the recall on my laptop. The first screenshot below is the result on the server (Ubuntu, CPU mode, libtorch 1.5.0), and the second is the result on my laptop (Windows 10, also CPU mode, libtorch 1.5.0). The difference in the metrics is obvious.
This result is completely baffling to me. I have controlled the variables as much as possible, so the only remaining differences are the two operating systems and the underlying hardware. I don't know whether this is a normal phenomenon, but if results really differ this much across systems, how is one supposed to transplant pre-trained weights at all?
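For reference, the save/load pattern I would expect to be portable across machines is the usual device-agnostic one. Below is a simplified Python/PyTorch sketch (my real code uses the libtorch C++ API, which is analogous), with a toy model standing in for YOLOv3; the model, file name, and shapes here are illustrative only, not my actual code:

```python
import torch
import torch.nn as nn

def make_model() -> nn.Sequential:
    # Toy stand-in for YOLOv3, just to demonstrate the checkpoint round-trip.
    return nn.Sequential(
        nn.Conv2d(3, 8, 3, padding=1),
        nn.ReLU(),
        nn.Conv2d(8, 2, 1),
    )

model = make_model()
model.eval()

# Move every tensor to CPU before saving so the checkpoint is device-agnostic.
cpu_state = {k: v.cpu() for k, v in model.state_dict().items()}
torch.save(cpu_state, "yolo_cpu_weights.pt")

# On the other machine: map everything onto CPU when loading.
state = torch.load("yolo_cpu_weights.pt", map_location="cpu")
model2 = make_model()
model2.load_state_dict(state)
model2.eval()

# The two copies should produce (near-)identical outputs on the same input.
x = torch.randn(1, 3, 32, 32)
with torch.no_grad():
    assert torch.allclose(model(x), model2(x), atol=1e-6)
```

If I understand the C++ side correctly, passing a `torch::Device` to `torch::load` plays the same role as `map_location` here. With this pattern the loaded weights should be bit-identical, so I would only expect small floating-point discrepancies across OS/hardware, not a recall collapse.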