Should all nodes be running on the same pytorch version for multi-node training?

Do all of the nodes I'm going to use for multi-node training need to run the same PyTorch version? And do they also need to have the same versions of other libraries?

For example, if one of my nodes is running torch==1.10.1, do all of my other nodes also need to run torch==1.10.1, or can they run another torch version such as torch==1.13.1?
Similarly, if one of my nodes is running transformers==4.20.1, do all of my other nodes also need to run transformers==4.20.1, or can they run another Transformers version such as transformers==4.26.1?

I would stick to the same versions of all libraries on every node. I doubt anyone tests mixed configurations, so they could easily break.
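If it helps, here is a minimal sketch of how you could fail fast when versions diverge. The helper `assert_uniform_versions` is hypothetical (not part of PyTorch or Transformers); it just gathers each rank's reported torch/transformers versions with `torch.distributed.all_gather_object` right after process-group init and raises if they differ:

```python
# Hypothetical helper: abort early if ranks report different torch/transformers
# versions. Call it right after init_process_group in your launch script.
import torch
import torch.distributed as dist
import transformers


def assert_uniform_versions():
    # Collect (torch version, transformers version) from every rank.
    local = (torch.__version__, transformers.__version__)
    gathered = [None] * dist.get_world_size()
    dist.all_gather_object(gathered, local)

    # Rank 0 checks whether any rank disagrees with rank 0's versions.
    if dist.get_rank() == 0:
        mismatched = {rank: v for rank, v in enumerate(gathered) if v != gathered[0]}
        if mismatched:
            raise RuntimeError(f"Library version mismatch across ranks: {mismatched}")


if __name__ == "__main__":
    dist.init_process_group(backend="gloo")  # or "nccl" when launching on GPU nodes
    assert_uniform_versions()
    dist.destroy_process_group()
```

Launched with torchrun across your nodes, this would surface a mismatch at startup instead of letting a silently diverged setup run to completion.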

I see. Do you know of anyone who has accidentally run different library versions on different nodes?

I ask because I accidentally ran one of my experiments with different library versions on different nodes, and I'm not sure whether that result is reliable.