Copy model weights between processes
I have a few processes running on the CPU that generate data, and one process (potentially more in the future) that consumes the generated data to train my model. The issue is that when all processes are on the CPU (including the training process), the loss drops, but when the model is on the GPU (MPS specifically), the loss does not drop. What am I doing wrong? Below is the logic of my code:

```python
import copy
from accelerate import Accelerator

accelerator = Accelerator()

model = MyModel()
model.share_memory()  # so that all processes share the same weights
train_model = copy.deepcopy(model)
train_model = accelerator.prepare(train_model)  # I use the Hugging Face accelerator to set the device
```

Thanks in advance