Is there a way to train a model with multithreading? In my model, every input has a different structure, so I can't use mini-batches and training is very slow. I'd like to accelerate training across multiple CPUs.
You can look at the Hogwild! training example we have.
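For reference, the core of Hogwild!-style training looks roughly like the sketch below (the toy `nn.Linear` model and random data are placeholders, not the real example's network): each process runs its own training loop, but `optimizer.step()` writes into parameters that live in shared memory.

```python
import torch
import torch.nn as nn
import torch.multiprocessing as mp

def train(model, steps=20):
    # Each process has its own optimizer and its own gradients, but
    # step() updates the parameters held in shared memory (Hogwild!).
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    for _ in range(steps):
        x, y = torch.randn(8, 4), torch.randn(8, 1)  # stand-in data
        loss = nn.functional.mse_loss(model(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()

ctx = mp.get_context("fork")        # fork keeps this sketch short
model = nn.Linear(4, 1)             # toy stand-in for the real network
model.share_memory()                # move parameters into shared memory
w0 = model.weight.detach().clone()  # snapshot of the initial weights
procs = [ctx.Process(target=train, args=(model,)) for _ in range(2)]
for p in procs:
    p.start()
for p in procs:
    p.join()
# the parent now sees the updates made by both worker processes
```

Note that `model.share_memory()` is called once, before the workers start; each worker then mutates the same underlying parameter storage.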
In the Hogwild! training example, every process uses a separate DataLoader. Would it be possible instead to create a process pool, where each process consumes one input sample? @smth
You can't share data loaders among processes; that would be very tricky (because of how Python's multiprocessing works).
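In practice this means each worker should construct its own DataLoader after the process starts, rather than receiving one from the parent. A minimal sketch, with a toy `TensorDataset` and `nn.Linear` standing in for real data and a real network, and `batch_size=1` so each iteration consumes a single sample:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
import torch.multiprocessing as mp

def worker(model):
    # Build the DataLoader *inside* the child process: loaders hold
    # iterator/worker state that does not safely cross process boundaries.
    ds = TensorDataset(torch.randn(64, 4), torch.randn(64, 1))
    loader = DataLoader(ds, batch_size=1, shuffle=True)  # one sample at a time
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    for x, y in loader:
        loss = nn.functional.mse_loss(model(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()

ctx = mp.get_context("fork")
model = nn.Linear(4, 1)
model.share_memory()
procs = [ctx.Process(target=worker, args=(model,)) for _ in range(2)]
for p in procs:
    p.start()
for p in procs:
    p.join()
```

Only the model (whose parameters are in shared memory) is passed to the workers; everything process-local, including the loader and optimizer, is created per process.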
Thanks for the reply!
By passing the model as an argument to, say, the training function, is this not creating a separate local copy of the model for each process? I ask because I followed the Hogwild! example with my own network/data and found that, while each process tends to show the same optimization behavior, there is variance in the weights at the end of training, at each epoch for which I have a checkpoint. I do expect variance in things such as loss over time, because the processes should be loading different random subsets of the same data, but it doesn't make sense to me why the weights would differ if calling model.share_memory() places the model object in a shared memory segment, where there should be only one copy of the weights being manipulated. The only explanation I can think of for why my weights/biases would vary across processes is that the processes are actually only using a local copy of the model… but perhaps I am seriously misunderstanding something here.
To revise this slightly: I believe I now understand what the "problem" actually is. Since the processes most likely finish each epoch at different times rather than all simultaneously, and each process calls optimizer.step() asynchronously and then saves the parameters at the end of its own epoch, it makes sense that the saved weight values would differ slightly.
So at the beginning of training all processes should have the same initial weights, but over the course of training the parameter values a process observes at its checkpoint time will generally not be identical to what the other processes observe at theirs, though the parameters will usually be more or less similar.
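That reading matches how share_memory() behaves: there is only one copy of the parameters, and what differs between checkpoints is *when* each process happens to read them. A small sketch (toy `nn.Linear`; `bump` is a hypothetical helper, not from the example) showing that an in-place update made by a child process is visible to the parent, confirming the single shared copy:

```python
import torch
import torch.nn as nn
import torch.multiprocessing as mp

def bump(model):
    # In-place write in the child; with share_memory() this mutates
    # the one shared copy rather than a process-local clone.
    with torch.no_grad():
        model.weight.add_(1.0)

ctx = mp.get_context("fork")
model = nn.Linear(2, 2)
model.share_memory()
before = model.weight.detach().clone()
p = ctx.Process(target=bump, args=(model,))
p.start()
p.join()
after = model.weight.detach().clone()
# after - before is a tensor of ones: the child's write reached the parent
```

Without share_memory(), a forked child's in-place write would land in its own copy-on-write pages and the parent would never see it, which is exactly the "local copy" behavior you were worried about.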