How to run two training methods in parallel on exactly the same data

Hello,

I have a script similar to the following:

# Loading data
train_loader, test_loader = someDataLoaderFunction()

# Define the architecture
model = ResNet18()
model = model.cuda()  

# Get method from program argument
method = args.method

# Training
train(method, model, train_loader, test_loader)

In order to run the script with two different methods (method1 and method2), it suffices to run the following commands in two different terminals:

CUDA_VISIBLE_DEVICES=0 python program.py --method method1
CUDA_VISIBLE_DEVICES=1 python program.py --method method2

The problem is, the above data loader function contains some randomness, which means the two methods end up being applied to two different sets of training data. I would like them to train on exactly the same data, so I modified the script as follows:

# Loading data
train_loader, test_loader = someDataLoaderFunction()

# Define the architecture
model = ResNet18()
model = model.cuda()  

## Run for the first method
method = 'method1'

# Training
train(method, model, train_loader, test_loader)

## Run for the second method
method = 'method2'

# Must re-initialize the network first
model = ResNet18()
model = model.cuda()

# Training
train(method, model, train_loader, test_loader)
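If you keep this sequential version, note that re-seeding just before each model construction also gives both methods identical initial weights, which removes another source of divergence between the two runs. A minimal sketch of the idea (a toy nn.Linear stands in for ResNet18 here):

```python
import torch
import torch.nn as nn

def make_model(seed=1234):
    # Re-seeding before construction means every call
    # returns a model with identical initial weights.
    torch.manual_seed(seed)
    return nn.Linear(4, 2)  # stand-in for ResNet18()

m1 = make_model()
m2 = make_model()
same_init = all(torch.equal(p, q)
                for p, q in zip(m1.state_dict().values(),
                                m2.state_dict().values()))
```

Here `same_init` is True, so both methods start from the same point in parameter space.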

Is it possible to make it run in parallel for each method?
Thank you so much in advance for your help!

You can make your data augmentations random, but deterministic. After that, you can run the script with the two different command-line args as you have shown.

I think adding this to the top of your script (and maybe to your Dataset) will make your data loader deterministic across runs:

import random

import numpy as np
import torch

manual_seed = 1234
random.seed(manual_seed)
np.random.seed(manual_seed)
torch.manual_seed(manual_seed)
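As a sketch of why this works: once the seeds are fixed, a DataLoader driven by a seeded torch.Generator produces the same shuffle order on every run, so both processes see the batches in the same order (the toy TensorDataset below stands in for your real dataset):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

manual_seed = 1234

# Toy dataset standing in for the real training data.
dataset = TensorDataset(torch.arange(10).float())

def make_loader():
    # A freshly seeded generator per "run" makes the
    # shuffle order identical across runs.
    g = torch.Generator()
    g.manual_seed(manual_seed)
    return DataLoader(dataset, batch_size=2, shuffle=True, generator=g)

order_a = [batch[0].tolist() for batch in make_loader()]
order_b = [batch[0].tolist() for batch in make_loader()]
```

Both `order_a` and `order_b` contain the batches in the same (shuffled) order, which is what lets two independent processes train on identical data.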

Thank you so much, @smth!! :smiley: