No Batch Size Improvement with 2 GPUs (DataParallel)

I am using the code from https://github.com/meetshah1995/pytorch-semseg to train a model. I have been training on 1 NVIDIA Titan V and am now trying to train on 2 NVIDIA Titan Vs.

I am wrapping the model in nn.DataParallel. My understanding is that using 2 GPUs should let me increase my batch size, but that is not the case. If I increase the batch size by any amount over the maximum I found for one GPU, I get the same CUDA out-of-memory runtime error in THCStorage.cu that I got when finding the max batch size for 1 GPU.
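
For reference, this is roughly how I am wrapping the model (a minimal sketch; the small conv net here stands in for the actual pytorch-semseg model, and the batch size and image size are made up):

```python
import torch
import torch.nn as nn

# Stand-in for the segmentation model from pytorch-semseg
model = nn.Sequential(
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 21, 1),
)

# Replicate the model on both Titan Vs; DataParallel splits each
# input batch along dim 0 and scatters the chunks across the GPUs
model = nn.DataParallel(model, device_ids=[0, 1]).cuda()

# With batch_size=16, each GPU should see a chunk of 8
images = torch.randn(16, 3, 256, 256).cuda()
outputs = model(images)
print(outputs.shape)  # torch.Size([16, 21, 256, 256])
```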

Using nvidia-smi -a, I see that in the 2-GPU setup one GPU uses the same amount of framebuffer memory as in the 1-GPU setup, while the second GPU uses only about 2/3 as much.
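
In case it is useful, this is a sketch of how I can also check per-GPU usage from inside the training script (note that torch.cuda.memory_allocated only counts memory held by live tensors, while nvidia-smi reports everything the caching allocator and CUDA context have reserved):

```python
import torch

# Print how much memory PyTorch tensors currently occupy on each GPU;
# nvidia-smi will report more, since the caching allocator holds on
# to freed blocks and the CUDA context itself takes memory
for i in range(torch.cuda.device_count()):
    mib = torch.cuda.memory_allocated(i) / 1024 ** 2
    print(f"GPU {i}: {mib:.0f} MiB allocated")
```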