Basic operations on multiple GPUs

Yes, you can :slight_smile: See Uneven GPU utilization during training backpropagation - #14 by colllin for an example of wrapping the loss function with DataParallel.
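
For reference, here is a minimal sketch of that idea: bundle the model and the criterion into one module so the loss is computed on each replica instead of gathering all outputs on GPU 0. The `ModelWithLoss` wrapper and the tiny model/criterion below are hypothetical placeholders for illustration, not code from the linked thread.

```python
import torch
import torch.nn as nn

class ModelWithLoss(nn.Module):
    """Hypothetical wrapper: runs model and loss inside DataParallel,
    so each GPU computes its own share of the loss."""
    def __init__(self, model, criterion):
        super().__init__()
        self.model = model
        self.criterion = criterion

    def forward(self, inputs, targets):
        outputs = self.model(inputs)
        # Add a leading dim so DataParallel can concatenate per-replica losses.
        return self.criterion(outputs, targets).unsqueeze(0)

# Placeholder model and criterion, just for the example.
model = nn.Linear(128, 10)
criterion = nn.CrossEntropyLoss()

wrapped = nn.DataParallel(ModelWithLoss(model, criterion)).cuda()

inputs = torch.randn(64, 128).cuda()
targets = torch.randint(0, 10, (64,)).cuda()

loss = wrapped(inputs, targets).mean()  # average the per-GPU losses
loss.backward()
```

Averaging the per-replica losses with `.mean()` before calling `backward()` keeps the gradient scale the same as single-GPU training.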