When I was using DataParallel with multiple GPUs, the warning below appeared:
There is an imbalance between your GPUs. You may want to exclude GPU 0 which
has less than 75% of the memory or cores of GPU 1. You can do so by setting
the device_ids argument to DataParallel, or by setting the CUDA_VISIBLE_DEVICES
environment variable.
It means your machine has GPUs that differ significantly in performance. If you use DataParallel, the weaker GPUs will most likely be a bottleneck in your code. That's why you can simply exclude the weak GPUs using the device_ids argument or the CUDA_VISIBLE_DEVICES environment variable.
The code of the check and warning is defined here.
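As a rough sketch of both options (the device numbering here is hypothetical — adjust it to whichever of your GPUs is the weak one):

```python
import os

# Option 1: hide the weak GPU before CUDA is initialized.
# Here we assume GPU 0 is the weak one, so only GPUs 1 and 2 stay visible;
# PyTorch will renumber them as cuda:0 and cuda:1.
os.environ["CUDA_VISIBLE_DEVICES"] = "1,2"

import torch
import torch.nn as nn

model = nn.Linear(10, 2)

if torch.cuda.is_available() and torch.cuda.device_count() > 1:
    # Option 2: pass device_ids explicitly so DataParallel
    # only replicates onto the listed (visible) devices.
    model = nn.DataParallel(model.cuda(), device_ids=[0, 1])
```

Note that CUDA_VISIBLE_DEVICES must be set before the first CUDA call in the process, otherwise it has no effect.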
I had the same warning, and I solved it with `from __future__ import division`, because in Python 2.x, 1/2 == 0. This matters in the function `warn_imbalance`:
def warn_imbalance(get_prop):
    values = [get_prop(props) for props in dev_props]
    min_pos, min_val = min(enumerate(values), key=operator.itemgetter(1))
    max_pos, max_val = max(enumerate(values), key=operator.itemgetter(1))
    if min_val / max_val < 0.75:
        warnings.warn(imbalance_warn.format(device_ids[min_pos], device_ids[max_pos]))
        return True
    return False
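The effect of the fix can be illustrated with Python 3's floor-division operator, which behaves like Python 2's `/` on two integers: without true division, the ratio in the check above would floor to 0 and the 0.75 threshold comparison would be meaningless.

```python
from __future__ import division  # a no-op on Python 3; fixes `/` on Python 2

min_val, max_val = 11, 24  # e.g. an 11 GB and a 24 GB GPU

# What Python 2's `/` computed on two ints: 11 // 24 == 0,
# which is always below the 0.75 threshold.
floored = min_val // max_val

# With the __future__ import, `/` is true division: 11 / 24 ≈ 0.458.
true_ratio = min_val / max_val

print(floored, round(true_ratio, 3))  # → 0 0.458
```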
Can you please help me with this issue?
I have 3 GPUs: one is 24 GB and two are 11 GB.
With a batch size of 4, I get a CUDA out-of-memory error, even though I am sure the total memory of the 3 GPUs is capable of handling it.
How can I make sure that 2 samples of each batch go to GPU 0, 1 sample to GPU 1, and 1 sample to GPU 2?