Multi GPU Training

Hi, I have a model that requires 14 GB of GPU memory to train, but I only have access to 12 GB GPU nodes. Even when I use two nodes (24 GB in total), I still get a CUDA out-of-memory error. Can you help me overcome this error?

If your model cannot fit into a single GPU, you need model parallelism: split the model itself across devices so that each GPU holds only part of the parameters. Note that data parallelism (e.g. `DistributedDataParallel`) will not help here, because it replicates the full model on every GPU, so memory does not add up across devices.
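Here is a minimal sketch of model parallelism in PyTorch: a hypothetical two-stage network whose first half lives on one device and second half on another, with activations copied between them in `forward`. The model and layer sizes are made up for illustration; the device selection falls back to CPU so the sketch also runs on a machine without two GPUs.

```python
import torch
import torch.nn as nn

# Place each stage on its own GPU when two are available; otherwise
# fall back to CPU so the sketch still runs (illustration only).
two_gpus = torch.cuda.device_count() >= 2
dev0 = torch.device("cuda:0" if two_gpus else "cpu")
dev1 = torch.device("cuda:1" if two_gpus else "cpu")

class SplitModel(nn.Module):
    """Hypothetical network split across two devices."""

    def __init__(self):
        super().__init__()
        # First half of the parameters lives on dev0 ...
        self.stage1 = nn.Sequential(nn.Linear(1024, 512), nn.ReLU()).to(dev0)
        # ... second half on dev1, so each GPU holds only part of the model.
        self.stage2 = nn.Linear(512, 10).to(dev1)

    def forward(self, x):
        x = self.stage1(x.to(dev0))
        # Copy the intermediate activations from dev0 to dev1.
        return self.stage2(x.to(dev1))

model = SplitModel()
out = model(torch.randn(4, 1024))
print(out.shape)  # torch.Size([4, 10])
```

Plain model parallelism like this only works with GPUs visible to one process, i.e. within a single node; spreading a model across two separate nodes additionally requires pipeline parallelism or sharding frameworks such as FSDP or DeepSpeed. Also note the trade-off: while stage2 computes, stage1 sits idle, which is why pipeline parallelism feeds micro-batches through the stages to keep both devices busy.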