Can we combine two GPUs to run a single big model?

ptrblck · September 14, 2020, 8:03am

You could use model sharding, i.e. executing different parts of the model on different devices, as given in this example.
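In case the linked example isn't at hand, here is a minimal sketch of what sharding could look like. The layer sizes and the names `ShardedModel` and `pick_devices` are made up for illustration, and the CPU fallback is only there so the snippet also runs on a machine with fewer than two GPUs.

```python
import torch
import torch.nn as nn


def pick_devices():
    """Return two devices; fall back to CPU when fewer than two GPUs are visible."""
    if torch.cuda.device_count() >= 2:
        return torch.device("cuda:0"), torch.device("cuda:1")
    return torch.device("cpu"), torch.device("cpu")


class ShardedModel(nn.Module):
    """Hypothetical two-part model: first half on dev0, second half on dev1."""

    def __init__(self, dev0, dev1):
        super().__init__()
        self.dev0, self.dev1 = dev0, dev1
        # Each submodule's parameters live on its own device.
        self.part1 = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU()).to(dev0)
        self.part2 = nn.Linear(4096, 10).to(dev1)

    def forward(self, x):
        # Move the input to the first device, run the first half there.
        x = self.part1(x.to(self.dev0))
        # Move the intermediate activation to the second device for the rest.
        return self.part2(x.to(self.dev1))


dev0, dev1 = pick_devices()
model = ShardedModel(dev0, dev1)

out = model(torch.randn(8, 1024))  # output ends up on dev1
loss = out.sum()
loss.backward()  # autograd routes gradients back across both devices
print(out.shape, out.device)
```

Note that in this naive form the two GPUs work one after the other rather than at the same time, so it helps with fitting the model into memory, not with speed. The targets for the loss would also need to be on the same device as the output (`dev1` here).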