One approach…
Start two Python programs in separate interpreters to avoid the GIL.
Process 1
- Put the tensor on cuda:0 and compute the output.
- Serialize the output and push it to a shared Redis database (sketch below).
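Here is a minimal sketch of the producer side, assuming a local Redis server; the queue name `activations` and the stand-in `stage1` layer are mine, not part of any particular setup:

```python
# Process 1: run the first stage on cuda:0 and push the result to Redis.
import io

import redis
import torch

r = redis.Redis(host="localhost", port=6379)
stage1 = torch.nn.Linear(512, 512).to("cuda:0")  # stand-in for the first model half

x = torch.randn(32, 512, device="cuda:0")
with torch.no_grad():
    out = stage1(x)

# Serialize the tensor (moved to CPU first) and push it onto a Redis list.
buf = io.BytesIO()
torch.save(out.cpu(), buf)
r.rpush("activations", buf.getvalue())
```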
Process 2
- The consumer picks the output up from the database and pushes it to cuda:1.
- The consumer runs the next step of the calculation (sketch below).
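And the matching consumer sketch, popping from the same hypothetical `activations` queue:

```python
# Process 2: pop a serialized tensor from Redis and continue on cuda:1.
import io

import redis
import torch

r = redis.Redis(host="localhost", port=6379)
stage2 = torch.nn.Linear(512, 512).to("cuda:1")  # stand-in for the second model half

# blpop blocks until an item arrives; it returns (queue_name, payload).
_, payload = r.blpop("activations")
out = torch.load(io.BytesIO(payload), map_location="cuda:1")

with torch.no_grad():
    result = stage2(out)
print(result.shape)
```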
If you need to send gradients back for backprop, you can store and reload them the same way.
That’s one way… not easy, though. I spent easily a month just trying to distribute calculations over multiple processes.
If you can pull it off… then it’s an awesome skill.
Also, there is the Ray project (https://github.com/ray-project/ray), a unified framework for scaling AI and Python applications, with a core distributed runtime and a set of AI libraries for accelerating ML workloads.
I tried using it. It had great promise, but ended up being a bit too new at the time. It might be a bit more mature now.
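For comparison, a minimal Ray sketch of the same two-stage split; the placeholder stages are mine, and it assumes a machine (or cluster) with two GPUs available:

```python
# Two remote tasks, each pinned to one GPU; Ray handles the data movement.
import ray
import torch

ray.init()

@ray.remote(num_gpus=1)
def stage1(x):
    return (x.cuda() * 2).cpu()  # placeholder for the first half of the model

@ray.remote(num_gpus=1)
def stage2(x):
    return (x.cuda() + 1).cpu()  # placeholder for the second half

x = torch.randn(32, 512)
# Passing stage1's ObjectRef straight into stage2 chains the two tasks.
result = ray.get(stage2.remote(stage1.remote(x)))
print(result.shape)
```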