Gradient Checkpointing in PyTorch?

I’m curious to hear the developers’ thoughts on gradient checkpointing.

Memory limitations are one of the biggest restrictions I run into, both with PyTorch and with deep learning in general. Gradient checkpointing, which trades extra forward-pass computation for a much smaller activation footprint during backward, seems like an interesting and possibly fruitful solution, particularly if it were baked right into the library itself.

There is some work in progress on checkpointing at https://github.com/pytorch/pytorch/pull/4594
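For reference, here is a minimal sketch of the kind of usage I’d imagine, assuming the PR lands as a `torch.utils.checkpoint.checkpoint(function, *args)` entry point (the module path and signature are my assumption, since the PR is still WIP):

```python
import torch
import torch.nn as nn
# Assumed entry point; may change before the PR is merged.
from torch.utils.checkpoint import checkpoint

# A deep stack of blocks whose intermediate activations would
# normally all be kept alive for the backward pass.
blocks = nn.ModuleList(
    nn.Sequential(nn.Linear(1024, 1024), nn.ReLU()) for _ in range(50)
)

x = torch.randn(32, 1024, requires_grad=True)

# Checkpointed forward: activations inside each block are dropped
# after the forward pass and recomputed on the fly during backward,
# trading extra compute for lower peak memory.
out = x
for block in blocks:
    out = checkpoint(block, out)

out.sum().backward()
```

The trade-off is roughly one extra forward pass through each checkpointed segment in exchange for only storing the segment boundaries instead of every intermediate activation.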