Use of torch.utils.checkpoint.checkpoint causes simple model to diverge

This problem appears to have been fixed. See this reply, and possibly the rest of the thread, for details.