I want to use gradient checkpointing together with DDP, so I have to call the _set_static_graph method, but I get worse performance with it.
Would you please attach a repro and report it as a GitHub issue?
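For reference, a repro for this kind of report could look roughly like the sketch below. It is only an illustration, not the original poster's code: the `Net` model, the tensor sizes, the single-process `gloo` setup, the port number, and the helper names `time_steps` and `run_comparison` are all assumptions made for the example. It times a few training steps with and without `_set_static_graph()` on a model that checkpoints one block.

```python
import os
import time

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.checkpoint import checkpoint


class Net(nn.Module):
    """Hypothetical stand-in model with a single checkpointed block."""

    def __init__(self):
        super().__init__()
        self.block = nn.Sequential(nn.Linear(64, 64), nn.ReLU())
        self.head = nn.Linear(64, 8)

    def forward(self, x):
        # Recompute self.block during backward instead of storing activations.
        x = checkpoint(self.block, x, use_reentrant=True)
        return self.head(x)


def time_steps(static_graph, steps=20):
    """Return wall-clock seconds for `steps` training iterations."""
    torch.manual_seed(0)
    model = DDP(Net())
    if static_graph:
        # Private API named in the question; call it before the first forward.
        model._set_static_graph()
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    # Reentrant checkpointing needs an input that requires grad.
    x = torch.randn(32, 64, requires_grad=True)
    start = time.perf_counter()
    for _ in range(steps):
        opt.zero_grad()
        model(x).sum().backward()
        opt.step()
    return time.perf_counter() - start


def run_comparison(steps=20):
    """Single-process CPU run (assumed setup) comparing both DDP modes."""
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29501")
    dist.init_process_group("gloo", rank=0, world_size=1)
    try:
        return {
            "dynamic": time_steps(static_graph=False, steps=steps),
            "static": time_steps(static_graph=True, steps=steps),
        }
    finally:
        dist.destroy_process_group()


if __name__ == "__main__":
    for mode, seconds in run_comparison().items():
        print(f"{mode}: {seconds:.4f}s")
```

A real report would swap in the actual model and launch setup (number of GPUs, backend, PyTorch version), and would say whether "worse performance" means slower steps or worse accuracy, since this sketch only measures step time.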