Torch model copy with computational graph

Hi All,
I’d like to create a copy of my model. However, I realized that I cannot use copy.deepcopy(), since I’d like to have a “stateless” version of the model: I want to keep the computational graph so that I can compute derivatives of the new model’s parameters w.r.t. the original parameters.
So, if the original module or any of its submodules has state (e.g. batch norm), this should be copied too, but further updates (e.g. during inner-loop training) should cause the copies to diverge without changing the state of the original module.
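For concreteness, here is a sketch of the behavior I’m after. It assumes torch.func.functional_call and a toy nn.Linear model (both just for illustration); parameters and buffers are cloned without detaching, so gradients still flow back to the originals:

```python
import torch
import torch.nn as nn
from torch.func import functional_call

# Toy model just for illustration; the idea applies to any nn.Module.
model = nn.Linear(3, 1)

# "Copy" parameters without detaching: each clone stays on the
# original parameter's computational graph.
params = {name: p.clone() for name, p in model.named_parameters()}
# Buffers (e.g. batch-norm running stats) are cloned the same way, so
# later in-place updates to the copies won't touch the original module.
buffers = {name: b.clone() for name, b in model.named_buffers()}

x = torch.randn(4, 3)
# Run the original module's forward pass, but with the copied tensors.
out = functional_call(model, {**params, **buffers}, (x,))
loss = out.pow(2).mean()

# Gradients flow through the clones back to the original parameters.
grads = torch.autograd.grad(loss, list(model.parameters()))
```

After an inner-loop step like `params = {n: p - lr * g for (n, p), g in zip(params.items(), grads)}`, the copied parameters diverge while the original module is untouched; but I’m not sure this is the cleanest way to do it.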
Does anyone know how to achieve this?