Reuse part of a module without accumulating gradients

With encoder-decoder based models, we encode features from the inputs with the encoder and reconstruct the outputs with the decoder.

My question is how to prevent duplicate gradient accumulation.
E.g., input -> enc -> features -> dec -> outputs.
We would then like to re-encode the outputs to compute a loss such as a triplet loss: enc(outputs).
Since enc(outputs) passes through the encoder a second time, we would like to accumulate gradients only from the first pass.

I see two options:
1: enc(outputs.detach())
2: with torch.no_grad(): enc(outputs)

I am not sure which one is the correct way.
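
For concreteness, here is a minimal sketch of the two options (enc/dec are just placeholder linear layers I made up to keep the snippet self-contained):

```python
import torch
import torch.nn as nn

# Placeholder encoder/decoder just to make the snippet runnable;
# substitute your own modules and shapes.
enc = nn.Linear(16, 8)
dec = nn.Linear(8, 16)

x = torch.randn(4, 16)
features = enc(x)        # input -> enc -> features
outputs = dec(features)  # features -> dec -> outputs

# Option 1: detach the decoder outputs before re-encoding.
# No gradient can flow back into dec or the first encoder pass,
# but the second pass through enc is still recorded in the graph.
feat_opt1 = enc(outputs.detach())

# Option 2: run the second pass without building a graph at all.
# The result does not require gradients and no autograd buffers are kept.
with torch.no_grad():
    feat_opt2 = enc(outputs)
```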

If you would like to use the output of enc(outputs) only as a target, so that it wouldn't require gradients, I think both methods would work.

Feel free to correct me, in case I misunderstood the use case.
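
As a quick check (with made-up placeholder tensors), you can verify which result will be treated as a constant by autograd:

```python
import torch
import torch.nn as nn

enc = nn.Linear(16, 8)                             # placeholder encoder
outputs = torch.randn(4, 16, requires_grad=True)   # stands in for dec(features)

with torch.no_grad():
    target_feat = enc(outputs)
print(target_feat.requires_grad)     # False: safe to use as a constant target

detached_feat = enc(outputs.detach())
print(detached_feat.requires_grad)   # True: enc's parameters still require grad,
                                     # so detach the result too if it should be a pure target
```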

Thanks for your reply. However, I would like to use the features from enc(outputs) for a metric-learning loss. Since outputs is already attached to the autograd graph of the encoder (and decoder), I would like to skip the gradient computation when passing through the encoder the second time.

input -> enc -> features -> dec -> outputs -> enc (no gradient computation) -> features

If the features from this calculation (features = enc(outputs)  # no gradient) do not require gradients, you should be fine using either approach.
This would be the case, e.g., if you are using this features output as the target for a specific loss calculation.
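
For that use case, a minimal sketch might look like the following (placeholder modules, and an MSE loss standing in for the actual metric-learning / triplet loss):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative encoder/decoder and optimizer; replace with your own models.
enc = nn.Linear(16, 8)
dec = nn.Linear(8, 16)
optimizer = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)

x = torch.randn(4, 16)

# First pass: gradients are tracked as usual.
features = enc(x)        # input -> enc -> features
outputs = dec(features)  # features -> dec -> outputs

# Second pass: re-encode the outputs without recording a graph.
with torch.no_grad():
    target_features = enc(outputs)   # outputs -> enc (no gradient) -> features

# Placeholder loss (MSE standing in for the triplet / metric-learning loss).
# target_features is a constant here, so backward() only reaches enc and dec
# through the first pass.
loss = F.mse_loss(features, target_features)

optimizer.zero_grad()
loss.backward()
optimizer.step()
```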