How to pass a gradient to an intermediate layer of the network?

I am now struggling to implement Deep Clustering with Convolutional Autoencoders (
The paper uses two loss functions: a reconstruction loss and a clustering loss.
The former is the ordinary autoencoder loss, which I can implement; the latter, however, is something special.
It provides the gradient directly to the code layer, and during optimization it must change only the parameters of the encoder layers.
How can we do this?

Something like this should work:

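import torch
import torch.nn as nn

encoder = Encoder()
decoder = Decoder()
clustering_head = nn.Linear(1152, count_soft_labels)
# One optimizer over all three modules (Adam is an arbitrary choice here)
params = list(encoder.parameters()) + list(decoder.parameters()) + list(clustering_head.parameters())
optimizer = torch.optim.Adam(params)

for data, soft_labels in training_data:
    # Variable wrappers are unnecessary since PyTorch 0.4; tensors work directly
    encoded = encoder(data)  # code layer, assumed to have shape (batch, 1152)
    decoded = decoder(encoded)
    pred_labels = clustering_head(encoded)
    # PyTorch loss modules expect (prediction, target)
    loss = reconstruction_loss_fn(decoded, data) + clustering_loss(pred_labels, soft_labels)
    optimizer.zero_grad()
    loss.backward()  # the clustering term sends gradient only to the head and the encoder
    optimizer.step()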
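
To check that the clustering term really leaves the decoder alone, here is a small self-contained sketch. The linear layers, sizes, and KL-divergence loss are hypothetical stand-ins chosen for illustration, not the paper's architecture or loss. It shows two routes: letting autograd backpropagate the clustering loss, and injecting a precomputed gradient at the code layer with backward(gradient=...).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the real convolutional encoder/decoder.
encoder = nn.Linear(8, 4)
decoder = nn.Linear(4, 8)
clustering_head = nn.Linear(4, 3)

x = torch.randn(5, 8)
soft_labels = torch.softmax(torch.randn(5, 3), dim=1)

# Route 1: let autograd do it. The clustering loss depends only on the
# encoder and the head, so the decoder receives no gradient from it.
encoded = encoder(x)
log_pred = torch.log_softmax(clustering_head(encoded), dim=1)
clustering_loss = nn.functional.kl_div(log_pred, soft_labels, reduction="batchmean")
clustering_loss.backward()
decoder_untouched = all(p.grad is None for p in decoder.parameters())
encoder_has_grad = all(p.grad is not None for p in encoder.parameters())

# Route 2: inject a precomputed gradient at the code layer.
encoder.zero_grad()
encoded = encoder(x)
code_grad = torch.ones_like(encoded)  # placeholder for dL/d(encoded)
encoded.backward(gradient=code_grad)
# For a linear layer, the bias gradient is code_grad summed over the batch.
bias_grad_matches = torch.allclose(encoder.bias.grad, code_grad.sum(dim=0))

print(decoder_untouched, encoder_has_grad, bias_grad_matches)
```

Route 1 is usually enough: adding the two losses and calling backward once gives the encoder the sum of both gradients, while the decoder only sees the reconstruction term. Route 2 is only needed if the gradient at the code layer is computed outside autograd.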

Great. Thanks. I will try.