Finetuning pretrained Transformers using pipeline parallelism

Hello. I’m looking for a solution analogous to
https://pytorch.org/tutorials/intermediate/pipeline_tutorial.html
with one difference: the tutorial shows how to pipeline-parallelize a Transformer trained from scratch, whereas I need to pipeline-parallelize the finetuning of an already pretrained Transformer.
Can anyone tell me how to do such finetuning?
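For concreteness, here is a minimal sketch of the kind of thing I’m imagining, following the tutorial’s `torch.distributed.pipeline.sync.Pipe` setup but starting from a pretrained checkpoint. I’m assuming a HuggingFace GPT-2 model and two GPUs here just as an example; the `EmbeddingStage`, `BlockStage`, and `HeadStage` helpers are names I made up to repackage the pretrained submodules into an `nn.Sequential`, since `Pipe` expects a sequential of tensor-in/tensor-out modules:

```python
import os

import torch
import torch.nn as nn
import torch.distributed.rpc as rpc
from torch.distributed.pipeline.sync import Pipe
from transformers import GPT2LMHeadModel

# Pipe needs the RPC framework initialized, even in a single process
# (same setup as in the tutorial).
os.environ.setdefault("MASTER_ADDR", "localhost")
os.environ.setdefault("MASTER_PORT", "29500")
rpc.init_rpc("worker", rank=0, world_size=1)

model = GPT2LMHeadModel.from_pretrained("gpt2")  # pretrained weights


class EmbeddingStage(nn.Module):
    """Token + position embeddings taken from the pretrained model."""
    def __init__(self, gpt2):
        super().__init__()
        t = gpt2.transformer
        self.wte, self.wpe, self.drop = t.wte, t.wpe, t.drop

    def forward(self, input_ids):
        pos = torch.arange(input_ids.size(1), device=input_ids.device)
        return self.drop(self.wte(input_ids) + self.wpe(pos))


class BlockStage(nn.Module):
    """Wraps one pretrained GPT-2 block so a plain tensor flows through
    the pipe (the original block returns a tuple)."""
    def __init__(self, block):
        super().__init__()
        self.block = block

    def forward(self, hidden_states):
        return self.block(hidden_states)[0]


class HeadStage(nn.Module):
    """Final layer norm + LM head from the pretrained model."""
    def __init__(self, gpt2):
        super().__init__()
        self.ln_f = gpt2.transformer.ln_f
        self.lm_head = gpt2.lm_head

    def forward(self, hidden_states):
        return self.lm_head(self.ln_f(hidden_states))


blocks = list(model.transformer.h)
split = len(blocks) // 2

# Consecutive modules on the same device form one pipeline partition.
stage0 = nn.Sequential(EmbeddingStage(model),
                       *[BlockStage(b) for b in blocks[:split]]).to("cuda:0")
stage1 = nn.Sequential(*[BlockStage(b) for b in blocks[split:]],
                       HeadStage(model)).to("cuda:1")

pipe = Pipe(nn.Sequential(stage0, stage1), chunks=4)

optimizer = torch.optim.AdamW(pipe.parameters(), lr=5e-5)
loss_fn = nn.CrossEntropyLoss()

# One finetuning step on a dummy batch (batch_size=2, seq_len=128);
# in reality this would come from my finetuning dataset.
input_ids = torch.randint(0, 50257, (2, 128), device="cuda:0")
labels = input_ids.to("cuda:1")  # language-modeling targets, on the last device

logits = pipe(input_ids).local_value()  # Pipe's forward returns an RRef
loss = loss_fn(logits[:, :-1].reshape(-1, logits.size(-1)),
               labels[:, 1:].reshape(-1))
loss.backward()
optimizer.step()
```

Is this roughly the right way to reuse pretrained weights inside a `Pipe`, or is there a recommended approach I’m missing?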