How can I train YOLOv5 as an autoencoder?

I am trying to train YOLOv5 on a custom dataset. Since the custom dataset is small, I would like to first train a YOLOv5-based autoencoder on an unlabeled dataset and then fine-tune the resulting weights on the labeled custom dataset, so that the detector generalizes better.

How should I approach this? Specifically, where would I need to make changes in order to use a YOLOv5 model instance as an autoencoder?
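
To make the idea concrete, here is a minimal PyTorch sketch of the setup I have in mind. It is only an illustration under assumptions: `TinyBackbone` is a hypothetical stand-in for the YOLOv5 backbone (not the real architecture or the Ultralytics API), and `Decoder`, `AutoEncoder`, and `pretrain_step` are names I made up. The plan would be to train with a reconstruction loss on unlabeled images, then reuse only the backbone weights for detection.

```python
import torch
import torch.nn as nn


class TinyBackbone(nn.Module):
    """Hypothetical stand-in for the YOLOv5 backbone (NOT the real architecture)."""

    def __init__(self):
        super().__init__()
        # Two stride-2 convolutions: 3 x H x W -> 32 x H/4 x W/4
        self.layers = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.SiLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.SiLU(),
        )

    def forward(self, x):
        return self.layers(x)


class Decoder(nn.Module):
    """Assumed decoder that upsamples backbone features back to image size."""

    def __init__(self):
        super().__init__()
        # Two stride-2 transposed convolutions: 32 x H/4 x W/4 -> 3 x H x W
        self.layers = nn.Sequential(
            nn.ConvTranspose2d(32, 16, kernel_size=4, stride=2, padding=1), nn.SiLU(),
            nn.ConvTranspose2d(16, 3, kernel_size=4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.layers(x)


class AutoEncoder(nn.Module):
    """Backbone as encoder plus a throwaway decoder used only for pretraining."""

    def __init__(self, backbone, decoder):
        super().__init__()
        self.backbone = backbone
        self.decoder = decoder

    def forward(self, x):
        return self.decoder(self.backbone(x))


def pretrain_step(model, optimizer, images):
    """One reconstruction step on a batch of unlabeled images in [0, 1]."""
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(images), images)
    loss.backward()
    optimizer.step()
    return loss.item()


torch.manual_seed(0)
backbone = TinyBackbone()
model = AutoEncoder(backbone, Decoder())
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

images = torch.rand(4, 3, 64, 64)  # random tensors standing in for unlabeled images
loss = pretrain_step(model, optimizer, images)
recon = model(images)

# After pretraining, only the backbone weights would be carried over to detection.
pretrained_weights = backbone.state_dict()
```

In the real case, the stand-in backbone would be replaced by the backbone layers of the YOLOv5 model, and the saved weights would be loaded into the detector before fine-tuning on the labeled data. What I am unsure about is where to do that split in the YOLOv5 code.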

Any guidance would be much appreciated. Thanks.