Quantize a Pre-Trained Model Using QLoRA or LoRA (PEFT techniques)

I would like to ask how I can use QLoRA or Parameter-Efficient Fine-Tuning (PEFT) with a model that is not registered on Hugging Face, but is instead based on OFA.

Here is the repo of the model: GitHub - taokz/BiomedGPT: BiomedGPT: A Unified and Generalist Biomedical Generative Pre-trained Transformer for Vision, Language, and Multimodal Tasks

I am trying to quantize the Tiny version, but I don't know whether I need LoRA, or how to apply Parameter-Efficient Fine-Tuning to a model like this.
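For context on what I'm after: since the model isn't a `transformers` class, I was thinking LoRA could be injected manually into its attention projections. Below is a minimal sketch of the LoRA idea in plain PyTorch (not using the `peft` library, and the target names like `q_proj` / `v_proj` are just placeholders; the actual attention module names in OFA/BiomedGPT would need to be checked):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a frozen nn.Linear with a trainable low-rank update:
    y = W x + scale * (B @ A) x, where A is (r x in), B is (out x r)."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # freeze the pretrained weights
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r               # standard LoRA scaling

    def forward(self, x):
        # B starts at zero, so the wrapped layer initially matches the base layer
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scale

def add_lora(model: nn.Module, target_names=("q_proj", "v_proj"), r=8, alpha=16):
    """Replace every nn.Linear whose attribute name is in target_names.
    target_names are hypothetical; inspect the OFA model to find the real ones."""
    for module in model.modules():
        for name, child in list(module.named_children()):
            if isinstance(child, nn.Linear) and name in target_names:
                setattr(module, name, LoRALinear(child, r=r, alpha=alpha))
    return model
```

My understanding is that QLoRA would be the same adapter idea with the frozen base weights additionally quantized (e.g. to 4-bit via bitsandbytes), but I'm not sure how that interacts with a checkpoint in OFA's own format rather than a Hugging Face one.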