Creating a smaller model from the original LLaMA models

Do you have a dataset on which the output should be similar?
If so, you can train your smaller model to match the original model's outputs on that dataset and, most likely, its intermediate results as well. This approach is usually called knowledge distillation.
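
To make that a bit more concrete, here is a minimal sketch of what such a matching loss could look like for one example. It is plain Python so it runs without any framework; in practice you would write the same thing with tensors. All function names, the temperature, and the `hidden_weight` factor are my own illustrative assumptions, not anything prescribed by LLaMA. It also assumes the intermediate vectors of both models have the same size; if they do not, you would need some learned projection first.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def mse(a, b):
    """Mean squared error between two equally sized vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def distillation_loss(teacher_logits, student_logits,
                      teacher_hidden=None, student_hidden=None,
                      temperature=2.0, hidden_weight=0.5):
    """Hypothetical combined loss: output matching plus optional
    intermediate-result matching (assumes equal hidden sizes)."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # Output term: student distribution should follow the teacher's.
    loss = kl_divergence(p_teacher, p_student)
    # Optional intermediate term: student activations should follow the teacher's.
    if teacher_hidden is not None and student_hidden is not None:
        loss += hidden_weight * mse(teacher_hidden, student_hidden)
    return loss

# Made-up numbers, only to show the call.
teacher = [2.0, 0.5, -1.0]
student = [1.5, 0.7, -0.8]
print(distillation_loss(teacher, student, [0.1, 0.2], [0.0, 0.3]))
```

You would average this over the dataset and minimize it with respect to the smaller model's parameters only, keeping the original model fixed.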

Best regards

Thomas