Host Llama 2 13B as a SageMaker endpoint

An endpoint can be created as above for the Llama 2 13B base model, but for the 13B chat model endpoint creation fails with a timeout error on the primary container.
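One common cause of a primary-container timeout with a larger model is that the default startup health-check window is too short for the model to load. SageMaker's `ProductionVariant` exposes `ContainerStartupHealthCheckTimeoutInSeconds` for this. Below is a minimal sketch of building the endpoint-config request with that window widened; the model name, config name, and instance type are placeholders, not values from these notes.

```python
# Sketch: endpoint config with a longer container startup health-check
# window, a common fix when a larger model (13B chat) times out on the
# primary container while a smaller one (13B base) loads in time.
# Names and instance type below are placeholders.

def build_endpoint_config(config_name, model_name,
                          instance_type="ml.inf2.8xlarge",
                          startup_timeout_s=1800):
    """Return kwargs for boto3 sagemaker.create_endpoint_config()."""
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "InstanceType": instance_type,
            "InitialInstanceCount": 1,
            # The default is much lower; 13B chat may need the full
            # window to load its artifacts before the first health ping.
            "ContainerStartupHealthCheckTimeoutInSeconds": startup_timeout_s,
        }],
    }

config = build_endpoint_config("llama2-13b-chat-config", "llama2-13b-chat")
# Then: boto3.client("sagemaker").create_endpoint_config(**config)
```

Whether the timeout is the actual failure here would need confirming from the endpoint's CloudWatch logs.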

For the above, the Neuron artifacts for the 13B chat model were created using this -

TorchServe can be started and inference run via a curl command here, so the model artifacts look okay. But the same artifacts don't work in the first notebook reference link.
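For reference, the local TorchServe sanity check follows TorchServe's inference API, which listens on port 8080 and serves `POST /predictions/<model_name>`. A small sketch of constructing that URL, with the equivalent curl shown as a comment; the model name and payload are placeholders, not the exact ones used here.

```python
# Sketch of the local TorchServe sanity check: TorchServe's inference
# API listens on port 8080 at POST /predictions/<model_name>.
# Model name and payload below are placeholders.

def predictions_url(model_name, host="localhost", port=8080):
    """Build the TorchServe inference URL for a registered model."""
    return f"http://{host}:{port}/predictions/{model_name}"

url = predictions_url("llama-2-13b-chat")
# Equivalent curl check once torchserve is running:
#   curl -X POST http://localhost:8080/predictions/llama-2-13b-chat \
#        -H "Content-Type: application/json" \
#        -d '{"inputs": "Hello"}'
```

Since this check passes locally but the same artifacts fail behind the SageMaker endpoint, the difference is likely in the container or endpoint configuration rather than the artifacts themselves.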