We have a python script with pytorch that runs a prediction

I’m somewhat lost in the jungle of documentation, offers, and administrations. I’m wondering how the infrastructure should look like, and it would be extremely useful to get a bump the correct way.

We have a python content with pytorch that runs a forecast. The content must be activated from a http request. Ideally, the examples to complete a forecast on likewise needs to originate from a similar requester. It needs to restore the expectation as quick as could reasonably be expected.

What is the best / easiest / fastest way of doing this?

We have the content laying in a Container Registry for the present. Would we be able to utilize it? Azure Kubernetes Service? Azure Container Instances (is this quick enough)?

Thanks,
Ruth Kimberly