How to apply multiple models to multiple batches in parallel

Hi. I am trying to compute per-sample gradients using this idea:

So, I have N models and N batches (the models are exact copies of each other, and every batch has size 1). I have the following code:
outputs = [m(b) for m, b in zip(models, batches)]
Is there a way to parallelize this loop? In my specific case, executing it sequentially is inefficient.
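
To make the question concrete, here is a minimal sketch, assuming PyTorch 2.x and `nn.Module` models. The `nn.Linear` model, the sizes, and the names `base` and `call_one` are placeholders for illustration only. It reproduces the loop and compares it with one candidate I have not benchmarked: stacking the parameters with `torch.func.stack_module_state` and mapping over the stack with `torch.func.vmap`.

```python
import copy

import torch
import torch.nn as nn
from torch.func import functional_call, stack_module_state, vmap

torch.manual_seed(0)

# Placeholder sizes, chosen only for illustration.
N, in_features, out_features = 4, 3, 2

# N exact copies of one model (nn.Linear is a stand-in) and N batches of size 1.
model = nn.Linear(in_features, out_features)
models = [copy.deepcopy(model) for _ in range(N)]
batches = [torch.randn(1, in_features) for _ in range(N)]

# The sequential loop from the question.
outputs = [m(b) for m, b in zip(models, batches)]

# Candidate: stack each parameter along a new leading dimension of size N.
params, buffers = stack_module_state(models)

# A copy on the "meta" device serves as a stateless skeleton of the architecture.
base = copy.deepcopy(models[0]).to("meta")


def call_one(p, b, x):
    # Run the skeleton with one slice of the stacked parameters and buffers.
    return functional_call(base, (p, b), (x,))


stacked_batches = torch.stack(batches)  # shape [N, 1, in_features]

# vmap maps call_one over the leading dimension of all three arguments.
batched_outputs = vmap(call_one)(params, buffers, stacked_batches)
```

Here `batched_outputs` has shape `[N, 1, out_features]` and should match `torch.stack(outputs)` up to floating-point tolerance. Whether this is actually faster than the loop in my case is an open question.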