I believe the example shows that clients still send single images to TorchServe one request at a time, but the server will aggregate them into a batched inference if multiple requests arrive within the configured max_batch_delay window; otherwise the batch is dispatched with however many requests (possibly just one) were collected. At least this is how I understand the example, but others can of course correct me.
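To illustrate what I mean, here is a minimal sketch (my own simplified simulation, not TorchServe's actual frontend code) of that aggregation policy: wait up to `max_batch_delay` milliseconds for up to `batch_size` requests, then dispatch whatever has arrived, even if it is a single request:

```python
import queue
import time

def collect_batch(request_queue, batch_size, max_batch_delay_ms):
    """Simulated batch aggregation: block until either batch_size
    requests have been collected or max_batch_delay_ms has elapsed,
    then return the (possibly partial) batch."""
    batch = []
    deadline = time.monotonic() + max_batch_delay_ms / 1000.0
    while len(batch) < batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # delay window expired: dispatch what we have
        try:
            batch.append(request_queue.get(timeout=remaining))
        except queue.Empty:
            break  # no more requests arrived in time
    return batch

# One request in the queue: after the delay window, it is
# dispatched alone (batch of size 1).
q = queue.Queue()
q.put("img-1")
print(collect_batch(q, batch_size=4, max_batch_delay_ms=50))

# Four requests already queued: a full batch is dispatched
# immediately, without waiting out the delay.
for i in range(4):
    q.put(f"img-{i}")
print(collect_batch(q, batch_size=4, max_batch_delay_ms=50))
```

In the real setup, `batch_size` and `max_batch_delay` are (as far as I know) set when registering the model, and the handler's `handle(data, context)` then receives the collected requests as a list.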