Computing over a list of variable-sized tensors faster

I am implementing a component in my NN that consists of several `nn.functional` operations (it is parameter-free). For one batch, the component takes a list of tensors as input: one tensor per batch entry, each variable-sized (possibly empty), where each row of a tensor is a feature vector. After this component I sum-pool each tensor and concatenate the results. I would like to avoid a for loop over the tensors. Is there any way to do this computation in parallel?
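For reference, here is a minimal sketch of my current loop-based approach. The `feature_transform` function is a stand-in for my actual component, and the feature dimension and shapes are made up for illustration:

```python
import torch


def feature_transform(x):
    # placeholder for the real parameter-free component
    # (several nn.functional ops in my actual code)
    return torch.relu(x)


def forward_loop(tensors, feat_dim):
    # current approach: loop over the batch and sum-pool each entry
    pooled = []
    for t in tensors:
        if t.numel() == 0:
            # empty tensor for this batch entry -> pooled result is zeros
            pooled.append(torch.zeros(feat_dim))
        else:
            # sum pooling over the rows (features) of this entry
            pooled.append(feature_transform(t).sum(dim=0))
    # stack the per-entry pooled vectors into a (batch, feat_dim) tensor
    return torch.stack(pooled)


batch = [torch.randn(3, 4), torch.empty(0, 4), torch.randn(5, 4)]
out = forward_loop(batch, feat_dim=4)
print(out.shape)  # torch.Size([3, 4])
```

This works, but the Python loop becomes a bottleneck when the batch is large, which is why I am looking for a parallel formulation.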