Efficient block-sums of tensors

Say I have a vector V of size m*n and I want to sum over each block of m consecutive elements to get a vector of size n. What is an efficient way to do this? And also, what if the chunks I want to sum are uneven (the widths are saved in a list or another tensor)?

It works pretty much the same as in NumPy, so you just need to use torch.sum(V, dim), where dim is the dimension along which m is stored, e.g. if V is (m, n) -> dim=0.
More info in the official documentation: https://pytorch.org/docs/stable/torch.html#torch.sum
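
For the flat vector of length m*n in the question, that presumably means reshaping to (n, m) first, something like:

```python
import torch

m, n = 3, 4
V = torch.arange(m * n, dtype=torch.float32)  # flat vector of length m*n

# Reshape into n rows of m elements each, then sum each row
block_sums = V.view(n, m).sum(dim=1)
print(block_sums)  # tensor([ 3., 12., 21., 30.])
```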

This wasn’t the question. I don’t want to do a full summation along one of the dimensions; I want to sum over each block of m elements in that dimension. One way is to split the vector into n parts, do the summation for each, and then gather the results into a new vector, as sketched below. I was wondering if there was something built-in, for the sake of efficiency. PS: equal parts are trivial; I am mainly concerned about the case where the parts are unequal.
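
To be concrete, the split-and-loop version looks roughly like this (the per-chunk Python loop is the part I’d like to avoid):

```python
import torch

V = torch.arange(10, dtype=torch.float32)
widths = [3, 2, 5]  # uneven chunk widths, summing to len(V)

# Split into uneven chunks, sum each one, then gather the results
chunks = torch.split(V, widths)
sums = torch.stack([c.sum() for c in chunks])
print(sums)  # tensor([ 3.,  7., 35.])
```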

The most efficient way at the moment might be to create an index tensor and use .scatter_add_ to compute the sums.
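
A minimal sketch of that idea, assuming the widths are stored in an integer tensor (torch.repeat_interleave is one way to build the index tensor):

```python
import torch

V = torch.arange(10, dtype=torch.float32)
widths = torch.tensor([3, 2, 5])  # uneven chunk widths, summing to len(V)

# Map each element of V to the id of the chunk it belongs to,
# e.g. [0, 0, 0, 1, 1, 2, 2, 2, 2, 2]
index = torch.repeat_interleave(torch.arange(len(widths)), widths)

# Accumulate each element into its chunk's slot in a single call
sums = torch.zeros(len(widths)).scatter_add_(0, index, V)
print(sums)  # tensor([ 3.,  7., 35.])
```

For the 1-D case, .index_add_(0, index, V) should behave the same way here, and either variant avoids the Python-level loop entirely.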