If the above doesn’t work for your use case, could you illustrate the specific
computation you want with a self-contained, runnable script that uses loops,
as necessary?
To do this on a “batch” basis, probably the simplest approach is to
generate a tensor that holds the indices of all of the pixels in your 32x32
slices and then compute the foreground-pixel-weighted average of those
indices:
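Here is a minimal sketch of that idea. It assumes the slices are stacked in a tensor of shape `[nBatch, 32, 32]` with 1 for foreground and 0 for background; the function name `mean_foreground_location` and the sample data are made up for illustration, so adjust them to your actual layout.

```python
import torch

def mean_foreground_location(masks):
    # masks: [nBatch, H, W], 1 = foreground, 0 = background (assumed layout)
    masks = masks.float()
    _, h, w = masks.shape
    # index tensor of shape [H, W, 2]: idx[i, j] = (i, j)
    rows = torch.arange(h).view(h, 1).expand(h, w)
    cols = torch.arange(w).view(1, w).expand(h, w)
    idx = torch.stack((rows, cols), dim=-1).float()
    # sum of the indices of the foreground pixels, per slice: [nBatch, 2]
    weighted = (masks.unsqueeze(-1) * idx).sum(dim=(1, 2))
    # number of foreground pixels, per slice: [nBatch, 1]
    counts = masks.sum(dim=(1, 2)).unsqueeze(-1)
    # 0 / 0 gives nan for slices with no foreground pixels
    return weighted / counts

# made-up sample data: three 32x32 slices
masks = torch.zeros(3, 32, 32)
masks[0, 10, 20] = 1
masks[0, 12, 22] = 1
masks[1, 0:4, 0:2] = 1
# masks[2] is left empty on purpose

centers = mean_foreground_location(masks)
print(centers)
```

The result has shape `[nBatch, 2]`, holding the (row, column) mean for each slice as floating-point values.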
Note that if a 32x32 slice happens to contain no foreground pixels, you will
get nan (before converting to long()) for the mean foreground-pixel location
for that slice. That’s probably as reasonable an “undefined” or “special-case”
value as any.