I have n points, each with an associated color:
n = 100
colors = torch.rand((n, 3))
points = torch.randint(0, 10, (n, 2))
x_min, x_max = points[:, 0].min().item(), points[:, 0].max().item()
y_min, y_max = points[:, 1].min().item(), points[:, 1].max().item()
I want to discretize the points' colors onto some target grid:
target = torch.zeros((5, 5, 3))
eps = torch.finfo(torch.float32).eps # Bucketizing from the "left"
x_steps = torch.linspace(x_min, x_max + eps, target.shape[0] + 1)
y_steps = torch.linspace(y_min, y_max + eps, target.shape[1] + 1)
I can determine which points fall within each grid location with bucketize
# Left-closed bins: index i means x_steps[i] <= value < x_steps[i + 1];
# the clamp keeps the max point in the last bucket even if the eps nudge rounds away
x_bucket_indices = (torch.bucketize(points[:, 0].float(), x_steps, right=True) - 1).clamp(0, target.shape[0] - 1)
y_bucket_indices = (torch.bucketize(points[:, 1].float(), y_steps, right=True) - 1).clamp(0, target.shape[1] - 1)
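For reference, here is a small standalone check (with illustrative edges, not the ones above) of how bucketize treats values that coincide with a boundary — this is why the maximum point needs special handling, via an eps nudge or a clamp:

```python
import torch

edges = torch.linspace(0.0, 10.0, 6)  # tensor([0., 2., 4., 6., 8., 10.])
vals = torch.tensor([0.0, 2.0, 9.9, 10.0])

# Default (right=False): first index i with val <= edges[i]
print(torch.bucketize(vals, edges))  # tensor([0, 1, 5, 5])

# right=True gives the first index i with val < edges[i];
# subtracting 1 yields left-closed bins [edges[i], edges[i + 1])
print(torch.bucketize(vals, edges, right=True) - 1)  # tensor([0, 1, 4, 5])
```

Either way, a value equal to the top edge lands at index 5, outside the 0..4 bucket range of a 5-cell grid.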
The trouble comes when attempting to update target, since each of its cells/buckets/elements may hold anywhere from 0 to n colors:
for ix in range(target.shape[0]):
    row_inds = torch.where(x_bucket_indices == ix)[0]
    for iy in range(target.shape[1]):
        col_inds = torch.where(y_bucket_indices == iy)[0]
        cell_inds = intersection(row_inds, col_inds)
        # Collect the color of the points that fall within this cell
        target[ix, iy, :] = pool_color(colors[cell_inds, :])
(Utilizing these two helper functions)
# Define how to "pool" the colors in each cell of the grid(/bucket)
def pool_color(colors):
    if len(colors) == 0:  # Empty cell; interpolate? (later)
        return torch.zeros((3,))
    if len(colors) == 1:  # Single point: its own color
        return colors[0]
    if len(colors) > 1:  # Average of all colors in the cell
        return torch.mean(colors, dim=0)
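For example, the three branches behave as follows (colors chosen arbitrarily for illustration):

```python
import torch

def pool_color(colors):
    if len(colors) == 0:  # Empty cell -> zero color
        return torch.zeros((3,))
    if len(colors) == 1:  # Single point -> its own color
        return colors[0]
    return torch.mean(colors, dim=0)  # Several points -> average color

print(pool_color(torch.empty((0, 3))))              # tensor([0., 0., 0.])
print(pool_color(torch.tensor([[1.0, 0.0, 0.0]])))  # tensor([1., 0., 0.])
print(pool_color(torch.tensor([[1.0, 0.0, 0.0],
                               [0.0, 1.0, 0.0]])))  # tensor([0.5000, 0.5000, 0.0000])
```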
# Define a helper function to find the intersection of unique vectors
def intersection(x, y):
    # x=[1, 2, 3, 4], y=[3, 4] -> [3, 4]
    combined = torch.cat((x, y))
    unq, cnt = combined.unique(return_counts=True)
    return unq[cnt > 1]
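A quick check of the helper — note the duplicate-count trick is only correct when each input is itself duplicate-free, which holds here because torch.where returns each index at most once:

```python
import torch

def intersection(x, y):
    # Concatenate both (duplicate-free) vectors; any value appearing
    # twice in the combined tensor must occur in both inputs
    combined = torch.cat((x, y))
    unq, cnt = combined.unique(return_counts=True)
    return unq[cnt > 1]

print(intersection(torch.tensor([1, 2, 3, 4]), torch.tensor([3, 4])))  # tensor([3, 4])
```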
Is there a way to further vectorize the operations inside the for loops so that they are performed on the GPU? I think one of the biggest issues is not having access to something akin to map in PyTorch, and the generation of a potentially very ragged/jagged intermediate tensor, depending on how pool_color is applied.
Yet the problem itself seems very parallelizable: given a set of indices, pull the corresponding colors from shared memory, apply pool_color, and write the result back to target in shared memory. I'm just not sure how to implement this any further in torch.
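For what it's worth, one direction I have sketched (only checked against the loop version on small inputs, and names like n_x/n_y/cell are my own) replaces the gather/pool/scatter loop with index_add_: flatten each (ix, iy) pair into a single cell id, accumulate per-cell color sums and point counts, then divide. Empty cells stay zero, matching pool_color, though the "interpolate later" idea is not covered:

```python
import torch

n = 100
colors = torch.rand((n, 3))
points = torch.randint(0, 10, (n, 2))

n_x, n_y = 5, 5
x_lo, x_hi = points[:, 0].min().item(), points[:, 0].max().item()
y_lo, y_hi = points[:, 1].min().item(), points[:, 1].max().item()
x_steps = torch.linspace(x_lo, x_hi, n_x + 1)
y_steps = torch.linspace(y_lo, y_hi, n_y + 1)

# Left-closed bins; the clamp puts points sitting on the max edge into the last bin
ix = (torch.bucketize(points[:, 0].float(), x_steps, right=True) - 1).clamp(0, n_x - 1)
iy = (torch.bucketize(points[:, 1].float(), y_steps, right=True) - 1).clamp(0, n_y - 1)

# One flat cell id per point, so a single scatter covers the whole grid
cell = ix * n_y + iy

# Accumulate per-cell color sums and point counts in one shot
sums = torch.zeros((n_x * n_y, 3)).index_add_(0, cell, colors)
counts = torch.zeros(n_x * n_y).index_add_(0, cell, torch.ones(n))

# Mean-pool; empty cells have zero sums, so clamping the divisor leaves them zero
target = (sums / counts.clamp(min=1)[:, None]).reshape(n_x, n_y, 3)
```

Since index_add_ and bucketize both run on CUDA tensors, moving colors/points (and the zero buffers) to the GPU should keep the whole thing device-side, if I understand the ops correctly.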