Can't come up with a way to compute a gradient that I need

Just for context: I am working on a NeRF-like model for dynamic scenes that, for a given point in space, outputs how occupied that point is in every frame of the recorded scene. So, if the recorded scene is a set of 300-frame videos (from different viewpoints), inference on a given XYZ coordinate returns 300 scalar values.

Now, here comes the problem. I need to get the normal vector of the surface at a given point in space, which is usually done by computing the gradient of the occupancy with respect to the coordinates.
This is simple for models that produce only one scalar value per XYZ coordinate (models for static scenes), but I can't come up with a way of doing it in my current scenario.
What I would usually do is something like

```python
# torch.autograd.grad returns a tuple, so take its first element
normal_vectors = -torch.autograd.grad(occupancies.sum(), coordinates)[0]
```

where `occupancies` is a tensor of size `[batch_size,300]` and `coordinates` is a tensor of size `[batch_size,3]`. This produces a tensor of the same size as `coordinates`, but I need a tensor of size `[batch_size,300,3]`: for each item in `coordinates`, 300 values are produced, and I want the gradient of each of those values with respect to the 3 components of that item.
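For what it's worth, recent PyTorch versions (2.0+) expose `torch.func`, which can build this `[batch_size, 300, 3]` Jacobian directly by composing `jacrev` (per-point Jacobian) with `vmap` (over the batch). A minimal sketch, where `model` is just a stand-in for the actual dynamic-scene network:

```python
import torch
from torch.func import jacrev, vmap

n_frames = 300
# Stand-in for the dynamic-scene model: 3 coordinates -> 300 occupancies.
model = torch.nn.Sequential(
    torch.nn.Linear(3, 64), torch.nn.Tanh(), torch.nn.Linear(64, n_frames)
)

def occupancy_fn(xyz):  # xyz: [3] -> [n_frames]
    return model(xyz)

coordinates = torch.randn(8, 3)

# jacrev builds the [n_frames, 3] Jacobian at one point;
# vmap maps that computation over the batch dimension.
normal_vectors = -vmap(jacrev(occupancy_fn))(coordinates)
print(normal_vectors.shape)  # torch.Size([8, 300, 3])
```

This avoids any Python loop over the 300 frames and keeps everything in one batched backward computation.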

The only approach I think may work (untested so far) is something like

```python
normal_vectors = []
for i in range(300):
    # gradient of frame i's occupancy w.r.t. the coordinates: [batch_size, 3]
    grad_i = torch.autograd.grad(
        occupancies[:, i].sum(), coordinates, retain_graph=True
    )[0]
    normal_vectors.append(-grad_i)
normal_vectors = torch.stack(normal_vectors, dim=1)  # [batch_size, 300, 3]
```

---

I can't offer any solution, but I want to point out some inconsistency in the way you describe your problem. You say that `occupancies` is a tensor of size `[batch_size,3]`, but later you index it with `occupancies[:,i]` where `i` can be up to 299, so it's a bit hard to vectorize the for loop when I'm not sure about the actual dimensions of the problem.
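Assuming the dimensions stated in the question (`occupancies` of size `[batch_size, 300]` from `coordinates` of size `[batch_size, 3]`), the loop can also be vectorized with `torch.autograd.grad(..., is_grads_batched=True)` (available since PyTorch 1.11), which vmaps the backward pass over a batch of `grad_outputs` vectors. A sketch with a stand-in model:

```python
import torch

n_frames = 300
batch_size = 8
# Stand-in for the dynamic-scene model: 3 coordinates -> 300 occupancies.
model = torch.nn.Sequential(
    torch.nn.Linear(3, 64), torch.nn.Tanh(), torch.nn.Linear(64, n_frames)
)

coordinates = torch.randn(batch_size, 3, requires_grad=True)
occupancies = model(coordinates)  # [batch_size, n_frames]

# One one-hot row per frame, broadcast across the batch:
# grad_outputs[i] selects output channel i for every sample.
grad_outputs = (
    torch.eye(n_frames)           # [n_frames, n_frames]
    .unsqueeze(1)                 # [n_frames, 1, n_frames]
    .expand(n_frames, batch_size, n_frames)
)

# is_grads_batched vmaps the backward pass over the leading dim of
# grad_outputs, producing all 300 gradients in a single call.
jac = torch.autograd.grad(
    occupancies, coordinates,
    grad_outputs=grad_outputs, is_grads_batched=True,
)[0]  # [n_frames, batch_size, 3]

normal_vectors = -jac.transpose(0, 1)  # [batch_size, n_frames, 3]
print(normal_vectors.shape)
```

The result matches the per-frame loop but replaces 300 separate backward passes with one batched one.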