Gradient descent on particle simulation parameters using PyTorch, is it feasible?

I’m not sure whether it’s possible at all, but here is what I have done so far.

I’m attempting to do neural style transfer on an image (Neural Transfer Using PyTorch — PyTorch Tutorials 2.4.0+cu121 documentation); however, the parameters are not the image itself but a set of hyperparameters for a particle-based simulation. These hyperparameters are passed to a Rust backend that simulates the particle interactions and returns an image; this image is then used to compute the style loss for optimizing the simulation hyperparameters.

This is what the class looks like for that:

class ParticleLife(torch.nn.Module):
    def __init__(self, forces, interaction_distances, particle_colors, background_color,
                 n_particles, num_types, image_size, grid_width, simulation_time, device="cuda:0"):
        super().__init__()
        # Learnable simulation hyperparameters
        self.f = torch.nn.Parameter(forces)
        self.in_dist = torch.nn.Parameter(interaction_distances)
        self.pc = torch.nn.Parameter(particle_colors)
        self.bg = torch.nn.Parameter(background_color)
        # Fixed simulation settings
        self.n_p = n_particles
        self.ntypes = num_types
        self.im_size = image_size
        self.g_width = grid_width
        self.st = simulation_time
        self.device = device
        self.image_arr = torch.ones((1, image_size[2], image_size[0], image_size[1]))

    def forward(self):
        bg = self.bg.to(int)
        # .tolist() hands plain Python lists to the Rust backend
        image = particleLifeStyles.renderless_simulate(
            self.n_p, self.ntypes, self.f.tolist(), self.in_dist.tolist(),
            self.pc.tolist(), self.im_size, self.g_width, bg.tolist(), self.st)
        # rebuild the returned image as an NCHW tensor scaled to [0, 1]
        img_arr = torch.tensor(image, dtype=torch.float32).reshape(
            (self.im_size[1], self.im_size[0], self.im_size[2])).transpose(0, 1).reshape(
            -1, self.im_size[2], self.im_size[1], self.im_size[0]) / 255
        img_arr.requires_grad_(True)
        self.image_arr = self.image_arr * 0 + img_arr
        return self.image_arr

and this is what the optimization code looks like:

import random

import cv2
import torch
from torchvision.models import vgg19, VGG19_Weights

import utils  # local helpers: get_style_model_and_losses, num_types_to_rgb, ParticleLife

DEVICE = "cuda:0"
OPTIM_STEPS = 1000
if __name__ == '__main__':
    """Simulation hyperparameters"""
    num_types = 10
    grid_width = 80
    image_size = [800, 800, 3]

    """Model and base style initialisation"""
    img_base = cv2.imread("knd.jpg")  # OpenCV loads HWC, BGR, uint8
    img_base = torch.tensor(cv2.resize(img_base, image_size[:2], interpolation=cv2.INTER_CUBIC),
                            dtype=torch.float32).permute(2, 0, 1).to(DEVICE)[None, ...] / 255  # HWC -> NCHW
    cv2.imshow("base", img_base.squeeze().permute(1, 2, 0).cpu().numpy())
    cv2.waitKey(1)
    cnn_normalization_mean = torch.tensor([0.485, 0.456, 0.406], device=DEVICE)
    cnn_normalization_std = torch.tensor([0.229, 0.224, 0.225], device=DEVICE)
    cnn = vgg19(weights=VGG19_Weights.DEFAULT).features.eval()  # weights is keyword-only
    model, style_losses = utils.get_style_model_and_losses(
        cnn, cnn_normalization_mean, cnn_normalization_std, img_base,
        ['conv_1', 'conv_2', 'conv_3', 'conv_4', 'conv_5'])

    model.eval().to(DEVICE)
    model.requires_grad_(False)


    """Simulation parameters to optimize"""
    forces = torch.tensor([[(random.random() - 0.5) * 2 for _ in range(num_types)] for _ in range(num_types)],
                          dtype=torch.float32, device=DEVICE, requires_grad=True)
    color_mat = torch.tensor(utils.num_types_to_rgb(num_types), dtype=torch.float32, device=DEVICE, requires_grad=True)
    background_color = torch.tensor([0.0, 0.0, 0.0], dtype=torch.float32, device=DEVICE, requires_grad=True)
    attraction_field_mat = torch.tensor([[random.uniform(grid_width / 2, grid_width) for _ in range(num_types)] for _ in range(num_types)],
                                        dtype=torch.float32, device=DEVICE, requires_grad=True)

    p_life_model = utils.ParticleLife(forces, attraction_field_mat, color_mat, background_color, 5000, num_types,image_size, grid_width, 500)

    """Optimizer"""

    optimizer = torch.optim.LBFGS(p_life_model.parameters())
    for i in range(OPTIM_STEPS):
        def closure():
            optimizer.zero_grad()
            img_arr = p_life_model()
            model(img_arr)                 # populates each StyleLoss module's .loss
            loss = 0
            for sl in style_losses:
                loss += sl.loss * 1000     # style weight
            loss.backward(retain_graph=True)
            print(f"Simulation step {i} loss is {loss.item()}")
            if i % 20 == 0:
                cv2.imshow("image", img_arr.detach().cpu().squeeze().permute(1, 2, 0).numpy())
                cv2.waitKey(1)
            return loss
        optimizer.step(closure)

Obviously, without the Rust backend, no one else can run the code. It does not throw any errors, but the parameters passed to the optimizer are not being changed in any way. I don’t know enough about autograd to tell what is happening, though I assume that somewhere in the ParticleLife class the parameters to be optimized are being disconnected from the graph.

The optimizer itself works: if I use an image directly as the parameter, that image eventually gets the style transferred to it.

Would there be a way for this to work, or is this simply not doable in pytorch?

Backpropagation works by walking back through the computational graph using backward methods that are already defined for most PyTorch operations. However, no backward is defined for your particle-based simulation. The backward method is what defines the gradient formula, which you would have to derive yourself (e.g. we know that the gradient of x^2 is 2*x, etc.); otherwise PyTorch has no way of knowing how to backpropagate through it.
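For concreteness, here is a toy sketch of what defining such a backward formula looks like, using the x^2 example (purely illustrative, nothing to do with the simulation):

```python
import torch

# Toy illustration of a hand-derived backward formula: a custom autograd
# Function for f(x) = x^2 whose gradient, 2*x, is written out manually.
class Square(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)        # stash what backward will need
        return x * x

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        return grad_output * 2 * x      # chain rule: dL/dx = dL/dy * 2x

x = torch.tensor(3.0, requires_grad=True)
Square.apply(x).backward()
print(x.grad)  # tensor(6.)
```

For the Rust simulation, the backward would have to encode the derivative of the entire simulation, which is the hard part.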

You could try to implement the particle-based simulation fully in PyTorch. In that case it will be possible to auto-differentiate through it, as long as you use only differentiable operations that have a backward; as far as I know, every PyTorch operator that is differentiable has one, but keep in mind that not all operators are differentiable in principle. If you need to backpropagate through Rust code, you can try this: GitHub - EnzymeAD/rust: A rust fork to work towards Enzyme integration
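As a rough sketch of what "fully in PyTorch" could mean: one pairwise-force update written only with differentiable tensor ops, so autograd can reach the force matrix. The force law, sizes, and all names here are made up for illustration, not the actual simulation:

```python
import torch

torch.manual_seed(0)

n, ntypes = 100, 3
pos = torch.rand(n, 2)                                     # particle positions
types = torch.randint(0, ntypes, (n,))                     # particle type ids
forces = torch.randn(ntypes, ntypes, requires_grad=True)   # parameter to optimize

diff = pos[None, :, :] - pos[:, None, :]        # diff[i, j] = pos[j] - pos[i], shape (n, n, 2)
dist = diff.norm(dim=-1).clamp(min=1e-3)        # pairwise distances, clamped to avoid /0
f = forces[types[:, None], types[None, :]]      # f[i, j] = forces[type_i, type_j]
step = (f[..., None] * diff / dist[..., None] ** 2).sum(dim=1)  # net force on each particle
new_pos = pos + 0.01 * step                     # one differentiable update

loss = new_pos.pow(2).sum()                     # any scalar loss on the positions
loss.backward()                                 # gradient reaches `forces`
```

Since every op above has a backward, `forces.grad` is populated without any hand-derived derivatives; the trade-off is the memory cost of keeping the whole rollout in the graph.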

I see. Could I not make my own autograd class and define the backward gradients?

Something like this:

class PLifeSim(torch.autograd.Function):

    @staticmethod
    def forward(ctx, n_p, ntypes, f, in_dist, pc, im_size, g_width, bg, st):
        img = particleLifeStyles.renderless_simulate(n_p, ntypes, f, in_dist, pc, im_size, g_width, bg, st)
        img_arr = torch.tensor(img, dtype=torch.float32).reshape(
            (im_size[1], im_size[0], im_size[2])).transpose(0, 1).reshape(
            -1, im_size[2], im_size[1], im_size[0]) / 255
        return img_arr

    @staticmethod
    def backward(ctx, grad_output):
        # backward must return one gradient per forward input
        # (None for the non-differentiable int/list arguments);
        # this placeholder just fills the force gradient with ones
        grad_f = grad_output.clone()
        grad_f[...] = 1.0
        return None, None, grad_f, None, None, None, None, None, None

However, I guess it’s near impossible to calculate gradients for a particle simulation: the image is just the particle positions, which are generated randomly when the forward pass is called and then simulated for n steps to determine their final positions in the image.

I guess I’ll use a gradient-free method.

Anyway, thanks a lot for the quick response!

Expanding upon @qq-me’s reply, you could define your own torch.autograd.Function object, but you’d have to manually define the derivatives (as you said) of the entire Rust simulation, which would be incredibly difficult to do.

It’d be best to try and see if you can write an equivalent in torch primitives and let autograd calculate the gradients itself.

If you want, you could try finite differences, but that will probably be quite noisy.
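A rough sketch of what finite differences could look like, wrapping a black-box call in a custom Function. `black_box` is a dummy stand-in (sum of squares) so the example runs; in the real setting it would be "simulate in Rust, then compute the style loss", costing one extra simulation per parameter per backward pass:

```python
import torch

# Estimate a black box's gradient by forward finite differences:
# grad_i ≈ (f(p + eps*e_i) - f(p)) / eps
def black_box(params: torch.Tensor) -> torch.Tensor:
    with torch.no_grad():                 # pretend autograd cannot see inside
        return (params ** 2).sum()

class FiniteDiffFn(torch.autograd.Function):
    @staticmethod
    def forward(ctx, params, eps):
        ctx.save_for_backward(params)
        ctx.eps = eps
        return black_box(params)

    @staticmethod
    def backward(ctx, grad_output):
        (params,) = ctx.saved_tensors
        base = black_box(params)
        grad = torch.zeros_like(params).view(-1)
        for i in range(params.numel()):   # bump each parameter in turn
            bumped = params.detach().clone().view(-1)
            bumped[i] += ctx.eps
            grad[i] = (black_box(bumped.view_as(params)) - base) / ctx.eps
        # one gradient per forward input: params gets one, eps gets None
        return grad.view_as(params) * grad_output, None

p = torch.tensor([1.0, 2.0], requires_grad=True)
FiniteDiffFn.apply(p, 1e-3).backward()
print(p.grad)  # close to the analytic gradient 2*p = [2, 4]
```

With thousands of simulation parameters this is expensive, and as noted, stochasticity in the simulation makes the estimates noisy.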

I’ve already written a sim using only PyTorch. I think the main problems are generating the image from the particle positions (and finding gradients for that image-generation step), and the fact that the graph gets excessively large with anything above 500 particles (I run out of GPU memory). Additionally, I’m not sure finite differences will work, since the simulation is extremely non-linear; there are conditions like maximum particle-neighbourhood counts and such.
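For the image-generation step specifically, one common trick is to splat each particle as a soft Gaussian blob instead of a hard pixel write, which keeps the renderer differentiable in the positions. A toy single-channel sketch (sizes and sigma are made-up illustrative values):

```python
import torch

H = W = 64           # made-up image size for the sketch
sigma = 1.5          # blob width in pixels
pos = torch.rand(50, 2, requires_grad=True)   # particle positions in [0, 1]

# pixel-coordinate grid, shape (H, W)
gy, gx = torch.meshgrid(torch.arange(H).float(), torch.arange(W).float(), indexing="ij")

px = pos[:, 0, None, None] * W   # (n, 1, 1) pixel-space x per particle
py = pos[:, 1, None, None] * H
# one soft Gaussian blob per particle, summed into a single channel
blobs = torch.exp(-((gx - px) ** 2 + (gy - py) ** 2) / (2 * sigma ** 2))
image = blobs.sum(dim=0).clamp(max=1.0)       # (H, W), differentiable in pos

image.sum().backward()            # gradients flow from pixels back to positions
```

It does not solve the graph-size problem, though: each blob is an n-by-H-by-W tensor, so memory still scales with particle count.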

Again, thank you for the feedback. I think I’ll just use an evolutionary algorithm as a gradient-free method instead.
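For reference, the gradient-free direction can be as simple as a (1+λ)-style evolution strategy: perturb the flat parameter vector, score each candidate with the black-box loss, and keep the best. `loss_fn` below is a dummy quadratic standing in for "simulate, render, compute style loss":

```python
import torch

torch.manual_seed(0)

def loss_fn(params: torch.Tensor) -> float:
    # dummy stand-in for: run the simulation, render, compute the style loss
    return ((params - 3.0) ** 2).sum().item()

params = torch.zeros(10)          # flat vector of simulation hyperparameters
best = loss_fn(params)            # initial loss: 90.0
for step in range(200):
    candidates = params + 0.5 * torch.randn(16, params.numel())  # 16 mutants
    scores = torch.tensor([loss_fn(c) for c in candidates])
    i = scores.argmin()
    if scores[i] < best:          # greedy (1+lambda)-style selection
        best, params = scores[i].item(), candidates[i]
print(best)                       # far below the initial 90
```

No gradients, no graph, and each candidate evaluation is an independent simulation run, so this parallelizes well; more refined variants (CMA-ES, OpenAI-style ES) adapt the mutation distribution instead of keeping sigma fixed.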