Is nn.utils.vector_to_parameters() not differentiable?

Thanks, this solved the problem!

I also found another solution (possibly with a little less overhead) in Hypernetwork implementation - #5 by ID56.
Inserting the weights and biases manually via torch.nn.functional.linear also works!
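
For anyone landing here later, this is a minimal sketch of the functional approach, not the exact code from the linked post. The names `hypernet`, `z`, and the layer sizes are made up for illustration. The idea is to slice the generated flat vector into a weight and a bias and pass them straight to torch.nn.functional.linear, so they stay in the autograd graph instead of being copied into an nn.Module's parameters.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Sizes of the target linear layer (hypothetical values for illustration)
in_features, out_features = 4, 3
n_weight = out_features * in_features
n_params = n_weight + out_features

# Hypothetical hypernetwork: maps an embedding z to a flat parameter vector
hypernet = nn.Linear(8, n_params)
z = torch.randn(8)

# The flat vector is a non-leaf tensor that carries gradient history
flat = hypernet(z)

# Slice and reshape instead of copying into module parameters
weight = flat[:n_weight].view(out_features, in_features)
bias = flat[n_weight:]

# Apply the generated weights functionally
x = torch.randn(5, in_features)
y = F.linear(x, weight, bias)

# Backpropagate through the generated weights into the hypernetwork
loss = y.pow(2).mean()
loss.backward()

print(hypernet.weight.grad is not None)
```

After `loss.backward()`, the hypernetwork's own parameters have gradients, which is the part that was missing when going through vector_to_parameters.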