Training a neural network to minimize a black-box cost function

I have a black box function g(p,q). It is evaluated using an external stimulator. For any given p , we want to find q that minimizes g(p,q). I want to model f(p) that generates the optimal values of q from p. I want to model f(p) as a neural network.
I am looking for directions on how to design the training process.
Both p and q are vectors with continuous values.

The closest training process I could think of was the cross-entropy method used in reinforcement learning. But, I think there could be a better way.