This sounds counterintuitive, but let me explain.

I have two `nn.Module`s; let's call them `model` and `input_generator`. `input_generator` accepts a data point `x` as input and outputs a vector `x_new`. `x_new` is in turn passed to `model`, and `model` is trained to minimize the cross-entropy loss between `model(x_new)` and `target`. In code:
```python
# assume: input variable 'x', class 'target', optimizer on model.parameters
x_new = input_generator(x)
model_optimizer.zero_grad()
logits = model(x_new)
loss = F.cross_entropy(logits, target)
loss.backward()  # this was missing; without it, step() has no gradients
model_optimizer.step()
```
So far so good. Now the question is: how is `input_generator` trained? Well, I want `input_generator` to generate inputs that minimize a different objective for `model`. Say we have a completely unrelated input `x_other`. I want `input_generator` to generate `x_new` inputs that minimize the cross-entropy loss between `model(x_other)` and `other_target`. Expanding on the previous snippet:
```python
x_new = input_generator(x)
model_optimizer.zero_grad()
logits = model(x_new)
loss = F.cross_entropy(logits, target)
loss.backward()
model_optimizer.step()

# assume: unrelated input 'x_other', class 'other_target', optimizer on input_generator.parameters
ig_optimizer.zero_grad()
other_logits = model(x_other)
other_loss = F.cross_entropy(other_logits, other_target)
other_loss.backward()
ig_optimizer.step()  # this won't work, of course
Now you see the problem: there is no way to optimize `input_generator` with my “off-policy” objective, because `input_generator` plays no role whatsoever in the creation of `x_other`.

So, my dear community, is there a way to say: “Please, `input_generator`, generate `x_new` in a way that minimizes `model`'s cross-entropy loss on a seemingly unrelated input, once `model` has taken a gradient step with your generated `x_new`”?
Thank you and all the best.