Here is an abstract version of my problem:
Assume we have two models, named Encoder and Decoder respectively, where the Encoder looks like:

```python
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, n, dim):
        super().__init__()
        self.embedding_A = nn.Embedding(n, dim)  # an embedding matrix with shape (n, dim)

    def forward(self):
        out_embeddings = ...  # do something with self.embedding_A, producing a new matrix of the same shape
        return out_embeddings
```
while the Decoder has the form:

```python
class Decoder(nn.Module):
    def __init__(self, n, dim):
        super().__init__()
        self.embedding_B = nn.Embedding(n, dim)  # same shape as the Encoder's output

    def forward(self):
        ...  # do something with self.embedding_B and return a loss for backward
```
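(As a small side note for clarity, with illustrative shapes: an nn.Embedding is a module, and the tensor that actually gets optimized is its .weight parameter, which is what any direct assignment would have to target.)

```python
import torch.nn as nn

emb = nn.Embedding(5, 3)
print(type(emb.weight))          # <class 'torch.nn.parameter.Parameter'>
print(emb.weight.shape)          # torch.Size([5, 3])
print(emb.weight.requires_grad)  # True
```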
Now I want the Decoder's self.embedding_B to take the output of the Encoder (i.e., the out_embeddings), and then run an end-to-end optimization from the Decoder's loss back to the Encoder's self.embedding_A.
I know an easy way would be to change the code and use the Encoder's result directly as an input to Decoder.forward(). However, my Decoder model is too complex, so making such a change is difficult.
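For reference, the straightforward rewiring I would like to avoid looks roughly like this (a minimal sketch with simplified stand-in classes; the extra input argument to Decoder.forward and the placeholder computations are hypothetical):

```python
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, n, dim):
        super().__init__()
        self.embedding_A = nn.Embedding(n, dim)

    def forward(self):
        # placeholder transformation; same shape as the weight
        return self.embedding_A.weight * 2.0

class Decoder(nn.Module):
    def forward(self, embeddings):  # hypothetical extra argument
        return embeddings.sum()     # placeholder loss

encoder, decoder = Encoder(10, 4), Decoder()
loss = decoder(encoder())           # gradients flow through out_embeddings
loss.backward()
print(encoder.embedding_A.weight.grad is not None)  # True: embedding_A receives gradients
```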
Is there a possible way to load the Encoder's output (i.e., the out_embeddings) into the Decoder's self.embedding_B and chain the two models together? In this way, the backward pass is expected to start from the Decoder's output, go back to self.embedding_B, and then go further back to optimize the Encoder's self.embedding_A.
I have tried code like:
```python
# *encoder* is an instance of class *Encoder*
# *decoder* is an instance of class *Decoder*
out_embeddings = encoder(...)
decoder.embedding_B.weight.data = out_embeddings.data      # my 1st try
decoder.embedding_B.weight = nn.Parameter(out_embeddings)  # my 2nd try
```
but neither makes the backward optimization reach encoder.embedding_A. Is there any way to do this with such a direct assignment?
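Here is a minimal sketch (with stand-in embeddings and a placeholder loss) of what I observe: assigning through .data bypasses autograd entirely, and wrapping a tensor in nn.Parameter creates a fresh leaf, so in both cases the graph is cut and no gradient reaches the encoder side:

```python
import torch.nn as nn

encoder_emb = nn.Embedding(10, 4)          # stands in for encoder.embedding_A
out_embeddings = encoder_emb.weight * 2.0  # stands in for encoder(...)'s output
decoder_emb = nn.Embedding(10, 4)          # stands in for decoder.embedding_B

# 1st try: .data assignment copies values but records nothing in the autograd graph
decoder_emb.weight.data = out_embeddings.data
loss = decoder_emb.weight.sum()            # placeholder loss
loss.backward()
print(encoder_emb.weight.grad)             # None: backward never reaches the encoder side

# 2nd try: nn.Parameter(...) creates a new leaf tensor, detached from the graph
decoder_emb.weight = nn.Parameter(out_embeddings)
loss = decoder_emb.weight.sum()
loss.backward()
print(encoder_emb.weight.grad)             # still None: the new leaf absorbs the gradient
print(decoder_emb.weight.grad is not None) # True: gradients stop at the new Parameter
```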