Transformer tgt during prediction

Hello everyone, hope all is well.

I am using transformers for audio processing, where I try to generate instrumentals from a mixture audio track (music source separation). During training and validation it makes sense to pass the mixture as the source and the instrument track as the target. During actual prediction, however, the model still requires a target, even though the target is exactly what I am trying to produce, and this is where I am confused. I've read answers on the forums here, but I'm still not quite sure what to do.
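To make the setup concrete, here is a minimal sketch. It assumes PyTorch's `nn.Transformer` with continuous audio frames as inputs; `FEATURES`, the all-zeros start frame, `training_step`, and `predict` are hypothetical names and choices for illustration, not my actual code. `training_step` is the teacher-forced case that makes sense to me. `predict` shows the autoregressive loop that, as far as I understand, is the usual way to run an encoder-decoder without a ground-truth target: start from a start frame and feed the model's own outputs back in as `tgt`.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

FEATURES = 16  # hypothetical size of one audio frame (e.g. spectrogram bins)

model = nn.Transformer(
    d_model=FEATURES, nhead=4,
    num_encoder_layers=1, num_decoder_layers=1,
    dim_feedforward=32, dropout=0.0, batch_first=True,
)

def training_step(mixture, instrument):
    """Teacher forcing: the decoder sees the true target shifted right by one frame."""
    start = torch.zeros(instrument.size(0), 1, FEATURES)  # assumed start frame
    tgt_in = torch.cat([start, instrument[:, :-1]], dim=1)
    # Causal mask so frame t cannot attend to frames after t.
    mask = model.generate_square_subsequent_mask(tgt_in.size(1))
    pred = model(mixture, tgt_in, tgt_mask=mask)
    return nn.functional.mse_loss(pred, instrument)

@torch.no_grad()
def predict(mixture, num_frames):
    """Autoregressive decoding: tgt is built from the model's own predictions."""
    model.eval()
    memory = model.encoder(mixture)  # encode the mixture once
    generated = torch.zeros(mixture.size(0), 1, FEATURES)  # start frame only
    for _ in range(num_frames):
        mask = model.generate_square_subsequent_mask(generated.size(1))
        out = model.decoder(generated, memory, tgt_mask=mask)
        # Keep only the newest predicted frame and append it to the decoder input.
        generated = torch.cat([generated, out[:, -1:]], dim=1)
    return generated[:, 1:]  # drop the start frame

mixture = torch.randn(2, 10, FEATURES)     # (batch, frames, features)
instrument = torch.randn(2, 10, FEATURES)
loss = training_step(mixture, instrument)
estimate = predict(mixture, num_frames=10)
```

Here `estimate` has the same shape as `instrument`, but it is produced without ever passing the real instrument track to the model.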

Thanks in advance for the help.