There are two main problems when creating Generative Adversarial Networks for text:

- Sampling discrete tokens is not differentiable: once argmax is applied, the result is a tensor of integer ids that is detached from the computation graph, so gradients cannot flow back to the generator.
- The language model may generate sequences of different lengths, so the discriminator must handle variable-length input, either via padding or via some fixed-size representation of the whole sequence.
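The first problem can be seen directly in a few lines of PyTorch (a minimal illustration, not tied to any particular model):

```python
import torch

# A stand-in for generator logits over a vocabulary
logits = torch.randn(3, 5, requires_grad=True)

# argmax returns integer token ids; these are detached from the graph,
# so no gradient can flow back through this operation
token_ids = logits.argmax(dim=-1)
print(token_ids.requires_grad)  # False
```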

I was wondering if it is possible to adversarially train a model such as T5. Its decoder produces a sequence of shape `[batch_size, seq_len, model_dim]`, which is then usually passed through a linear layer to get `[batch_size, seq_len, vocab_size]` logits. We can apply a softmax across the `vocab_size` dimension and feed these probabilities to a discriminator. For the ground-truth labels `[batch_size, seq_len]`, we can generate one-hot vectors of shape `[batch_size, seq_len, vocab_size]` and apply the softmax to them as well, so real and generated inputs share the same form. This should be sufficient for the discriminator to learn from, and since we never apply argmax to the generator's outputs, gradients should be able to reach the generator as well.
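This idea can be sketched as follows. The generator logits and the discriminator here are placeholders (random tensors and a tiny MLP) standing in for the T5 decoder output and a real discriminator; the point is only that gradients reach the generator because no argmax is taken:

```python
import torch
import torch.nn.functional as F

batch_size, seq_len, vocab_size = 4, 16, 100

# Stand-in for the T5 decoder + LM head output: [batch_size, seq_len, vocab_size]
gen_logits = torch.randn(batch_size, seq_len, vocab_size, requires_grad=True)

# "Fake" inputs: softmax over the vocab dimension keeps the graph differentiable
fake_probs = F.softmax(gen_logits, dim=-1)

# "Real" inputs: one-hot encode ground-truth ids, then softmax them too,
# so real and fake samples have the same smoothed representation
real_ids = torch.randint(0, vocab_size, (batch_size, seq_len))
real_probs = F.softmax(F.one_hot(real_ids, num_classes=vocab_size).float(), dim=-1)

# Toy per-token discriminator (hypothetical architecture, for illustration only)
disc = torch.nn.Sequential(
    torch.nn.Linear(vocab_size, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 1),
)

d_fake = disc(fake_probs).mean()
d_fake.backward()

# Gradients flow from the discriminator back into the generator logits
print(gen_logits.grad is not None)  # True
```

Note that softmaxing a one-hot vector yields a peaked but not exactly one-hot distribution, which is what makes the real inputs comparable to the generator's soft outputs.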

For the second problem, I was thinking of taking the mean over the sequence dimension to transform the `[batch_size, seq_len, vocab_size]` probabilities into a "sequence representation" of shape `[batch_size, vocab_size]`, but I am not sure if this makes sense.
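The pooling step itself is a one-liner (shapes chosen arbitrarily for illustration):

```python
import torch

batch_size, seq_len, vocab_size = 4, 16, 100
probs = torch.softmax(torch.randn(batch_size, seq_len, vocab_size), dim=-1)

# Average the per-token distributions over the sequence dimension:
# [batch_size, seq_len, vocab_size] -> [batch_size, vocab_size]
seq_repr = probs.mean(dim=1)
print(seq_repr.shape)  # torch.Size([4, 100])
```

One caveat worth noting: averaging over `seq_len` is order-invariant, so a discriminator that only sees this pooled representation cannot distinguish a fluent sentence from the same tokens in scrambled order.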

So, based on that: is **1** feasible, and if not, why? And what are some suggestions for solving **2**?