The error is raised because you are providing a single target value, while the model output contains 512 predictions.
Could you explain your use case a bit, i.e. what should your model try to predict?
If it’s a single prediction per sample, you would have to “reduce” the 512 values somehow.
E.g. if they represent the temporal dimension, you might want to use the last time step (or calculate the mean etc.). On the other hand, if you would like to get a prediction for each time step, your target should also contain labels for all of them.
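
Here is a minimal sketch of both options. I don't know your actual model or criterion, so the random `output` tensor, the batch size of 4, and `nn.MSELoss` are just placeholders, and I'm assuming the 512 values are the temporal dimension:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Assumed shapes: 4 samples, 512 values per sample (treated as time steps here)
batch_size, seq_len = 4, 512

# Stand-in for the model output; replace with model(input) in your code
output = torch.randn(batch_size, seq_len)

# Placeholder criterion; use whatever loss fits your task
criterion = nn.MSELoss()

# Option 1: a single prediction per sample -> reduce the 512 values
target_single = torch.randn(batch_size)   # one target value per sample
pred_last = output[:, -1]                 # use the last time step
pred_mean = output.mean(dim=1)            # or average over the time steps
loss_last = criterion(pred_last, target_single)
loss_mean = criterion(pred_mean, target_single)

# Option 2: a prediction for each time step -> target has the same shape as the output
target_seq = torch.randn(batch_size, seq_len)
loss_seq = criterion(output, target_seq)

print(pred_last.shape, pred_mean.shape, output.shape)
```

In both cases the prediction passed to the criterion has the same shape as the target, which is what the error is complaining about.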