Is it possible to create a model which predicts a list of values?
The problem is that the length of the list varies from sample to sample.
The input data is always one NumPy array of shape (1023, 256).
The labels are stored as CSV files: each input array has one CSV file whose rows hold the target values. The number of rows differs between files, though: one file has 100 rows, another only 90.

So the network gets one array and predicts a list of values.

Sure. Several types of models can do this, and the best choice depends on whether the values are spatially or temporally related. Do the values form a sequence that depends on time or position, or is each value independent of the others?
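
Whichever model type fits, the varying row count (100 vs. 90) has to be handled somewhere. One common option, shown here only as a rough sketch and not necessarily the best choice for your data, is to pad every label table to the same length and keep a mask so the padded rows do not contribute to the loss. The helper names `pad_targets` and `masked_mse` and the 3 label columns are assumptions made up for this illustration; your CSV files may have a different number of columns.

```python
import numpy as np

def pad_targets(targets):
    """Pad a list of (rows, cols) arrays to a common length and return a validity mask."""
    max_len = max(len(t) for t in targets)
    n_cols = targets[0].shape[1]
    padded = np.zeros((len(targets), max_len, n_cols))
    mask = np.zeros((len(targets), max_len), dtype=bool)
    for i, t in enumerate(targets):
        padded[i, :len(t)] = t      # real rows
        mask[i, :len(t)] = True     # True marks real rows, False marks padding
    return padded, mask

def masked_mse(pred, padded, mask):
    """Mean squared error computed over the real (unpadded) rows only."""
    sq_err = (pred - padded) ** 2
    return sq_err[mask].mean()

# Hypothetical label tables: 100 rows and 90 rows, 3 columns each.
targets = [np.ones((100, 3)), np.ones((90, 3))]
padded, mask = pad_targets(targets)

# A stand-in "prediction" of all zeros, just to exercise the loss.
pred = np.zeros_like(padded)
loss = masked_mse(pred, padded, mask)
```

With this layout the network always outputs a fixed `(max_len, cols)` block, and the mask decides which rows count. At inference time you would still need some way to decide how many rows are real, for example an extra output that predicts the row count.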

Either way, you should normalize your model inputs and targets.
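
A minimal sketch of that normalization, assuming per-column standardization (zero mean, unit variance) with statistics computed over the training set. The helper names, the random stand-in data, and the 4 label columns are assumptions for illustration only. Because the statistics are computed over all rows stacked together, label tables of different lengths are not a problem here.

```python
import numpy as np

def fit_standardizer(arrays):
    """Compute per-column mean and std over all rows of all arrays."""
    stacked = np.concatenate(arrays, axis=0)
    mean = stacked.mean(axis=0)
    std = stacked.std(axis=0)
    std = np.where(std == 0, 1.0, std)  # avoid division by zero for constant columns
    return mean, std

def standardize(x, mean, std):
    return (x - mean) / std

def unstandardize(z, mean, std):
    return z * std + mean

rng = np.random.default_rng(0)
# Hypothetical data: two inputs of shape (1023, 256), label tables with 100 and 90 rows.
inputs = [rng.normal(5.0, 3.0, size=(1023, 256)) for _ in range(2)]
targets = [rng.normal(-2.0, 10.0, size=(n, 4)) for n in (100, 90)]

x_mean, x_std = fit_standardizer(inputs)
y_mean, y_std = fit_standardizer(targets)

inputs_norm = [standardize(x, x_mean, x_std) for x in inputs]
targets_norm = [standardize(y, y_mean, y_std) for y in targets]

# Predictions made in normalized space are mapped back with the target statistics.
recovered = unstandardize(targets_norm[0], y_mean, y_std)
```

Keep the target statistics (`y_mean`, `y_std`) around after training, since you need them to convert the network's outputs back to the original units.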