How to handle variable-length sequences of real-valued vectors

Hi, I have many long documents, each labeled with gender. I map each sentence of a document into an embedding space, so each doc becomes a [num_of_sentences, 768] matrix. The shape can be anything from [1, 768] to [3000, 768]. The average is [1000, 768].

My question is: if I want to feed these matrices into an attention network using an LSTM, etc., how do I pad them, given that the expected input format is (seq_len, batch, input_size), the lengths differ so much, and the entries are real numbers?
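
To make the question concrete, here is a minimal sketch of one common approach: zero-pad every document to the longest one in the batch and keep the true lengths plus a boolean mask. It uses NumPy only and a toy embedding size of 8 instead of 768. The helper name `pad_batch` and its parameters are made up for illustration and are not part of any library.

```python
import numpy as np

def pad_batch(docs, pad_value=0.0):
    """Pad a list of [num_sentences, dim] arrays to a common length.

    Hypothetical helper. Returns (padded, lengths, mask), where padded has
    shape (max_len, batch, dim), i.e. the (seq_len, batch, input_size) layout.
    """
    lengths = np.array([d.shape[0] for d in docs])
    max_len, dim = int(lengths.max()), docs[0].shape[1]
    padded = np.full((max_len, len(docs), dim), pad_value, dtype=docs[0].dtype)
    for i, d in enumerate(docs):
        # Copy the real sentence embeddings; the rest stays at pad_value.
        padded[: d.shape[0], i, :] = d
    # mask[t, i] is True where position t of document i is a real sentence.
    mask = np.arange(max_len)[:, None] < lengths[None, :]
    return padded, lengths, mask

# Toy batch: three "documents" with 1, 5 and 3 sentences (dim 8, not 768).
rng = np.random.default_rng(0)
docs = [rng.standard_normal((n, 8)).astype(np.float32) for n in (1, 5, 3)]
padded, lengths, mask = pad_batch(docs)
print(padded.shape, lengths.tolist())
```

Since the embeddings are real-valued, a zero vector by itself is not a safe "padding marker", so the lengths or the mask are what tell the model which positions to ignore, for example by masking the attention scores. If the framework is PyTorch (an assumption, based on the (seq_len, batch, input_size) layout), `torch.nn.utils.rnn.pad_sequence` and `pack_padded_sequence` cover the same padding and length bookkeeping for an LSTM.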