How to separate data from different passages within one batch

Chance · December 24, 2017, 2:29am

I want to summarize passages using a GRU.
However, I currently have to load one passage at a time as the batch, which is slow.
Sentences from a different passage have to be handled as if by a fresh model (I mean they should not continue from the GRU state left by the previous passage’s last sentence), which is why I am feeding in one passage at a time.
So how can I deal with this problem? Should I set a label on each sentence to indicate which passage it belongs to, or record which sentences make up each passage?

Thanks a lot.

SimonW · December 24, 2017, 6:32pm

If I’m understanding correctly, you want to go from a list of sentences in a paragraph -> a summarizing sentence, right?

Are you putting all sentences in one paragraph (passage) as a batch? Why? Doesn’t that break the information flow between sentences?

Sorry I don’t really understand what you mean here. Why do you want a new model for each paragraph? Do you mean same model but different input data?

Could you explain again what the problem is? I don’t really understand the title. Sorry about this.

Chance · December 25, 2017, 12:14pm

yes! That’s what I want to do.

I want to feed several passages at once in a batch (for example, 64 passages) so I can process many passages in parallel on my GPU.
But I don’t know how to differentiate the sentences from different passages.
For example: sentence1 and sentence2 are from passage1, sentence3 and sentence4 are from passage2.
(Each sentence holds its words’ embeddings,
so my input dimensions are [batch_size, sentence_length, word_embedding])

[sentence1,
sentence2,
sentence3,
sentence4]
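
For reference, the layout described above can be sketched in PyTorch roughly as follows. The sizes and variable names are hypothetical, chosen only for illustration, and `batch_first=True` is an assumption made so that the GRU accepts the [batch_size, sentence_length, word_embedding] order used in this post.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
num_sentences, sentence_length, embedding_dim, hidden_size = 4, 5, 8, 16

# One entry per sentence: [batch_size, sentence_length, word_embedding].
# Random values stand in for real word embeddings.
sentences = torch.randn(num_sentences, sentence_length, embedding_dim)

# batch_first=True makes the GRU expect the batch dimension first.
gru = nn.GRU(embedding_dim, hidden_size, batch_first=True)

output, h_n = gru(sentences)

# output holds the hidden state at every time step of every sentence.
print(output.shape)  # torch.Size([4, 5, 16])
# h_n holds the final hidden state of each sentence; note that it is
# laid out as [num_layers, batch_size, hidden_size] even with batch_first=True.
print(h_n.shape)     # torch.Size([1, 4, 16])
```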

If I put all the sentences in one big matrix (Tensor) and feed it to the GRU (because I want to use the embedding of the passage),
then the GRU will connect them all together in one line:
sentence1 → sentence2 → sentence3 → sentence4

What I want to do is (1, 2 and 3, 4 are independent):
sentence1 → sentence2 (For passage1)
sentence3 → sentence4 (For passage2)

But I don’t know how, so I have to write a loop that inputs one passage at a time, which is much slower.

Thanks.

SimonW · December 25, 2017, 5:18pm

No. An RNN will not connect the sequences in a batch together; each sequence in the batch is processed independently, with its own hidden state. To connect sentence1~4 together, you would need to concatenate them along the sequence_length dimension.
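
A minimal sketch of both points, under some assumptions: the sizes are made up, `batch_first=True` is used to match the [batch_size, sentence_length, word_embedding] layout from the question, and every passage is assumed to have the same number of equally long sentences (variable lengths would need padding, which is not shown here). It first checks that a sentence gives the same output whether it is run alone or inside a batch, then concatenates each passage's sentences along the sequence dimension so that one batch entry corresponds to one passage.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
sentence_length, embedding_dim, hidden_size = 5, 8, 16
gru = nn.GRU(embedding_dim, hidden_size, batch_first=True)

# Four sentences: 1 and 2 belong to passage 1, 3 and 4 to passage 2.
sentences = torch.randn(4, sentence_length, embedding_dim)

with torch.no_grad():
    # (a) Batch entries do not influence each other: running sentence 1
    # alone gives the same output as running it inside the full batch.
    batched_out, _ = gru(sentences)
    alone_out, _ = gru(sentences[0:1])
    independent = torch.allclose(batched_out[0:1], alone_out, atol=1e-5)

    # (b) To chain the sentences of a passage, concatenate them along the
    # sequence dimension. With two sentences per passage, a reshape does
    # this: [4, L, E] -> [2, 2 * L, E], one row per passage.
    passages = sentences.reshape(2, 2 * sentence_length, embedding_dim)
    passage_out, passage_h = gru(passages)  # passage_h: [1, 2, hidden_size]

    # The batched result matches a Python loop over passages.
    loop_h = torch.cat([gru(p.unsqueeze(0))[1] for p in passages], dim=1)
    same_as_loop = torch.allclose(passage_h, loop_h, atol=1e-5)

print(independent, same_as_loop)
```

Here `passage_h[0, i]` is the final hidden state for passage `i`, which could serve as the passage embedding mentioned in the question.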

Chance · December 26, 2017, 2:07am

emmmm…
I get the point.
Thanks.