I am struggling with the following situation: I have to train an LSTM to generate series of bank transactions, and I would also like to feed the LSTM some information about the subject performing the operations. My ultimate goal, after training, is to feed the LSTM a vector containing the info about a subject (possibly together with a first operation) and then have it generate a sequence of operations.
My doubt is the following: the information about the subject is a single-row tensor, while the sequence of operations (of variable length) has multiple rows and different features, so the two tensors have different shapes. How can they be concatenated together, and how should they be fed into the network?
Let’s say I have:
subj_info = torch.tensor([26., 0., 1., 0.]) # tensor containing a bunch of info about the user
operation_series = torch.tensor([[0.0000e+00, 3.3305e-04, 1.6129e-02, 0.0000e+00],
[0.0000e+00, 3.3305e-04, 1.6129e-02, 0.0000e+00],
[0.0000e+00, 4.6638e-04, 2.2581e-02, 0.0000e+00],
[0.0000e+00, 3.3305e-04, 1.6129e-02, 0.0000e+00],
[0.0000e+00, 3.3305e-04, 1.6129e-02, 0.0000e+00],
[0.0000e+00, 3.3305e-04, 1.6129e-02, 0.0000e+00],
[1.0000e+00, 2.6664e-03, 0.0000e+00, 1.0000e+00],
[0.0000e+00, 3.3305e-04, 1.6129e-02, 0.0000e+00],
[1.0000e+00, 1.9997e-03, 0.0000e+00, 1.0000e+00],
[1.0000e+00, 3.4416e-04, 0.0000e+00, 1.0000e+00],
[1.0000e+00, 6.6638e-04, 0.0000e+00, 1.0000e+00],
[1.0000e+00, 9.9972e-04, 0.0000e+00, 1.0000e+00],
[0.0000e+00, 3.3305e-04, 1.6129e-02, 0.0000e+00],
[0.0000e+00, 3.3305e-04, 1.6129e-02, 0.0000e+00]])
# 2D tensor with L operations, each with n features
I would like to concatenate the two tensors to feed them into the LSTM, so that the network learns the sequences but also the info associated to the subject.
I already tried:
torch.cat([subj_info.unsqueeze(0), operation_series.unsqueeze(0)], dim=0)
but it doesn’t work because the two tensors have different shapes. Creating a new dimension and concatenating along it didn’t work either, and neither did torch.stack. Am I doing something wrong with the dimensions of the tensors?
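To make the failure concrete, here is a minimal reproduction of what I see (with a zero-filled stand-in for the real operation series):

```python
import torch

subj_info = torch.tensor([26., 0., 1., 0.])  # shape (4,)
operation_series = torch.zeros(14, 4)        # stand-in for the real series, shape (14, 4)

print(subj_info.unsqueeze(0).shape)         # torch.Size([1, 4])
print(operation_series.unsqueeze(0).shape)  # torch.Size([1, 14, 4])

# torch.cat requires tensors with the same number of dimensions
# (and torch.stack requires identical shapes), so this raises:
try:
    torch.cat([subj_info.unsqueeze(0), operation_series.unsqueeze(0)], dim=0)
except RuntimeError as e:
    print(e)
```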
For the moment I am concatenating subj_info to every operation, so my input data is:
input = torch.tensor([[26., 0., 1., 0., 0.0000e+00, 3.3305e-04, 1.6129e-02, 0.0000e+00],
[26., 0., 1., 0., 0.0000e+00, 3.3305e-04, 1.6129e-02, 0.0000e+00],
[26., 0., 1., 0., 0.0000e+00, 4.6638e-04, 2.2581e-02, 0.0000e+00],
[26., 0., 1., 0., 0.0000e+00, 3.3305e-04, 1.6129e-02, 0.0000e+00],
[26., 0., 1., 0., 0.0000e+00, 3.3305e-04, 1.6129e-02, 0.0000e+00],
[26., 0., 1., 0., 0.0000e+00, 3.3305e-04, 1.6129e-02, 0.0000e+00],
[26., 0., 1., 0., 1.0000e+00, 2.6664e-03, 0.0000e+00, 1.0000e+00],
[26., 0., 1., 0., 0.0000e+00, 3.3305e-04, 1.6129e-02, 0.0000e+00],
[26., 0., 1., 0., 1.0000e+00, 1.9997e-03, 0.0000e+00, 1.0000e+00],
[26., 0., 1., 0., 1.0000e+00, 3.4416e-04, 0.0000e+00, 1.0000e+00],
[26., 0., 1., 0., 1.0000e+00, 6.6638e-04, 0.0000e+00, 1.0000e+00],
[26., 0., 1., 0., 1.0000e+00, 9.9972e-04, 0.0000e+00, 1.0000e+00],
[26., 0., 1., 0., 0.0000e+00, 3.3305e-04, 1.6129e-02, 0.0000e+00],
[26., 0., 1., 0., 0.0000e+00, 3.3305e-04, 1.6129e-02, 0.0000e+00]])
but I don’t think this is optimal, because I suspect the LSTM won’t learn the operation features correctly with the subject info repeated at every timestep.
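For reference, this is how I build that repeated input programmatically (again with zeros standing in for the real operations):

```python
import torch

subj_info = torch.tensor([26., 0., 1., 0.])  # (4,)
operation_series = torch.zeros(14, 4)        # stand-in for the real (L, 4) series

# broadcast the subject row to every timestep, then concatenate feature-wise
repeated = subj_info.unsqueeze(0).expand(operation_series.size(0), -1)  # (14, 4)
inp = torch.cat([repeated, operation_series], dim=1)                    # (14, 8)
print(inp.shape)  # torch.Size([14, 8])
```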
Moreover, suppose I manage to concatenate them so that I have something like:
tensor([ [subj_info],
         [ [....],
           [....],
           ...
           [....] ] ])
How should I use such an input for the LSTM if I want it to “focus” only on the operation sequence?
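One alternative I have been considering (I am not sure it is the right approach, and all names here are my own) is not to put subj_info in the input features at all, but to project it into the LSTM’s initial hidden state, so that the input stays purely about the operations. A rough sketch of what I mean:

```python
import torch
import torch.nn as nn

class CondLSTM(nn.Module):
    """LSTM conditioned on subject info via its initial hidden state."""
    def __init__(self, subj_dim=4, op_dim=4, hidden=32):
        super().__init__()
        self.to_h0 = nn.Linear(subj_dim, hidden)   # subject info -> initial hidden state
        self.lstm = nn.LSTM(op_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, op_dim)      # predict the next operation's features

    def forward(self, subj_info, ops):
        # subj_info: (B, subj_dim), ops: (B, L, op_dim)
        h0 = torch.tanh(self.to_h0(subj_info)).unsqueeze(0)  # (1, B, hidden)
        c0 = torch.zeros_like(h0)
        out, _ = self.lstm(ops, (h0, c0))
        return self.head(out)                                # (B, L, op_dim)

model = CondLSTM()
subj = torch.tensor([[26., 0., 1., 0.]])  # (1, 4)
ops = torch.zeros(1, 14, 4)               # stand-in sequence
print(model(subj, ops).shape)             # torch.Size([1, 14, 4])
```

Would something like this be preferable to repeating the subject info at every timestep, or is there a standard way to do this conditioning?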
Thanks to everyone who can help.