My data has shape batch_size * num_paths * num_edges * emb_dim, where:
batch_size: the batch size (32)
num_paths: the number of paths between two given terms in a dependency tree
num_edges: the number of edges in a specific path
emb_dim: the embedding dimension (300)
Both num_paths and num_edges vary across training samples: every sample may have a different number of paths, and each path may have a different number of edges.
Because two of the dimensions (num_paths and num_edges) are variable, the data is a nested list in native Python rather than a tensor.
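For concreteness, here is a toy version of that structure (the values are placeholders, and emb_dim is shrunk to 3 for readability; in my real data it is 300):

    # Toy example: 2 samples, variable numbers of paths and edges.
    data = [
        [   # sample 1: 2 paths
            [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],   # path with 2 edges
            [[0.7, 0.8, 0.9]],                    # path with 1 edge
        ],
        [   # sample 2: 1 path
            [[1.0, 1.1, 1.2],                     # path with 3 edges
             [1.3, 1.4, 1.5],
             [1.6, 1.7, 1.8]],
        ],
    ]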
I want to pass each path of each training instance through an LSTM, since a path is a sequence of edges, and use the resulting state as that path's representation. After obtaining the representation for each path in a training example, I want to take the sum of all these representations, as sketched below.
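The loop-based version of what I am after looks roughly like this (edge_lstm and encode_batch are just names I made up for illustration, and I chose the final hidden state as the path representation):

    import torch
    import torch.nn as nn

    emb_dim, hidden_dim = 300, 128
    edge_lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)

    def encode_batch(data):
        """Sum of per-path LSTM representations for each training example."""
        example_reprs = []
        for paths in data:                    # loop over training examples
            path_reprs = []
            for path in paths:                # loop over variable-length paths
                # (1, num_edges, emb_dim): one path as a sequence of edges
                seq = torch.tensor(path).unsqueeze(0)
                _, (h_n, _) = edge_lstm(seq)  # h_n: (1, 1, hidden_dim)
                path_reprs.append(h_n.squeeze())
            example_reprs.append(torch.stack(path_reprs).sum(dim=0))
        return torch.stack(example_reprs)     # (batch_size, hidden_dim)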
I know I can pack the variable number of edges with
pack_padded_sequence, by padding every path to the longest one and passing the true lengths. But what about the variable number of paths? How do I account for that? Is there any way to do it in native PyTorch, without resorting to messy solutions like iterating in Python loops?
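For the edge dimension alone, this is roughly what I have in mind: padding one example's paths to a common length and packing them. But this still leaves me looping over examples to handle the path dimension:

    import torch
    from torch.nn.utils.rnn import pack_padded_sequence, pad_sequence

    def encode_paths(paths, lstm):
        """Encode all paths of ONE example in a single packed LSTM call."""
        tensors = [torch.tensor(p) for p in paths]        # each (num_edges_i, emb_dim)
        lengths = torch.tensor([t.size(0) for t in tensors])
        padded = pad_sequence(tensors, batch_first=True)  # (num_paths, max_edges, emb_dim)
        packed = pack_padded_sequence(padded, lengths, batch_first=True,
                                      enforce_sorted=False)
        # Assuming a single-layer, unidirectional LSTM:
        # h_n has shape (1, num_paths, hidden_dim)
        _, (h_n, _) = lstm(packed)
        return h_n.squeeze(0).sum(dim=0)                  # summed path representations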
Any help would be greatly appreciated!