Hi, I have a nasty for loop that I am trying to get around in my implementation. Given:
- A: a list of N vectors, each of shape [1, d_i]
- B: a list of N matrices, each of shape [d_i, M]

The outer dimensions of each vector-matrix pair are the same (equal to 1 and M respectively), but the inner dimension d_i differs from pair to pair. I want to multiply each pair and stack the resultant vectors so that I get a matrix C of size [N, M]:
C = torch.zeros(0, M)
for i in range(N):
    Ci = torch.mm(A[i], B[i])
    C = torch.cat((C, Ci), 0)
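For concreteness, here is a self-contained version of that loop with small toy shapes (N, M, and the per-pair inner dimensions are made-up example values, not from my actual model):

```python
import torch

torch.manual_seed(0)
N, M = 3, 4
dims = [2, 5, 3]  # hypothetical per-pair inner dimensions d_i

A = [torch.randn(1, d) for d in dims]  # N vectors of shape [1, d_i]
B = [torch.randn(d, M) for d in dims]  # N matrices of shape [d_i, M]

C = torch.zeros(0, M)
for i in range(N):
    Ci = torch.mm(A[i], B[i])   # [1, M]
    C = torch.cat((C, Ci), 0)   # stack the row vectors

print(C.shape)  # torch.Size([3, 4])
```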
I was wondering if it is possible to get rid of the for loop and somehow parallelise the computation of C.
One way to address the issue would be to turn the list of vectors A into a single matrix by padding with zeros. Let P be the sum of the inner dimensions over all pairs:
ABatch = torch.zeros(0, P)
pos = 0
for i in range(N):
    AiTemp = torch.cat((torch.zeros(1, pos), A[i]), 1)
    pos += A[i].size()[1]
    Ai = torch.cat((AiTemp, torch.zeros(1, P - pos)), 1)
    ABatch = torch.cat((ABatch, Ai), 0)
Then concatenate each matrix in the list B along the first dimension to get BBatch of size [P, M].
This approach helps because, once I have obtained ABatch of size [N, P] and BBatch of size [P, M], I can avoid the for loop in forward prop:
C = torch.mm(ABatch, BBatch)
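Putting the padded construction together with the same toy shapes as before (again, the sizes are purely illustrative), one can check that it reproduces the loop result:

```python
import torch

torch.manual_seed(0)
N, M = 3, 4
dims = [2, 5, 3]                     # hypothetical per-pair inner dims
A = [torch.randn(1, d) for d in dims]
B = [torch.randn(d, M) for d in dims]
P = sum(dims)

# Zero-padded batch of row vectors: row i holds A[i] at its block offset.
ABatch = torch.zeros(0, P)
pos = 0
for i in range(N):
    AiTemp = torch.cat((torch.zeros(1, pos), A[i]), 1)
    pos += A[i].size()[1]
    Ai = torch.cat((AiTemp, torch.zeros(1, P - pos)), 1)
    ABatch = torch.cat((ABatch, Ai), 0)

BBatch = torch.cat(B, 0)             # [P, M]
C = torch.mm(ABatch, BBatch)         # [N, M], one matmul instead of N

# Reference: the original per-pair loop.
C_loop = torch.cat([torch.mm(a, b) for a, b in zip(A, B)], 0)
print(torch.allclose(C, C_loop))  # True
```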
However, it would be great if there were an alternative that avoids materialising the unnecessarily large tensor ABatch.
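One possible alternative (a sketch of a different technique, not something from the padded approach above): since ABatch is block-structured, the same C can be computed without ever building it. Scale each row of BBatch by the matching entry of the concatenated A, then sum the rows belonging to the same pair with index_add_. The sizes below are again hypothetical:

```python
import torch

torch.manual_seed(0)
N, M = 3, 4
dims = [2, 5, 3]                      # hypothetical per-pair inner dims
A = [torch.randn(1, d) for d in dims]
B = [torch.randn(d, M) for d in dims]

Acat = torch.cat(A, 1).squeeze(0)     # [P], all vector entries in order
BBatch = torch.cat(B, 0)              # [P, M]

# Row r of BBatch belongs to pair idx[r]: [0,0, 1,1,1,1,1, 2,2,2].
idx = torch.repeat_interleave(torch.arange(N), torch.tensor(dims))

# Scale each row by its A entry, then sum rows per pair: C[n] = A[n] @ B[n].
C = torch.zeros(N, M).index_add_(0, idx, Acat.unsqueeze(1) * BBatch)

C_loop = torch.cat([torch.mm(a, b) for a, b in zip(A, B)], 0)
print(torch.allclose(C, C_loop))  # True
```

This only ever stores tensors of size [P] and [P, M], rather than the mostly-zero [N, P] matrix.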