Let's define:
a and b have shape (batch, sequence), with different sizes.
x = Em(a)
y = Em(b)
Does this operation preserve gradients?
x[bool_mask1] += y[bool_mask2]
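
One way to check this empirically is a small sketch like the one below. It assumes PyTorch and that Em is a torch.nn.Embedding (the question does not say so). The sizes and mask positions are made up for illustration, and both masks are assumed to select the same number of positions so that the shapes line up.

```python
import torch

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
vocab, dim = 10, 4
Em = torch.nn.Embedding(vocab, dim)  # assumption: Em is an embedding layer

a = torch.randint(0, vocab, (2, 5))  # (batch, sequence)
b = torch.randint(0, vocab, (2, 3))  # same batch, different sequence length

x = Em(a)  # shape (2, 5, 4), non-leaf tensor with a grad_fn
y = Em(b)  # shape (2, 3, 4)

# Boolean masks over (batch, sequence).
# Both must select the same number of positions (two each here).
bool_mask1 = torch.zeros(2, 5, dtype=torch.bool)
bool_mask1[0, 1] = True
bool_mask1[1, 4] = True
bool_mask2 = torch.zeros(2, 3, dtype=torch.bool)
bool_mask2[0, 0] = True
bool_mask2[1, 2] = True

# In-place masked update on a non-leaf tensor; autograd records it.
x[bool_mask1] += y[bool_mask2]

# Backpropagate a simple scalar and inspect the embedding gradient.
x.sum().backward()
grad = Em.weight.grad

# With loss = x.sum(), each embedding row should receive one unit of
# gradient per use in a, plus one per use in the masked positions of b.
expected_counts = (
    torch.bincount(a.flatten(), minlength=vocab)
    + torch.bincount(b[bool_mask2], minlength=vocab)
)
print(x.grad_fn is not None)
print(torch.allclose(grad[:, 0], expected_counts.float()))
```

In this sketch x is a non-leaf tensor, so the in-place update is tracked and the gradient reaches the embedding weights through both x and y. The same statement applied to a leaf tensor that requires grad would raise an error instead.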