I want to fuse the features from two convolutional layers by computing a weighted sum of their feature maps. Let F_1 and F_2 be the feature-map tensors, each of size (batch_size, 256, 4, 4). I need to compute F = a_1 * F_1 + a_2 * F_2, where a_k is:
a_k = exp(w'_k * f_k) / exp(w'_1 * f_1 + w'_2 * f_2)
where w_k is the column-scanned vector of W_k (the attention weighting for the k-th modality), and f_k is the column-scanned vector of F_k.
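
To make the computation concrete, here is a minimal NumPy sketch of the fusion described above. It rests on several assumptions that the question does not state: that w'_k means the transpose of w_k (so w'_k * f_k is a dot product giving one scalar per sample), that each W_k has shape (256, 4, 4) and is shared across the batch, and that a_k is a single scalar per sample broadcast over the whole feature map. The function name `attention_fusion` and the `softmax` flag are hypothetical. By default the formula is taken literally, with the denominator exp(w'_1 * f_1 + w'_2 * f_2); setting `softmax=True` instead divides by the sum of the two exponentials, which may be what was intended if the weights should sum to 1.

```python
import numpy as np

def attention_fusion(F1, F2, W1, W2, softmax=False):
    """Fuse two feature maps of shape (batch, C, H, W) using scalar weights a_1, a_2.

    W1, W2 are assumed to have shape (C, H, W) and to be shared across the batch.
    """
    B = F1.shape[0]
    # Flatten each sample to a vector f_k and each weight tensor to a vector w_k.
    # The same scan order is used for both, so the dot product is unaffected.
    f1, f2 = F1.reshape(B, -1), F2.reshape(B, -1)
    w1, w2 = W1.reshape(-1), W2.reshape(-1)

    # Scores s_k = w_k' f_k, one scalar per sample: shape (B,)
    s1, s2 = f1 @ w1, f2 @ w2

    if softmax:
        # Assumed variant: a_k = exp(s_k) / (exp(s_1) + exp(s_2)),
        # with the maximum subtracted first for numerical stability.
        m = np.maximum(s1, s2)
        e1, e2 = np.exp(s1 - m), np.exp(s2 - m)
        a1, a2 = e1 / (e1 + e2), e2 / (e1 + e2)
    else:
        # Formula as written: a_k = exp(s_k) / exp(s_1 + s_2)
        a1 = np.exp(s1 - (s1 + s2))
        a2 = np.exp(s2 - (s1 + s2))

    # Reshape to (B, 1, 1, 1) so each scalar broadcasts over its feature map.
    a1, a2 = a1.reshape(B, 1, 1, 1), a2.reshape(B, 1, 1, 1)
    return a1 * F1 + a2 * F2, a1, a2

# Usage with random placeholder data (batch_size = 8 is arbitrary).
rng = np.random.default_rng(0)
F1 = rng.standard_normal((8, 256, 4, 4))
F2 = rng.standard_normal((8, 256, 4, 4))
W1 = 0.01 * rng.standard_normal((256, 4, 4))
W2 = 0.01 * rng.standard_normal((256, 4, 4))
F, a1, a2 = attention_fusion(F1, F2, W1, W2)
print(F.shape)  # (8, 256, 4, 4)
```

The same reshape, dot product and broadcast steps should carry over to a deep learning framework's tensor operations, where W_1 and W_2 would be trainable parameters rather than fixed arrays.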