Greetings from Italy!
So, I wrote a custom loss function to train my autoencoder. The problem is the following:
the function I'm using was written by someone else and, for my purposes, it needs to work on vectors.
For instance, if I have an n x m dataset, where
n is the number of samples, and
m is the number of features, I need the function to run on every possible pair of features: so I'll pass it features 0 and 1, then 0 and 2, and so on (see the toy snippet below).
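To make the access pattern concrete, here is a toy illustration (the commented-out function call is just a hypothetical placeholder, not the real function):

```python
import torch
from itertools import combinations

data = torch.randn(100, 5)   # n = 100 samples, m = 5 features

# every unordered pair of feature columns: (0, 1), (0, 2), ..., (3, 4)
for i, j in combinations(range(data.size(1)), 2):
    x = data[:, i:i + 1]     # column i as an [n, 1] tensor
    y = data[:, j:j + 1]     # column j as an [n, 1] tensor
    # some_pairwise_function(x, y) would be called here (hypothetical name)
```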
So I wrote a double for loop to check whether the function would work inside torch. The problem is that this double for loop is a huge bottleneck for my model, and training is really slow.
I tried to optimize the double-for-loop + function section and vectorize it, but I wasn't able to.
This is the code:
```python
def rbf_dot(self, pattern1, pattern2, deg):
    # Gaussian (RBF) kernel matrix between the rows of pattern1 and pattern2
    size1 = pattern1.size(0)
    size2 = pattern2.size(0)
    G = torch.sum(pattern1 * pattern1, dim=1).reshape(size1, 1)
    H = torch.sum(pattern2 * pattern2, dim=1).reshape(size2, 1)
    Q = torch.tile(G, (1, size2))
    R = torch.tile(torch.transpose(H, 0, 1), (size1, 1))
    H = Q + R - 2 * torch.matmul(pattern1, torch.transpose(pattern2, 0, 1))
    H = torch.exp(-(H / 2) / (deg ** 2))
    return H

def function(self, X, Y):
    """
    X, Y : torch tensors of size [n, 1]
    n : number of observations
    """
    n = X.size(0)

    # ----- width of X (median heuristic on the pairwise squared distances) -----
    Xmed = X
    G = torch.sum(Xmed * Xmed, dim=1).reshape(n, 1)
    Q = torch.tile(G, (1, n))
    R = torch.tile(torch.transpose(G, 0, 1), (n, 1))
    dists = Q + R - 2 * torch.matmul(Xmed, torch.transpose(Xmed, 0, 1))
    dists = dists - torch.tril(dists)   # keep the strictly upper-triangular part
    dists = dists.reshape(n ** 2, 1)
    width_x = torch.sqrt(0.5 * torch.median(dists[dists > 0]))

    # ----- width of Y -----
    Ymed = Y
    G = torch.sum(Ymed * Ymed, dim=1).reshape(n, 1)
    Q = torch.tile(G, (1, n))
    R = torch.tile(torch.transpose(G, 0, 1), (n, 1))
    dists = Q + R - 2 * torch.matmul(Ymed, torch.transpose(Ymed, 0, 1))
    dists = dists - torch.tril(dists)
    dists = dists.reshape(n ** 2, 1)
    width_y = torch.sqrt(0.5 * torch.median(dists[dists > 0]))

    # ----- centered kernel matrices and the statistic -----
    H = torch.eye(n, device=dists.device) - (torch.ones(n, n, device=dists.device) / n)
    K = self.rbf_dot(X, X, width_x)
    L = self.rbf_dot(Y, Y, width_y)
    Kc = torch.matmul(torch.matmul(H, K), H)
    Lc = torch.matmul(torch.matmul(H, L), H)
    testStat = torch.sum(torch.transpose(Kc, 0, 1) * Lc) / n
    return testStat

def double_for_loop_section(self, data):
    # n : rows, observations
    # m : columns, features
    # so that: a row represents all the features of a single user;
    #          a column represents the same feature for all users
    n, m = data.size()
    stat = torch.zeros(m, m, device=data.device)
    for i in range(m):
        for j in range(i, m):
            if i != j:
                x = data[:, i]
                y = data[:, j]
                x_2D = x[:, None]
                y_2D = y[:, None]
                stat[i, j] = self.function(x_2D, y_2D)
                # the matrix is symmetric
                stat[j, i] = stat[i, j]
    return stat
```

(Note: in rbf_dot I use pattern1.size(0) and pattern2.size(0); the full .size() returns a torch.Size, which breaks the reshape and tile calls.)
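For reference, this is roughly how I exercise it in isolation (the Toy holder class and the sizes are made up; the three functions above are assumed to be defined at module level):

```python
import torch

class Toy:
    pass

# attach the three functions above as methods of the minimal holder class
Toy.rbf_dot = rbf_dot
Toy.function = function
Toy.double_for_loop_section = double_for_loop_section

data = torch.randn(256, 32)                # e.g. 256 samples, 32 hidden units
stat = Toy().double_for_loop_section(data)
print(stat.shape)                          # torch.Size([32, 32]); symmetric, zero diagonal
```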
The double_for_loop_section method is called inside the loss-function method, passing it the hidden-layer output of the encoder.
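The furthest I got on my own is the rough sketch below: computing each per-feature centered kernel once and collapsing all the pairwise sums into a single matmul. I'm not sure it is faithful to the original (the median heuristic here keeps zero distances from tied values, unlike the dists[dists > 0] filter above), so please treat it as a guess rather than a working solution:

```python
import torch

def pairwise_stats_sketch(data):
    # data : [n, m]; returns a symmetric [m, m] matrix of pairwise statistics
    n, m = data.shape
    X = data.T.unsqueeze(-1)                               # [m, n, 1]

    # squared pairwise distances for every feature at once: [m, n, n]
    d2 = (X - X.transpose(1, 2)) ** 2

    # median heuristic per feature over the strict upper triangle
    # (caveat: unlike dists[dists > 0], this keeps zeros from tied values)
    iu = torch.triu_indices(n, n, offset=1)
    upper = d2[:, iu[0], iu[1]]                            # [m, n*(n-1)/2]
    width = torch.sqrt(0.5 * upper.median(dim=1).values)   # [m]

    K = torch.exp(-d2 / (2 * width[:, None, None] ** 2))   # [m, n, n] RBF kernels

    # center every kernel matrix: Kc_i = H @ K_i @ H (matmul broadcasts over m)
    H = torch.eye(n, device=data.device) - 1.0 / n
    Kc = H @ K @ H                                         # [m, n, n]

    # stat[i, j] = sum(Kc_i * Kc_j) / n, for all pairs in one matmul
    F = Kc.reshape(m, n * n)
    stat = (F @ F.T) / n

    # the loop version leaves the diagonal at zero, so zero it out here too
    stat = stat * (1.0 - torch.eye(m, device=data.device))
    return stat
```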
Could someone help me vectorize it properly?