Sample selection in transfer learning

Alessandro_Paticchio · January 19, 2020, 3:06pm

Hello, I am working on transfer learning, trying to reduce the number of samples to use in the target training set.
To do so, I am reimplementing the instance selection algorithm from the paper “Instance Based Deep Transfer Learning” (Wang et al., 2018).

Basically, for each training sample of the target set, you compute the Jacobian (gradient) of the loss w.r.t. all the parameters and multiply it by the Hessian of the loss (averaged over all the training samples). You then multiply that product by the negated gradient of the loss for each validation sample and accumulate the results over the validation set. If the final sum is positive, you discard that training sample.
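
Written out in symbols, the description above amounts to the following score for training sample $z_i$. The notation ($\theta$ for the parameters, $\ell$ for the loss, $H$, $s_i$, $n$ training samples, $m$ validation samples) is my own shorthand for this post and is not copied from the paper:

```latex
H = \frac{1}{n}\sum_{k=1}^{n} \nabla_\theta^{2}\, \ell\!\left(z_k^{\mathrm{train}}; \theta\right),
\qquad
s_i = -\sum_{j=1}^{m} \nabla_\theta \ell\!\left(z_j^{\mathrm{val}}; \theta\right)^{\top} H \, \nabla_\theta \ell\!\left(z_i^{\mathrm{train}}; \theta\right)
```

Sample $i$ is kept when $s_i \le 0$ and discarded when $s_i > 0$.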

My implementation is as follows:

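import numpy as np
import torch
from tqdm import tqdm

# Note: compute_hessian is a helper defined elsewhere in my code (not shown here)
def instance_selection(model, X_train, y_train, X_valid, y_valid, criterion):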
    # Computation of Hessian
    print('Computing Hessian matrix...', end='')
    hessian_matrix = compute_hessian(model, X_train, y_train, criterion)
    print('done!')

    selected_indices = []


    for i, train_sample in enumerate(tqdm(X_train, desc='Instance selection')):
        # Computing jacobian of the loss in the training sample
        model.zero_grad()
        train_sample = np.expand_dims(train_sample, axis=0)
        train_sample = torch.Tensor(train_sample)
        label = y_train[i]
        output = model(train_sample)
        label = torch.LongTensor([label])
        loss = criterion(output, label)
        # This step will populate the parameters of the network with the gradients
        loss.backward()

        # Iteration on the parameters of the network to get all the gradients
        jacobian_i = []
        for param in model.parameters():
            jacobian_i.append(param.grad.view(-1).detach().cpu().numpy())
        jacobian_i = np.concatenate(jacobian_i).ravel()
        
        # Computing intermediate product between Hessian and Jacobian of training sample
        intermediate = np.matmul(hessian_matrix, np.transpose(jacobian_i))
        j_loss = 0

        for j, valid_sample in enumerate(X_valid):
            # Computing jacobian of the loss in the validation sample
            model.zero_grad()
            valid_sample = np.expand_dims(valid_sample, axis=0)
            valid_sample = torch.Tensor(valid_sample)
            label = y_valid[j]
            output = model(valid_sample)
            label = torch.LongTensor([label])
            loss = criterion(output, label)
            loss.backward()

            jacobian_j = []
            for param in model.parameters():
                jacobian_j.append(param.grad.view(-1).detach().cpu().numpy())
            jacobian_j = np.concatenate(jacobian_j).ravel()
            
            # Computing the multiplication between the jacobian of the validation sample
            # and the intermediate matrix
            j_loss += np.matmul((jacobian_j*(-1)), intermediate)
        
        # If j_loss is positive, I won't keep the sample in the training set
        if j_loss <= 0:
            selected_indices.append(i)

    return selected_indices
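
One thing I noticed while writing this: the score is linear in the validation gradients, so the inner loop over the validation set could in principle be replaced by summing the validation gradients once and reusing that sum for every training sample. Below is a minimal NumPy sketch of that idea. It uses random stand-in arrays instead of a real model, and the names `H`, `g_train`, `g_valid`, `scores_nested` and `scores_summed` are hypothetical placeholders, not part of my code above.

```python
import numpy as np

rng = np.random.default_rng(0)
n_params, n_train, n_valid = 5, 4, 3

# Random stand-ins for the real quantities (hypothetical data, not from a model)
H = rng.normal(size=(n_params, n_params))       # averaged Hessian
g_train = rng.normal(size=(n_train, n_params))  # one flattened gradient per training sample
g_valid = rng.normal(size=(n_valid, n_params))  # one flattened gradient per validation sample


def scores_nested(H, g_train, g_valid):
    """Score each training sample with an explicit loop over validation samples."""
    scores = np.zeros(len(g_train))
    for i, g_i in enumerate(g_train):
        intermediate = H @ g_i
        for g_j in g_valid:
            scores[i] += (-g_j) @ intermediate
    return scores


def scores_summed(H, g_train, g_valid):
    """Same scores, with the validation gradients summed once up front."""
    g_valid_sum = g_valid.sum(axis=0)
    # Row i of (g_train @ H.T) equals H @ g_train[i]
    return -(g_train @ H.T) @ g_valid_sum


nested = scores_nested(H, g_train, g_valid)
summed = scores_summed(H, g_train, g_valid)

# Same selection rule as above: keep samples whose score is non-positive
selected_indices = np.flatnonzero(summed <= 0).tolist()
```

If the two functions agree (up to floating-point error), the validation gradients would only need to be computed once, outside the loop over training samples, instead of once per training sample.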

Has anyone already done something similar, or can anyone give me feedback on whether the implementation is correct?
Thanks a lot!