LSTM feature importance

I have a model trained on 16 features, with a seq_len of 120 and a batch size of 256.

I would like to evaluate the model's loss on a test set while replacing one feature at a time with random samples from a normal distribution, so I can measure how important each feature is (an important feature should cause a large rise in loss when it is randomly sampled).

Any advice on how to manipulate one feature at a time? The shape of the input is (256, 120, 16).

h = model.init_hidden(batch_size)
with torch.no_grad():
    for inp, labels in loader:
        inp, labels = inp.cuda(), labels.cuda()

        outputs, h = model(inp, h)
        _, predicts = torch.max(outputs, 1)

Try inp[:, :, F].normal_(), where F is the index of the feature you want to manipulate.

Sample code:
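To make the suggestion concrete, here is a minimal sketch of the in-place call on a dummy batch. The helper name perturb_feature and the small batch of zeros are made up for illustration; it works on a copy so the original batch stays untouched.

```python
import torch

def perturb_feature(inp, feature, mean=0.0, std=1.0):
    """Return a copy of inp with one feature replaced by Gaussian noise."""
    out = inp.clone()
    # Indexing with slices and an int gives a view, so normal_() writes into out
    out[:, :, feature].normal_(mean, std)
    return out

torch.manual_seed(0)
inp = torch.zeros(4, 120, 16)          # (batch, seq_len, features)
perturbed = perturb_feature(inp, feature=3)
```

Only the values at feature index 3 change; every other feature and the original inp are left as they were.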

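import numpy as np

for curFeature in range(X.shape[1]):
    print("Size of X is {0}".format(X.shape))
    # Alter the feature on a copy of X before you calculate the loss
    X_perturbed = X.copy()
    X_perturbed[:, curFeature] = np.random.normal(mu, sigma, X.shape[0])
    loss = compute_loss(X_perturbed)  # placeholder: evaluate your model on the perturbed input
    print("The loss value for feature {0} is {1}".format(curFeature, loss.item()))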

Hope this helps.

Thanks a lot.

I went with the following, looping over the features outside the function in order to log every feature’s performance.

import numpy as np
import torch
import torch.nn as nn

def accuracy_test(model, loader, feature):

    batch_size = loader.batch_size
    mu = 0
    sigma = 0.01
    train_on_gpu = torch.cuda.is_available()
    print("Running on gpu: {}".format(train_on_gpu))
    losses = []

    criterion = nn.CrossEntropyLoss()
    h = model.init_hidden(batch_size)
    with torch.no_grad():
        for inp, labels in loader:
            # Replace the chosen feature with Gaussian noise on a copy of the batch
            perturbed_inp = inp.clone()
            noise = torch.randn(inp.shape[0], inp.shape[1]) * sigma + mu
            perturbed_inp[:, :, feature] = noise.to(inp.dtype)
            inp = perturbed_inp
            if train_on_gpu:
                inp, labels = inp.cuda(), labels.cuda()
            # Assumes the model expects (seq_len, batch, features).
            # transpose swaps the axes; reshape would scramble the values instead.
            inp = inp.transpose(0, 1)
            outputs, h = model(inp.float(), h)
            loss = criterion(outputs, labels.long())
            losses.append(loss.item())  # record each batch loss so the mean is defined
    test_loss = np.mean(losses)
    return test_loss
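For the outer loop over features, something like the sketch below may be useful. It compares each perturbed loss against the unperturbed baseline, so the numbers read as "loss increase per feature". The names feature_importance and toy_loss are hypothetical, and toy_loss is only a stand-in for a trained model (its label depends on feature 0 alone), not the LSTM from this thread.

```python
import torch
import torch.nn as nn

def feature_importance(loss_fn, inp, labels, n_features, std=1.0):
    """Loss increase over the unperturbed baseline for each feature."""
    with torch.no_grad():
        baseline = loss_fn(inp, labels).item()
        scores = []
        for f in range(n_features):
            perturbed = inp.clone()
            perturbed[:, :, f].normal_(0.0, std)  # noise in feature f only
            scores.append(loss_fn(perturbed, labels).item() - baseline)
    return baseline, scores

# Toy data: the label depends only on feature 0 (assumption for the demo)
torch.manual_seed(0)
inp = torch.randn(256, 120, 16)
labels = (inp[:, :, 0].mean(dim=1) > 0).long()

def toy_loss(x, y):
    # Stand-in "model": logits built from the mean of feature 0 over time
    score = x[:, :, 0].mean(dim=1) * 50
    logits = torch.stack([-score, score], dim=1)
    return nn.functional.cross_entropy(logits, y)

baseline, scores = feature_importance(toy_loss, inp, labels, n_features=16)
for f, s in enumerate(scores):
    print("feature {0}: loss increase {1:.4f}".format(f, s))
```

In this toy setup only feature 0 should show a loss increase, since the other features never enter the stand-in model. With a real model you would pass a loss_fn that runs the forward pass and the criterion, and average over all batches in the loader.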