Remove connections between layers

I'm trying to freeze 200 random connections between my second fully connected layer (FC2, with 200 neurons) and my third fully connected layer (FC3, with 200 neurons), and I was wondering if the following code I wrote is correct. My intention is to pick a random row, then pick a random column, and detach that connection from the computation graph.


import random

frozen_rows = []

def remove_random_connections():
    for child_number, child in enumerate(net.children()):
        if child_number == 1:  # second layer (FC2 -> FC3 weights)
            for j in range(1000):
                i = random.randint(0, 199)  # random row
                k = random.randint(0, 199)  # random column
                if i not in frozen_rows:
                    frozen_rows.append(i)
                    params = list(child.parameters())
                    params[0].grad.data[i][k] = 0.0
  



This is almost certainly not quite ideal algorithmically. Some comments:

  • I would recommend using net[1] (if your model is nn.Sequential) or net.name_of_your_layer instead of enumerating the children and picking one. Similarly for the parameters (you presumably want .weight).
  • If you want to do something in all 200 rows (mind that torch.randint excludes the upper limit, unlike Python's random.randint), you can just enumerate the rows and then pick the columns at random. You can use advanced indexing to speed this up. (I use torch.no_grad instead of .data, because I got an error when using .data with indexing and assignment.) Here is a toy example with a random matrix instead of net[1].weight.grad:
import torch

a = torch.randn(5, 5)
print(a)
i = torch.arange(5)              # one index per row
j = torch.randint_like(i, 0, 5)  # a random column for each row
with torch.no_grad():
    a[i, j] = 0                  # zero one entry per row
print(a)

If you want the columns to be distinct as well, you can use torch.randperm for j instead.
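
Putting those suggestions together, here is a minimal sketch of how this might look against an actual layer. The model layout (a three-layer nn.Sequential with a 200x200 middle layer at index 1) is only an assumption standing in for your network; the zeroing would typically happen after loss.backward() and before optimizer.step():

import torch
import torch.nn as nn

# assumed stand-in for your network; net[1] plays the role of the FC2 -> FC3 weights
net = nn.Sequential(nn.Linear(100, 200), nn.Linear(200, 200), nn.Linear(200, 10))

# dummy forward/backward pass so that .grad is populated
net(torch.randn(8, 100)).sum().backward()

fc = net[1]                           # the 200x200 layer
rows = torch.arange(200)              # every row exactly once
cols = torch.randperm(200)            # a distinct random column for each row
with torch.no_grad():
    fc.weight.grad[rows, cols] = 0.0  # zero one gradient entry per row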

Best regards

Thomas


Thank you for your help

Also, it does not seem like you are removing any connections, so the name of the function and the title of the topic are confusing. You are freezing connections, whereas removing would be understood as setting the weights to zero, not the gradients.
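
To make that distinction concrete, here is a small sketch (the 200x200 nn.Linear called fc and the indices i, k are just placeholders): zeroing the weight takes the connection out of the forward pass, while zeroing the gradient leaves the weight in place but stops the optimizer from updating it.

import torch
import torch.nn as nn

fc = nn.Linear(200, 200)   # placeholder for the FC2 -> FC3 layer
i, k = 3, 17               # arbitrary row/column indices

# "Removing" the connection: the weight is zero, so it no longer
# contributes to the forward pass (though it can still be updated
# later unless its gradient is also zeroed every step).
with torch.no_grad():
    fc.weight[i, k] = 0.0

# "Freezing" the connection: the weight keeps its current value,
# but zeroing its gradient after backward() means optimizer.step()
# will not change it (assuming no weight decay or momentum effects).
out = fc(torch.randn(4, 200)).sum()
out.backward()
fc.weight.grad[i, k] = 0.0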