Same Neural Network Output Regardless of Input(s)

The Problem: Let G be a complete graph on n vertices – so G has n choose 2 = n(n-1)/2 edges. For each edge in G, I want to use a neural network to assign it a color based on all of the previously assigned edge colors. There are t colors in use.
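
For concreteness, this is how I index the edges (just a quick sketch using itertools; the index-to-pair mapping is my own convention, and the network never sees the vertex pairs themselves):

import itertools

n = 5
edges = list(itertools.combinations(range(n), 2))   # all unordered vertex pairs
print(len(edges))   # n*(n-1)//2 = 10 edges for n = 5
print(edges[3])     # e.g. edge index 3 is the vertex pair (0, 4)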

The Approach: For this example, G is a complete graph on n = 5 vertices – so there are 10 edges – and we are using t = 3 colors (0, 1 and 2). Let input1 be a (10, 1) tensor of zeros and input2 be a (10, 3) tensor of zeros. All edges are indexed, and edge i corresponds to setting input1[i, :] = 1 with zeros elsewhere. input2 tells us which color each edge received: if edge i is colored with color c, then input2[i, c] = 1. We use a for loop to go through all the edges in G until input2 has a single 1 in each row.
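
To make the encoding concrete, here is what input1 and input2 would look like at one step (a small sketch; the particular edge and color below are only illustrative):

import torch

input1 = torch.zeros(10, 1)
input1[3] = 1.0        # edge 3 is the edge currently being colored
input2 = torch.zeros(10, 3)
input2[0, 2] = 1.0     # edge 0 was previously assigned color 2
# rows of input2 that are still all zeros correspond to uncolored edges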

The Code:

import numpy as np
import torch
import torch.nn as nn


class BNN(nn.Module):
    '''Basic neural network: Bilinear -> ReLU -> Linear -> Softmax over colors.'''
    def __init__(self, number_of_colors):
        super(BNN,self).__init__()
        self.layer1 = nn.Bilinear(in1_features=1,
                                  in2_features=number_of_colors,
                                  out_features=128
                                 )
        self.relu1 = nn.ReLU()
        self.layer2 = nn.Linear(in_features=128,
                                out_features=number_of_colors
                               )
        self.softmax = nn.Softmax(dim=1)

    def forward(self, edge, coloring):
        # edge: (num_edges, 1) one-hot selector for the current edge
        # coloring: (num_edges, number_of_colors) one-hot colors assigned so far
        x = self.layer1(edge, coloring)
        x = self.relu1(x)
        x = self.layer2(x)
        x = self.softmax(x)

        return x

r, s, t = (3, 3, 3)   # only t (the number of colors) is used below
size = 10             # number of edges in K_5
model = BNN(t)
model.eval()
coloring = torch.zeros(size, t, dtype=torch.float)   # input2: no edge colored yet

for i in range(size):
    blank_edge = torch.zeros(size, 1, dtype=torch.float)   # input1: one-hot for edge i
    blank_edge[i] = 1.0

    c = model(blank_edge, coloring)

    color = c[i, :].argmax().item()   # most probable color for edge i

    coloring[i, color] = 1.0          # record the chosen color in input2

The Issue: When I run the above code, every edge is assigned the same color, i.e. input2 ends up with one column that is all 1s. This happens on every run; the only thing that changes is which column is full of 1s. Initially I used a linear layer for layer1, but I've since swapped it for a bilinear layer; neither works. I've also tried adding dropout with various probabilities, to no avail. I've done some printing, and it seems the issue starts before or at the bilinear layer: the code returns the same answer after that layer no matter what input1/input2 combination I use. One thing to note is that when I run the code outside of a class declaration, like below:

def test_word(size,number_of_colors):
        
    empty_coloring = torch.tensor(np.zeros((size,number_of_colors)), dtype = torch.float)
    
    for i in range(size):
        edge = torch.tensor(np.zeros((size,1)), dtype = torch.float)
        edge[i] = 1.0

        # note: a brand-new Bilinear layer (with fresh random weights) is
        # instantiated on every loop iteration here
        x = nn.Bilinear(in1_features=1,
                        in2_features=number_of_colors,
                        out_features=64)(edge, empty_coloring)
        x = nn.ReLU()(x)
        x = nn.Linear(in_features=64,
                      out_features=number_of_colors)(x)
        x = nn.Softmax(dim=1)(x)

        color = x[i,:].argmax().item()

        empty_coloring[i,color] = 1.0
        
    return empty_coloring

I do get variability in the colors chosen. Does anybody know why this is the case, or how it can be rectified? I imagine the issue has something to do with the sparse nature of the inputs. I'm currently exploring embeddings, since my data is incredibly sparse.
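
For reference, this is roughly the kind of printing I mentioned above (a minimal sketch using a freshly-initialized bilinear layer with the same shapes as in BNN, not the layer inside the model itself):

import torch
import torch.nn as nn

layer = nn.Bilinear(in1_features=1, in2_features=3, out_features=128)
coloring = torch.zeros(10, 3)   # no edge colored yet, as at the start of the loop
for i in (0, 4, 9):
    edge = torch.zeros(10, 1)
    edge[i] = 1.0
    out = layer(edge, coloring)
    print(i, out[i, :5])   # the printed row is the same no matter which edge is switched on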