Extract certain user-defined entities from receipts using Graph Convolution network

sresasi · August 20, 2021, 1:13pm

My task is to extract certain user-defined entities from receipts using a Graph Convolution network (https://arxiv.org/pdf/1903.11279.pdf). I’m using the ICDAR-2019-SROIE dataset, which consists of 625 receipts in total. My user-defined entities are: [Address, Company, Date, Invoice_no, Total, Tax and Other (all remaining nodes are considered Other)]. Each receipt is treated as a fully connected graph of nodes and edges, where each text segment is a node. I have constructed the node features using CountVectorizer().

For example, if we have 3 nodes:

nodes =  ["this is an example of first node content ", "this is second node", "this is third node"]

dict = `{"an" : 1, "content" : 2, "example" : 3, "first" : 4 ,"is" : 5, "node" : 6,  "of" : 7,  "second" : 8, "third" : 9, "this" : 10}`

After applying pre-padding, the node features look like:

node 1 = [10, 5, 1, 3, 7, 4, 6, 2]
node 2 = [0, 0, 0, 0, 10, 5, 8, 6]
node 3 = [0, 0, 0, 0, 10, 5, 9, 6]
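
The encoding above can be reproduced with a few lines of plain Python. This is only a sketch: it assumes whitespace tokenization, an alphabetically sorted vocabulary indexed from 1 (with 0 reserved for padding), and padding to the length of the longest node. `build_vocab` and `encode_pre_padded` are hypothetical helper names, not part of the original code.

```python
def build_vocab(texts):
    """Map each distinct lowercase token to a 1-based index, in alphabetical order."""
    tokens = sorted({tok for text in texts for tok in text.lower().split()})
    return {tok: i for i, tok in enumerate(tokens, start=1)}


def encode_pre_padded(texts, vocab, pad_id=0):
    """Turn each text into a list of token ids, left-padded to a common length."""
    encoded = [[vocab[tok] for tok in text.lower().split()] for text in texts]
    max_len = max(len(ids) for ids in encoded)
    return [[pad_id] * (max_len - len(ids)) + ids for ids in encoded]


nodes = ["this is an example of first node content ", "this is second node", "this is third node"]
vocab = build_vocab(nodes)
features = encode_pre_padded(nodes, vocab)
for row in features:
    print(row)
```

With index 0 reserved for padding, an embedding layer over this vocabulary needs one more row than the number of distinct tokens, since the largest index equals the vocabulary size.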

Labels = {'address': array([1., 0., 0., 0., 0., 0., 0.]), 
          'company': array([0., 1., 0., 0., 0., 0., 0.]), 
          'date': array([0., 0., 1., 0., 0., 0., 0.]),  
          'invoice_no': array([0., 0., 0., 1., 0., 0., 0.]),  
          'other': array([0., 0., 0., 0., 1., 0., 0.]), 
          'tax': array([0., 0., 0., 0., 0., 1., 0.]), 
          'total': array([0., 0., 0., 0., 0., 0., 1.])}.

My class labels are imbalanced [72, 36, 36, 36, 1400, 36, 140]

I have created a small dataset of 1756 nodes from 36 receipts (all with the same layout) and split it into train, validation and test sets of 1400 nodes (from 26 receipts), 148 nodes (from 5 receipts) and 208 nodes (from 5 receipts) respectively. In total, there are 337 features. After padding, there are 9 features per node. I’ve created an adjacency matrix of shape 1756*1756, which is required for graph convolution while training the model.

I have not applied any normalization to the features, as I felt it was not required (I tried row normalization, but it made no difference).

This is my model:
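
Since each receipt is its own fully connected graph, one way to build a single adjacency matrix covering all nodes is as a block-diagonal matrix with one block of ones per receipt. The sketch below assumes the nodes are ordered receipt by receipt; `build_adjacency` and the node counts are made up for illustration and are not taken from the original code.

```python
import torch


def build_adjacency(nodes_per_receipt):
    """Block-diagonal adjacency: nodes are connected only within the same receipt."""
    blocks = [torch.ones(n, n) for n in nodes_per_receipt]
    return torch.block_diag(*blocks)


# Hypothetical example: two receipts with 3 and 2 text segments.
adj = build_adjacency([3, 2])
print(adj)
```

Because each block is filled with ones, every node is also connected to itself (a self-loop); nodes from different receipts get a 0 entry.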

nfeat = 337
nhid1 = 40
nhid2 = 30
nhid3 = 20
nhid4 = 10
embed = 50
nclass = 7
alpha = 0.2

import torch
import torch.nn as nn
import torch.nn.functional as F

class GAT(nn.Module):
    def __init__(self, nfeat, nhid1, nhid2, nhid3, nhid4, embed, nclass, alpha):
        super(GAT, self).__init__()
        self.embed1 = nn.Embedding(nfeat, embed)
        self.lstm1 = nn.LSTM(embed, nhid1, num_layers=1, bidirectional=True, batch_first=True)
        self.fc1 = nn.Linear(nhid1 * 2, nhid1)
        self.gc1 = GraphConvolutionLayer(nhid1, nhid2, alpha=alpha)
        self.gc2 = GraphConvolutionLayer(nhid2, nhid3, alpha=alpha)
        self.embed2 = nn.Embedding(nhid3,  nhid4)
        self.lstm2 = nn.LSTM(nhid4, nhid4, num_layers=1, bidirectional=True, batch_first=True) 
        self.fc2 = nn.Linear(nhid4*2, nclass)
        
    def forward(self, x, adj):
        x = self.embed1(x.long())
        x, _ = self.lstm1(x)
        x = x[:, -1, :]
        x = self.fc1(x)
        x = F.elu(self.gc1(x, adj))
        x = F.elu(self.gc2(x, adj))
        x = self.embed2(x.long())
        x, _ = self.lstm2(x)
        x = x[:, -1, :]
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)
        
class GraphConvolutionLayer(nn.Module):
    def __init__(self, in_features, out_features, alpha):
        super(GraphConvolutionLayer, self).__init__()
        self.in_features = in_features
        self.out_features = out_features
        self.alpha = alpha
        self.W = nn.Parameter(torch.empty(size=(in_features, out_features)))
        nn.init.xavier_uniform_(self.W.data, gain=1.414)
        self.a = nn.Parameter(torch.empty(size=(2 * out_features, 1)))
        nn.init.xavier_uniform_(self.a.data, gain=1.414)
        self.leakyrelu = nn.LeakyReLU(self.alpha)
         
    def forward(self, h, adj):
        wh = torch.mm(h, self.W)
        hij = self._concat_features(wh)
        alpha = self.leakyrelu(torch.matmul(hij, self.a).squeeze(2))
        alpha = torch.mm(alpha, adj)
        alphaij = F.softmax(alpha, dim=1)
        ti = torch.matmul(alphaij, wh)
        return ti
          
    def _concat_features(self, h):
        N = h.size()[0]
        h_repeated_in_chunks = h.repeat_interleave(N, dim=0)
        h_repeated_alternating = h.repeat(N, 1)
        all_combinations_matrix = torch.cat([h_repeated_in_chunks, h_repeated_alternating], dim=1)
        return all_combinations_matrix.view(N, N, 2 * self.out_features)
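
For comparison, graph attention layers commonly use the adjacency matrix as a mask on the attention scores rather than multiplying by it, so that the softmax only distributes weight over connected nodes. A minimal sketch of that masking step, assuming every node has at least one neighbour (for example a self-loop); `masked_attention` is an illustrative name, not part of the model above:

```python
import torch
import torch.nn.functional as F


def masked_attention(scores, adj):
    """Softmax over each row of `scores`, restricted to entries where adj is nonzero."""
    # Non-edges get -inf so they receive exactly zero weight after softmax.
    # A row with no edges at all would produce NaN, hence the self-loop assumption.
    masked = scores.masked_fill(adj == 0, float("-inf"))
    return F.softmax(masked, dim=1)


# Toy example: nodes 0 and 1 are connected, node 2 is only connected to itself.
scores = torch.tensor([[1.0, 2.0, 3.0],
                       [1.0, 1.0, 1.0],
                       [0.0, 0.0, 0.0]])
adj = torch.tensor([[1, 1, 0],
                    [1, 1, 0],
                    [0, 0, 1]])
attn = masked_attention(scores, adj)
print(attn)
```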


Sample training loop:
---------------------
def train(epoch):
    t = time.time()
    model.train()
    optimizer.zero_grad()
    output = model(features, adj)
    loss_train = F.nll_loss(output[idx_train], labels[idx_train])
    acc_train = accuracy(output[idx_train], labels[idx_train], 0)
    loss_train.backward()
    optimizer.step()

After about 5 epochs, the train and validation accuracy stop changing, whether I train for 10 or 100 epochs. However, the losses keep decreasing, from around 2 to 0.8.

True Labels : tensor([1, 0, 0, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 6,
    4, 4, 4, 6, 4, 4, 4, 6, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 3, 2, 4, 4, 4, 4,
    4, 4, 4, 4, 1, 0, 0, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
    4, 4, 4, 6, 4, 4, 4, 6, 4, 4, 4, 6, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 3, 4,
    2, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 1, 0, 0, 4, 4, 4, 4, 4, 4, 4, 4, 4,
    4, 4, 6, 4, 4, 6, 4, 4, 4, 6, 4, 4, 4, 6, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5,
    3, 4, 2, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 1, 0, 0, 4, 4, 4, 4, 4, 4, 4,
    4, 4, 4, 4, 6, 4, 4, 6, 4, 4, 4, 6, 4, 4, 4, 6, 4, 6, 4, 4, 4, 4, 4, 4,
    4, 5, 3, 4, 2, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4])
    
Predicted Labels : tensor([4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
    4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
    4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
    4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
    4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
    4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
    4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
    4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
    4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4])

Since 4 (other) is the dominant class, my model always predicts 4.

Any input on how to improve my model and achieve my task would be highly appreciated. Thanks a ton for reading this long post and trying to help me.

ptrblck · August 21, 2021, 3:35am

Since you are working with an imbalanced dataset, you could try a weighted criterion, or check whether a WeightedRandomSampler could be used to balance the classes in the drawn batches.
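
A weighted criterion could look roughly like the following sketch, which passes inverse-frequency class weights to `F.nll_loss` (matching the `log_softmax` output of the model above). The weighting formula is one common choice and an assumption here, not something prescribed in this thread; the random tensors only stand in for real model outputs and labels.

```python
import torch
import torch.nn.functional as F

# Per-class node counts from the original post (class 4 is "other").
class_counts = torch.tensor([72, 36, 36, 36, 1400, 36, 140], dtype=torch.float)

# Inverse-frequency weights: rare classes get a larger weight in the loss.
class_weights = class_counts.sum() / (len(class_counts) * class_counts)

torch.manual_seed(0)
log_probs = F.log_softmax(torch.randn(8, 7), dim=1)  # stand-in for model output
targets = torch.tensor([4, 4, 4, 4, 0, 6, 1, 4])     # stand-in class indices

# nll_loss rescales each sample's contribution by the weight of its target class.
loss = F.nll_loss(log_probs, targets, weight=class_weights)
print(class_weights)
print(loss.item())
```

Note that `F.nll_loss` expects class indices as targets, so one-hot label arrays would need to be converted first, for example with `argmax`.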

sresasi · August 21, 2021, 12:09pm

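Thanks for your response. I’ve tried using WeightedRandomSampler, with a separate sampler for each split: sampler_train, sampler_val and sampler_test.

import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

class_counts = [72, 36, 36, 36, 1400, 36, 140]   # 7-labels count in 36 receipts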
num_samples = sum(class_counts)

class_weights = [num_samples/class_counts[i] for i in range(len(class_counts))]
weights = [class_weights[labels[i]] for i in range(int(num_samples))]

train_features, val_features, test_features = features[0:1266], features[1266:1496], features[1496:]
train_labels, val_labels, test_labels = labels[0:1266], labels[1266:1496], labels[1496:]

sampler_train = WeightedRandomSampler(torch.DoubleTensor(weights[0:1266]), int(1266))
sampler_val = WeightedRandomSampler(torch.DoubleTensor(weights[1266:1496]), int(230))
sampler_test = WeightedRandomSampler(torch.DoubleTensor(weights[1496:]), int(260))

train_data = TensorDataset(torch.from_numpy(train_features), torch.from_numpy(train_labels))
val_data = TensorDataset(torch.from_numpy(val_features), torch.from_numpy(val_labels))
test_data = TensorDataset(torch.from_numpy(test_features), torch.from_numpy(test_labels))

train_loader = DataLoader(train_data, batch_size=300, sampler=sampler_train)
val_loader = DataLoader(val_data, batch_size=115, sampler=sampler_val)
test_loader = DataLoader(test_data, batch_size=130, sampler=sampler_test)

However, I didn’t see any change in the model prediction.