How to apply the pretrained model on testing data and get predictions for each instance?

Dear All,
I am new to Python and PyTorch, and I don’t have a background in mathematics. Recently, for a task of predicting four scores for a pair of sentences through regression, I tried to implement a model with PyTorch. My problems now are about testing after the training steps.
Question 1: When testing, how do I get a prediction for each test instance while testing also runs in batches and epochs? During training, I used MSELoss to measure the model performance, and it works well for reporting the loss for each epoch. However, I need to obtain the correlation (say, Pearson, Spearman, etc.) between the predicted scores and those of human annotators. I am puzzled because there are around 500 test instances in total, but at test time the model runs exactly 10 epochs in batches, as it did in training, and the number of instances it has processed does not match the total number of test instances until it finishes.
Question 2: Is there a quicker way to run testing in just one go? I mean, can we do something similar to cross-validation in conventional (non-deep-learning) approaches?
Attached below is part of my testing module and the main function (the commented-out lines are what I tried for the correlation, without success):

Many thanks. Your timely help and friendly input will be much appreciated.

def test(model, vocab, args):
    data_loader = DataLoader(vocab, args.test_file)
    criterion = nn.MSELoss()
    loss = [0., 0., 0., 0.]
    tot_size = 0
    #test_orgs = [[],[],[],[]]
    #test_preds = [[],[],[],[]]
    for input, target, score in data_loader.get_batches(args.batch_size, shuffle = False):
        batch_size = len(input)
        tot_size += batch_size
        input = Variable(torch.LongTensor(input))
        target = Variable(torch.LongTensor(target))
        golden = Variable(torch.FloatTensor(score))
        if use_cuda:
            input = input.cuda()
            target = target.cuda()
            golden = golden.cuda()
        preds = model(input, target, batch_size)
        for i, (pred, max_score) in enumerate(zip(preds, [35, 25, 25, 15])):
            # multiply each prediction by the maximum of its score range before comparing
            loss[i] += batch_size * criterion(pred * max_score, golden[:, i]).data[0]
            # ...max_score) #, golden[:, i]
    result = [ l/tot_size for l in loss ]
    print(tot_size)
    print('[Test MSE]: Usefulness: %5.3f; Terminology: %5.3f; Idiomatic Writing: %5.3f; Target Mechanics: %5.3f ' % (result[0], result[1], result[2], result[3]))
    return result #,test_orgs,

def main(args):
    vocab = Vocab(args.train_file, args.src_emb, args.tgt_emb)

    model = AttentionRegression(vocab, args.emb_size, args.feature_size, args.window_size, args.dropout, args.hidden_size, args.n_layers, args.attention_size)

    if use_cuda:
        model.cuda()  # assumed: the body of this branch is missing in the post

    # starting training, comment this line if you are loading a pretrained model
    #train(model, vocab, args)

    # or load a pretrained model, comment this if you are training a new model
    model = torch.load('', map_location=lambda storage, loc: storage)
    # model.test()
    test(model, vocab, args)
    #test_orgs, test_preds = test(model,vocab, args)[1], test(model,vocab,args)[2] (where I attempted to try but no luck)
    print('---------------------start evaluating on test data-----------------------')
    print('[Model Hyperparameters: ] \nEmbedding Size: %5d; Feature Size: %5d; Window Size: %3d; Dropout: %5.3f; Hidden Layer Size: %5d; Layers: %3d; Attention Size: %3d'%(args.emb_size, args.featur

What is data_loader.get_batches doing?
Usually you would iterate over your data using:

for batch_idx, (data, target) in enumerate(data_loader):

Is it your own implementation, or which version are you using?

Hi, it’s my own implementation.
Here is the code:
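import codecs
import random

class DataLoader(object):
    def __init__(self, vocab, fname):
        self.src_data = []
        self.tgt_data = []
        self.scores = []
        self.vocab = vocab
        # The opening call was lost in the post; codecs.open is assumed from the ('r', 'utf-8') arguments.
        with codecs.open(fname, 'r', 'utf-8') as f:
            for line in f.readlines():
                info = line.strip().split('\t')
                assert len(info) == 6, line
                src = info[0].split()
                tgt = info[1].split()
                scores = [ float(x) for x in info[2:]]
                # Assumed: the post does not show the parsed fields being stored.
                self.src_data.append(src)
                self.tgt_data.append(tgt)
                self.scores.append(scores)

    def get_batches(self, batch_size, shuffle = True):
        idx = list(range(len(self.src_data)))
        if shuffle:
            random.shuffle(idx)  # assumed: the body of this branch is missing in the post
        cur_size = 0
        input, target, score = [], [], []
        for _id in sorted(idx, key = lambda x: len(self.src_data[x])):
            cur_size += len(self.src_data[_id])
            # Assumed: collect the current example (these lines are missing in the post).
            input.append(self.src_data[_id])
            target.append(self.tgt_data[_id])
            score.append(self.scores[_id])
            if cur_size  >= batch_size:
                cur_size  = 0
                seq_len = max(len(t) for t in input)
                input = [ self.vocab.src2id(t)+ [0]*(seq_len - len(t)) for t in input ]
                seq_len = max(len(t) for t in target)
                target = [ self.vocab.tgt2id(t) + [0]*(seq_len - len(t)) for t in target ]
                yield input, target, score
                input, target, score = [], [], []  # assumed: start a fresh batch after yielding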

OK, I would suggest subclassing the Dataset class for your own implementation of loading and preprocessing the data.
Since you are creating your own DataLoader, you are losing the functionality of the built-in DataLoader, which provides multiprocessing, shuffling, etc.

I couldn’t follow your code completely, but have a look at the Data Loading and Processing Tutorial, which gives you all the necessary information on how to re-implement your code.
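
As a rough sketch of that suggestion, assuming the sentence pairs have already been converted to lists of token ids: a Dataset subclass plus a padding collate_fn lets the built-in DataLoader handle batching and shuffling. The names PairDataset, pad and pad_collate and the toy data are hypothetical and not from this thread.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class PairDataset(Dataset):
    """Hypothetical dataset of (source ids, target ids, scores) triples."""

    def __init__(self, src_ids, tgt_ids, scores):
        assert len(src_ids) == len(tgt_ids) == len(scores)
        self.src_ids = src_ids
        self.tgt_ids = tgt_ids
        self.scores = scores

    def __len__(self):
        return len(self.src_ids)

    def __getitem__(self, index):
        return self.src_ids[index], self.tgt_ids[index], self.scores[index]


def pad(seqs, pad_id=0):
    """Right-pad variable-length id lists with pad_id and stack them into a LongTensor."""
    max_len = max(len(s) for s in seqs)
    return torch.tensor([s + [pad_id] * (max_len - len(s)) for s in seqs], dtype=torch.long)


def pad_collate(batch):
    """Turn a list of samples into padded tensors; DataLoader calls this once per batch."""
    src, tgt, scores = zip(*batch)
    return pad(list(src)), pad(list(tgt)), torch.tensor(scores, dtype=torch.float)


# Toy data: three sentence pairs with four scores each.
dataset = PairDataset(
    src_ids=[[1, 2, 3], [4, 5], [6]],
    tgt_ids=[[7, 8], [9], [10, 11, 12, 13]],
    scores=[[30.0, 20.0, 20.0, 10.0], [35.0, 25.0, 25.0, 15.0], [10.0, 5.0, 5.0, 5.0]],
)
loader = DataLoader(dataset, batch_size=2, shuffle=False, collate_fn=pad_collate)
batches = list(loader)  # every instance appears exactly once, including the last short batch
```

Unlike a hand-written batching generator, the built-in DataLoader also returns the final, smaller batch by default, so the number of processed instances matches the size of the test set.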

To your first question:
If you are using the PyTorch DataLoader, just specify shuffle=False and iterate over your test set once. The batch_size can be > 1, but you would want to append the outputs of each batch to a list.
Your model should not run for more than one epoch on the test set, because it would just repeat the predictions.
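
A minimal sketch of that single pass, assuming a model that returns one row of four scores per instance. The helpers collect_predictions, pearson and spearman and the stand-in linear model are illustrative only, not the AttentionRegression code from the question. Predictions are accumulated batch by batch, and the correlation is computed once over all instances.

```python
import numpy as np
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def collect_predictions(model, loader):
    """Run one pass over the loader and return (predictions, gold) as NumPy arrays."""
    model.eval()                       # switch off dropout and similar layers
    preds, golds = [], []
    with torch.no_grad():              # no gradients are needed for testing
        for features, gold in loader:
            preds.append(model(features))
            golds.append(gold)
    return torch.cat(preds).numpy(), torch.cat(golds).numpy()


def pearson(x, y):
    """Pearson correlation of two 1-D sequences."""
    return float(np.corrcoef(x, y)[0, 1])


def spearman(x, y):
    """Spearman correlation as Pearson on ranks (ties are not averaged in this sketch)."""
    def rank(a):
        return np.argsort(np.argsort(a)).astype(float)
    return pearson(rank(x), rank(y))


# Toy stand-in data: 500 test instances, 8 features, 4 scores per instance.
torch.manual_seed(0)
features = torch.randn(500, 8)
model = nn.Linear(8, 4)
gold = model(features).detach() + 0.1 * torch.randn(500, 4)
loader = DataLoader(TensorDataset(features, gold), batch_size=32, shuffle=False)

preds, golds = collect_predictions(model, loader)
for i in range(4):                     # one correlation per score dimension
    print(i, pearson(preds[:, i], golds[:, i]), spearman(preds[:, i], golds[:, i]))
```

If SciPy is available, scipy.stats.pearsonr and scipy.stats.spearmanr could replace the two helper functions and would also handle ties.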


I have a similar problem: I have manually passed batches (size = 128) of sentences to my model for training and testing, and now I am not sure how to use my model to evaluate a single sentence. I am converting each sentence to an image-like representation and convolving over it.

My convolution layer receives batches in the following sequence:

1 - Batch size x length of sentence: 128 x 9 (number of words)
2 - Batch size x length of sentence x embedding size (each word is embedded to dim = 50)
3 - Unsqueeze at dim = 1 to add a channel dimension
4 - Finally, CNN layer 1 sees the data as 128 x 1 x 9 x 50, and the model produces an output of dimensions 128 x 9 x 9, i.e. a 9 x 9 output for each of the 128 sentences in the batch
5 - I did the same for testing, where the model made predictions over unseen data with the same batch size.

But for the sake of a user interface, I want it to give me a similar output for a single sentence instead of a batch of 128. The problem is that my model expects inputs in batches of 128 and produces outputs accordingly.

What is the smallest change my code would need?

Apologies upfront: I am a beginner. I have just learnt how to create models, but I have never tested one before.

The model should not be dependent on the batch size and should take any batch size.
It’s just important to add a batch dimension even to a single sample.
If that’s not the case, could you post your model and training code?
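
To illustrate, here is a small sketch with a toy model. The TinyTagger class and its sizes are made up for the example and are not the model discussed in this thread. A single sentence of 9 word indices is given a batch dimension with unsqueeze(0), so the model sees a batch of size 1.

```python
import torch
from torch import nn


class TinyTagger(nn.Module):
    """Toy per-word tagger: an embedding followed by a linear layer."""

    def __init__(self, vocab_size=100, embedding_size=50, num_tags=9):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embedding_size)
        self.linear = nn.Linear(embedding_size, num_tags)

    def forward(self, sent):
        # sent: (batch, words) -> output: (batch, words, num_tags)
        return self.linear(self.embed(sent))


model = TinyTagger()
model.eval()

batch = torch.randint(0, 100, (128, 9))         # a batch of 128 sentences, 9 words each
sentence = torch.randint(0, 100, (9,))          # a single sentence: shape (9,)

with torch.no_grad():
    batch_out = model(batch)                    # shape (128, 9, 9)
    single_out = model(sentence.unsqueeze(0))   # add batch dim: (1, 9) -> (1, 9, 9)

tags = single_out.squeeze(0).argmax(dim=1)      # one predicted tag index per word
```

The same weights handle both calls; only the leading dimension of the input changes.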



Gear up to witness some very poorly written lines of code :P
Here’s my model:

class CNN_NER(nn.Module): 
    def __init__(self,vocab_size,embedding_size):
        # vocab_size, embedding_size, window_size, hidden_size, output_size

        super(CNN_NER, self).__init__()
        self.embed = nn.Embedding(vocab_size, embedding_size)
        self.cnn1 = nn.Conv2d(in_channels=1,padding=2,out_channels=120,kernel_size=(11,54),stride=1,dilation=1)
        self.cnn2 = nn.Conv2d(in_channels=1,padding=2,out_channels=120,kernel_size=(10,54),stride=1,dilation=1)
        self.cnn3 = nn.Conv2d(in_channels=1,padding=2,out_channels=120,kernel_size=(9,54),stride=1,dilation=1)
#         self.cnn4 = nn.Conv2d(in_channels=100,padding=1,out_channels=100,kernel_size=(1,3),stride=1,dilation=1)
#         self.cnn5 = nn.Conv2d(in_channels=100,padding=1,out_channels=100,kernel_size=(1,4),stride=1,dilation=1)
#         self.cnn6 = nn.Conv2d(in_channels=100,padding=1,out_channels=100,kernel_size=(1,6),stride=1,dilation=1)
        self.relu = nn.ReLU()
        self.maxpool3 = nn.MaxPool2d(kernel_size=(1,3))
        self.maxpool4 = nn.MaxPool2d(kernel_size=(1,4))
        self.maxpool5 = nn.MaxPool2d(kernel_size=(1,5))
        self.linear = nn.Linear(40,9)
        self.softmax = nn.LogSoftmax()
        self.dropout = nn.Dropout(0.2)
    def forward(self,sent_grams,is_training):
        embeds = self.embed(sent_grams)
        embeds = embeds.unsqueeze(1)
        l1 = self.cnn1(embeds)
        l1 = self.relu(l1)
        l1 = l1.squeeze(3)
        l1 = self.maxpool3(l1)
        l2 = self.cnn2(embeds)
        l2 = self.relu(l2)
        l2 = l2.squeeze(3)
        l2 = self.maxpool4(l2)
        l3 = self.cnn3(embeds)
        l3 = self.relu(l3)
        l3 = l3.squeeze(3)
        l3 = self.maxpool5(l3)
        l4 = torch.cat((l1,l2,l3),1)
        l4 = l4.view(l4.size(0),9,40)
        #print('quished before linear layer',l4.size())
        l5 = self.linear(l4)
        #l5 = self.softmax(l5)
        return l5

The loss and optimizers are:

'''define loss and optimizer'''
loss_function = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)


now = time.time()
for epoch in range(EPOCH):
    losses = []
    for i, batch in enumerate(getBatch(BATCH_SIZE, train_data)):
        x, y = zip(*batch)  # assumed: this unpacking line is missing from the post
        if len(x) == 128:
            inputs = torch.cat([prepare_sequence(sent, word2index).view(1, -1) for sent in x])
            targets = torch.cat([prepare_tag(tag, tag2index).view(1,-1) for tag in y])
            #print('input size',inputs.size())
            #print('target size',targets.size())
            preds = model(inputs, is_training=True)
            #print('preds,targets before loss',preds.size(),targets.size())
            loss = loss_function(preds, targets)
            # Assumed: the post does not show the update step or how losses is filled.
            losses.append(loss.item())
            model.zero_grad()
            loss.backward()
            optimizer.step()
            if i % 1000 == 0:
                total = 0.0
                correct = 0.0
                # separate names so the outer loop's i and batch are not overwritten
                for test_batch in getBatch(128, test_data):
                    test_x, test_y = zip(*test_batch)  # assumed, as above
                    inputs = torch.cat([prepare_sequence(sent, word2index).view(1, -1) for sent in test_x])
                    targets = torch.cat([prepare_tag(tag, tag2index).view(1,-1) for tag in test_y])
                    preds = model(inputs, is_training=False)
                    out = torch.max(preds,1)[1]
                    total += targets.size(0)*targets.size(1)
                    correct += (out.cpu()==targets.cpu()).sum()
                    acc = correct.double()/total * 100
                print("[%d/%d] mean_loss : %0.2f accuracy : %0.2f" %(epoch, EPOCH, np.mean(losses),acc))
                losses = []
print('time taken in seconds: {} seconds'.format(time.time()-now))


prepare_sequence maps a sequence of words to the sequence of their indices in my vocabulary:
prepare_sequence(['EU', 'rejects', 'German', 'call', 'to', 'boycott', 'British', 'lamb', '.'], word2index)

tensor([ 4029, 6314, 6213, 4033, 15398, 2473, 14625, 9766, 19587],

prepare_tag(['O\n', 'O\n', 'B-ORG\n', 'O\n', 'B-MISC\n'],tag2index)
tensor([8, 8, 5, 8, 6], device='cuda:0')

We concatenate 128 such sentences, embed them, and pass them into our model along with the 'tags' (labels) for training. Each word becomes a 50-dimensional vector after embedding.
The same is done for testing.

You were right; I was fretting unnecessarily. Please don’t spend any more of your time on this stupid question of mine.
But thanks a lot for your prompt response!