First access to a torch tensor is very slow (strange behavior)

Update: My question was too long, so I looked for a shorter way to explain it :D. Thanks for accepting my question.

Hi, I was training my model and noticed something strange. I rewrote my code to make it easier to read; see the code below:

matched = torch.LongTensor(8732).to("cuda")
for epoch in range(config.start_epoch, config.end_epoch):
    for imgs, targets in dataloader:
        
        imgs = imgs.to(config.device)
        targets = [target.to(config.device) for target in targets]

        with torch.set_grad_enabled(False):
            loc, conf = net(imgs)  

            for i in range(5):
                t = time.time()
                matched[0] = -1
                print(time.time() - t)

        print("finished 1 iteration!")

I created the matched tensor and passed the images through my neural network. After that, I measured how long it takes to assign just one element of matched:

finished 1 iteration!
0.20601940155029297
0.0
0.0
0.0
0.006075859069824219
finished 1 iteration!
0.21062469482421875
0.0
0.0
0.0
0.0
finished 1 iteration!

You can see that the first assignment to matched[0] takes a long time (about 0.2 s), but when I don't pass the images through the neural network, it takes much less time:

matched = torch.LongTensor(8732).to("cuda")
for epoch in range(config.start_epoch, config.end_epoch):
    for imgs, targets in dataloader:
        
        imgs = imgs.to(config.device)
        targets = [target.to(config.device) for target in targets]

        with torch.set_grad_enabled(False):
            # loc, conf = net(imgs)  # images are not passed into the network

            for i in range(5):
                t = time.time()
                matched[0] = -1
                print(time.time() - t)

        print("finished 1 iteration!")
   

The result is:

0.0
0.0
0.0010001659393310547
0.0
0.0
finished 1 iteration!
0.0
0.0
0.0
0.0
0.0
finished 1 iteration!

I don't understand why this happens. Could someone please explain the reason?

Note: config.device = "cuda" in my code.
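
One thing that may be worth checking: CUDA operations in PyTorch are queued asynchronously by default, so time.time() around a single statement can end up measuring the wait for earlier queued work (possibly the forward pass) rather than the assignment itself. Whether that explains the numbers above is only an assumption. A minimal sketch of timing with explicit synchronization follows; the helper name timed_assign is hypothetical, torch.zeros is used instead of an uninitialized LongTensor, and the code falls back to the CPU when no GPU is available:

```python
import time

import torch


def timed_assign(tensor, n_repeats=5):
    """Time `tensor[0] = -1`, synchronizing before and after each measurement."""
    use_cuda = tensor.is_cuda
    timings = []
    for _ in range(n_repeats):
        if use_cuda:
            # Wait for any previously queued GPU work (e.g. a forward pass)
            # so it is not counted in this measurement.
            torch.cuda.synchronize()
        start = time.perf_counter()
        tensor[0] = -1
        if use_cuda:
            # Wait for the assignment itself to finish on the GPU.
            torch.cuda.synchronize()
        timings.append(time.perf_counter() - start)
    return timings


# Assumption: same size as `matched` in the post; the device is chosen at runtime.
device = "cuda" if torch.cuda.is_available() else "cpu"
matched = torch.zeros(8732, dtype=torch.long, device=device)
print(timed_assign(matched))
```

If the first measurement drops to roughly the same value as the others once the synchronize calls are in place, the 0.2 s was most likely time spent waiting for the network's queued kernels rather than for the assignment.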