Unable to use iter and next on dataloader to process all batches

I was experimenting with the iter and next functionality to iterate through my dataloader.

When I train the model using iter, it seems that I am only processing a single batch of the trainloader in each epoch, as shown in the log below:

torch.Size([16, 3, 512, 512])
Epoch : 1 Train Loss : 0.001343 
torch.Size([16, 3, 512, 512])
Epoch : 2 Train Loss : 0.002819 
torch.Size([16, 3, 512, 512])
Epoch : 3 Train Loss : 0.004004 
torch.Size([16, 3, 512, 512])
Epoch : 4 Train Loss : 0.005313 
torch.Size([16, 3, 512, 512])
Epoch : 5 Train Loss : 0.006345 
torch.Size([16, 3, 512, 512])
Epoch : 6 Train Loss : 0.007257 
torch.Size([16, 3, 512, 512])
Epoch : 7 Train Loss : 0.008262 
torch.Size([16, 3, 512, 512])
Epoch : 8 Train Loss : 0.009080 
torch.Size([16, 3, 512, 512])
Epoch : 9 Train Loss : 0.010034 
torch.Size([16, 3, 512, 512])
Epoch : 10 Train Loss : 0.011135 

My training code is as follows:

epochs_a = 10
criterion = nn.L1Loss()
optimizer = torch.optim.Adam(model.parameters(), lr=lr)
iter_source = iter(train_loader)
train_loss = 0.0

for i in range(epochs_a):
    model.train()
    optimizer.zero_grad()
    images = next(iter_source)   # pull one batch from the iterator
    image = images[0].to(device)
    print(image.shape)
    labels = images[1].to(device)
    logits = model(image)
    loss = criterion(logits, labels)
    loss.backward()
    optimizer.step()
    train_loss += loss.item()
    print("Epoch : {} Train Loss : {:.6f} ".format(i + 1, train_loss / len(train_loader)))

But when I use a simple enumerate or tqdm to iterate through the trainloader, as shown in the code below,

criterion = nn.L1Loss()
optimizer = torch.optim.Adam(model.parameters(), lr=lr)

def train_batch_loop(model, trainloader):
    train_loss = 0.0

    for images, labels in tqdm(trainloader):
        # move the data to the device
        images = images.to(device)
        labels = labels.to(device)

        logits = model(images)
        loss = criterion(logits, labels)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        train_loss += loss.item()

    return train_loss / len(trainloader)

epochs_a = 10
for i in range(epochs_a):
    model.train()
    avg_train_loss = train_batch_loop(model, train_loader)
    print("Epoch : {} Train Loss : {:.6f} ".format(i + 1, avg_train_loss))

Then I am able to go through all 400 batches, and the training log looks like this:
[train_log screenshot]

So what is the difference between the two pieces of code, and how can I use iter and next to go through all 400 batches rather than just one single batch?

Both approaches should work as seen here:

import torch
from torch.utils.data import TensorDataset, DataLoader

dataset = TensorDataset(torch.randn(100, 1))
train_loader = DataLoader(dataset, batch_size=10)

iter_source = iter(train_loader)

for i in range(len(train_loader)):
    images = next(iter_source)
    print('iter {}, shape {}'.format(i, images[0].shape))

for i, images in enumerate(train_loader):
    print('iter {}, shape {}'.format(i, images[0].shape))
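
If you want to drive the loop manually with iter and next, you need one next() call per batch, i.e. an inner loop over the batches, and a fresh iterator at the start of every epoch. As an untested sketch adapted from your posted loop (reusing your model, criterion, optimizer, device, and train_loader):

for epoch in range(epochs_a):
    model.train()
    train_loss = 0.0
    iter_source = iter(train_loader)      # recreate the iterator every epoch
    for _ in range(len(train_loader)):    # one next() call per batch
        images, labels = next(iter_source)
        images = images.to(device)
        labels = labels.to(device)

        optimizer.zero_grad()
        logits = model(images)
        loss = criterion(logits, labels)
        loss.backward()
        optimizer.step()

        train_loss += loss.item()
    print("Epoch : {} Train Loss : {:.6f} ".format(epoch + 1, train_loss / len(train_loader)))

Each next() call advances the same iterator by one batch, so calling it only once per epoch consumes a single batch per epoch.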

Thanks, but why is the error (loss) in the first approach different from the error in the second one in my code?
Is it because it's only considering one batch at a time?

I don’t know and would need an executable code snippet to reproduce and debug the issue.
Using random data works, so I guess your dataset length is not what you would expect.
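
As a quick check (assuming your loader is called train_loader), you could compare the number of samples and the number of batches directly:

print(len(train_loader.dataset))  # number of samples in the dataset
print(len(train_loader))          # number of batches the loader will yield

len(train_loader) should be roughly len(dataset) / batch_size.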

The length of the trainloader is 400, and when I use tqdm it sort of works. Shouldn't it work for iter also?
Would you like the entire code snippet to reproduce the error?

Yes, it should work and also does work as seen in my code snippet.
Yes, please post a minimal, executable code snippet that reproduces the issue.

I am actually using a custom dataset which I can't share, but I can share the train loader and other relevant details.
In all, the dataset has 8000 images, where each image has shape 512x512x3 and is associated with a label having 3 values [x, y, z].
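
In case it helps, something along these lines (with random tensors standing in for my real images and labels) mirrors the shapes I described:

import torch
from torch.utils.data import TensorDataset, DataLoader

# random stand-in data matching my shapes; using fewer samples here so it
# fits in memory (the real dataset has 8000 images)
images = torch.randn(64, 3, 512, 512)   # each image is 3x512x512
labels = torch.randn(64, 3)             # each label holds 3 values [x, y, z]

dataset = TensorDataset(images, labels)
train_loader = DataLoader(dataset, batch_size=16, shuffle=True)
print(len(dataset), len(train_loader))  # number of samples, number of batches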