Batch size in CNN

import time
import torch

# model, criterion, optimizer and train_loader are assumed to be defined earlier

start_time = time.time()

epochs = 1

train_losses = []
test_losses = []
train_correct = []
test_correct = []

for i in range(epochs):
    trn_corr = 0
    tst_corr = 0

    # Run the training batches
    for b, batch in enumerate(train_loader):
        b += 1
        X_train = batch['image'].to('cuda', dtype=torch.float32)
        y_train = batch['landmarks'].to('cuda', dtype=torch.float32).squeeze()

        # Apply the model
        y_pred = model(X_train).squeeze()
        loss = criterion(y_pred, y_train)
        print(b, loss.item())

        # Tally the number of correct predictions
        #predicted = torch.max(y_pred.data, 1)[0][0]
        #predicted = predicted.to('cuda', dtype=torch.float32).squeeze()
        #batch_corr = (predicted == y_train).sum()
        #trn_corr += batch_corr

        # Update parameters
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Print interim results
        if b % 100 == 0:
            print(f'epoch: {i:2}  batch: {b:4} [{10*b:6}/50000]  loss: {loss.item():10.8f}')

    # Store the last batch loss of the epoch; .item() detaches it from the graph
    train_losses.append(loss.item())

print(f'\nDuration: {time.time() - start_time:.0f} seconds')  # print the time elapsed

So when I run it, it goes until epoch: 0 batch: 700 [ 7000/50000] loss: 835.97753906 and then prints:
701 834.0040893554688
702 798.3275756835938
703 679.1052856445312
704 744.6819458007812
705 633.1027221679688
706 860.9313354492188
707 653.2901611328125
708 839.19677734375
709 762.4442749023438
710 733.4808349609375
711 831.7756958007812
712 768.0433349609375
713 533.0064086914062
714 744.097900390625
715 621.6771240234375
716 677.7670288085938
717 631.5947265625
718 632.5914306640625
719 727.1867065429688
720 472.8675231933594
721 777.7984619140625
722 553.052734375
723 762.1932983398438
724 622.1517944335938
725 637.0252075195312
726 677.1328125
727 926.3015747070312
728 671.0360107421875
729 1127.0623779296875
730 777.3078002929688
731 527.7666015625
732 576.8959350585938
733 821.3842163085938
734 571.5703125
735 539.9409790039062
736 673.8453369140625
737 636.5660400390625
738 692.1788940429688
739 672.10595703125
740 914.494140625
741 783.8777465820312
742 793.4269409179688
743 858.7411499023438
744 886.5870361328125
745 526.9144897460938
746 862.1968994140625
747 772.9779663085938
748 503.57421875
749 744.8223876953125
750 839.1438598632812
751 544.6222534179688
752 824.1895751953125
753 826.6942749023438
754 623.4790649414062
755 761.2865600585938
756 593.4060668945312
757 785.65478515625
758 594.8136596679688
759 881.4382934570312
760 843.9218139648438
761 744.2888793945312
762 1100.8704833984375
763 950.7271728515625
764 1016.8292846679688
765 752.9169311523438
766 848.7357788085938
767 692.6583862304688
768 564.8028564453125
769 151689.140625
770 618.965576171875
771 443.89501953125
772 707.8801879882812
773 975.7994384765625
774 1512.253173828125
775 905.3583374023438
776 540.3995361399438
777 926.8759765625
778 816.6861572265625
779 727.527587890625
780 1009.4464721679688
781 736.8145751953125
782 852.1244506835938
783 792.7083129882812
784 710.3810424804688
785 537.8117065429688
786 790.4335327148438
787 669.735595703125
788 746.8652954101562
789 817.0357666015625
790 601.0955200195312

Duration: 56 seconds
and then it stops. Why isn't it going any further than that?

If I understand your code correctly, the duration print statement is executed after the training loop is done: with epochs = 1, the loop runs through all batches of train_loader once (batch 790 being the last one in your output), prints the elapsed time, and exits.
If that’s the case, the code seems to work fine.
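
If you want training to continue for more passes over the data, you can raise epochs; the outer loop then repeats the batch loop, and the duration is still printed only at the very end. A minimal sketch, assuming the same model, criterion, optimizer, and train_loader objects from your code (the per-epoch bookkeeping here is just illustrative):

```python
import time
import torch

epochs = 5  # more than one pass over the data
train_losses = []

start_time = time.time()
for epoch in range(epochs):
    running_loss = 0.0
    num_batches = 0

    for batch in train_loader:
        X_train = batch['image'].to('cuda', dtype=torch.float32)
        y_train = batch['landmarks'].to('cuda', dtype=torch.float32).squeeze()

        y_pred = model(X_train).squeeze()
        loss = criterion(y_pred, y_train)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        running_loss += loss.item()  # .item() avoids keeping the graph alive
        num_batches += 1

    # average training loss for this epoch
    epoch_loss = running_loss / num_batches
    train_losses.append(epoch_loss)
    print(f'epoch: {epoch:2}  avg loss: {epoch_loss:10.8f}')

print(f'\nDuration: {time.time() - start_time:.0f} seconds')
```

You can also check how many batches one epoch contains with len(train_loader); if that prints 790, the run you posted did complete the full epoch.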

PS: you can add code snippets by wrapping them in three backticks ```, which would make debugging a lot easier. :wink:


Thanks for the quick reply.