ImageFolder return

Hi everyone, second post here!
I got this error:

TypeError: linear(): argument 'input' (position 1) must be Tensor, not int

This is my code - it runs pretty slow

import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader

import numpy as np
from PIL import Image
import time

start = time.time()
transform = transforms.Compose([transforms.Resize(255), transforms.CenterCrop(224), transforms.ToTensor(),]) 
# This gives us some transforms to apply later on

training = torchvision.datasets.ImageFolder(root = "training_set", transform = transform)
print(training)

train_dataloader = DataLoader(dataset = training, shuffle = True, batch_size = 32)
print(train_dataloader)




print("Before NN creation",start-time.time())
# Creating the model
class NeuralNetwork (nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.Linear_Stack = nn.Sequential(
            nn.Linear(224*224, 40000), 
            nn.ReLU(),
            nn.Linear(40000, 10000),
            nn.ReLU(),
            nn.Linear(10000, 75),
            nn.ReLU(),
            nn.Linear(75, 5),
            nn.ReLU(),
            nn.Linear(5, 1),
            nn.LogSoftmax(dim=1)
        )
    def forward (self, x):
        logits = self.Linear_Stack(x)
        return logits

model = NeuralNetwork()
print(model)

epochs = 2
lr = 0.001
loss = nn.CrossEntropyLoss()
optim = torch.optim.SGD(model.parameters(), lr = lr)
print(loss, optim)  

#creating the main loop
def main(loader, model, loss, optim):
    size = len(loader.dataset)
    for (images, labels) in enumerate(loader):
        # Prediction
        pred = model.forward(images)
        loss = loss(pred, target)
        #backprop
        optim.zero_grad()
        loss.backward()
        optim.step()
        print(loss)

for x in range(epochs):
    main(train_dataloader, model, loss, optim)
print(start-time.time())

I guess my main question is: am I using ImageFolder correctly? (I know it loads, because I get the correct number of datapoints when I print it.) Is my code even succinct? Also, I couldn't find anything on what the variable names for the outputs are (like X, y or Sample, Target). I think that answering these questions will help me solve my problem. Reply if you have anything else that you think will be helpful and I will take it on board.

enumerate(loader) will give you something like batch_idx, (data, target) back, so it is likely you are passing the batch_idx in place of the image. See examples/main.py at 01539f9eada34aef67ae7b3d674f30634c35f468 in pytorch/examples on GitHub.

Thanks. What is batch_idx ?

enumerate will give you an index for whatever you are iterating over, so batch_idx is simply the batch number out of the total number of batches in an epoch.

Thank you. How can I go about implementing this in my code? Could you edit it for me to review? I am a little confused about batch_idx.

Here is an example of enumerate. You need to add another variable to your loop which is batch index in this case.
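To make the batch_idx point concrete, here is a minimal sketch. It uses a small random TensorDataset as a stand-in for the ImageFolder data (that dataset and its sizes are assumptions for illustration, not the original training_set), and only shows what each loop variable holds.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data (an assumption): 10 fake RGB images of 8x8 pixels with integer labels.
images = torch.rand(10, 3, 8, 8)
labels = torch.randint(0, 2, (10,))
loader = DataLoader(TensorDataset(images, labels), batch_size=4)

seen = []
# enumerate() yields (index, item); each item from the loader is a (data, target) pair.
for batch_idx, (data, target) in enumerate(loader):
    seen.append((batch_idx, tuple(data.shape), tuple(target.shape)))
    print(batch_idx, data.shape, target.shape)
```

Writing `for (images, labels) in enumerate(loader)` instead binds `images` to the integer index, which is where the "must be Tensor, not int" error comes from.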

Thank you everyone (@eqy, @m3tobom_M ). At the moment, my i5 hangs (it is only just a laptop) with this code. I have ironed out the errors but it just hangs and my computer stops. I ran the FashionMNIST in a reasonable time but this is super slow. Can anyone help me out?

import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader

import numpy as np
from PIL import Image
import time

start = time.time()
transform = transforms.Compose([transforms.Resize(200*200), transforms.CenterCrop(224), transforms.ToTensor(),]) 
# This gives us some transforms to apply later on

training = torchvision.datasets.ImageFolder(root = "training_set", transform = transform)
print(training)

train_dataloader = DataLoader(dataset = training, shuffle = True, batch_size = 32)
print(train_dataloader)




print("Before NN creation",start-time.time())
# Creating the model
class NeuralNetwork (nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.Linear_Stack = nn.Sequential(
            nn.Linear(200*200, 40000), 
            nn.ReLU(),
            nn.Linear(40000, 10000),
            nn.ReLU(),
            nn.Linear(10000, 75),
            nn.ReLU(),
            nn.Linear(75, 5),
            nn.ReLU(),
            nn.Linear(5, 1),
            nn.LogSoftmax(dim=1)
        )
    def forward (self, x):
        logits = self.Linear_Stack(x)
        return logits

model = NeuralNetwork()
print(model)

epochs = 1
check = 0
lr = 0.001
loss = nn.CrossEntropyLoss()
optim = torch.optim.SGD(model.parameters(), lr = lr)
print(loss, optim)  

#creating the main loop
def main(loader, model, loss, optim):
    size = len(loader.dataset)
    print(size)
    for batch_idx, (data, target) in enumerate(loader):
        # Prediction
        check = check + 1
        print(check)
        pred = model.forward(data)
        loss = loss(pred, target)
        #backprop
        optim.zero_grad()
        loss.backward()
        optim.step()
        print(loss)

for x in range(epochs):
    main(train_dataloader, model, loss, optim)
    print(start-time.time())

Upon adding some print commands, I get stuck after print(check) and before creating the model. The main one is after the print(check) though. That’s where everything just hangs.

I think it’s because your model has wayyyy too many trainable parameters.
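As a rough back-of-the-envelope check, the parameter count can be worked out from the layer sizes alone, without building the model. The helper name `linear_params` is made up for this sketch; the layer sizes are the ones from the code above.

```python
def linear_params(in_features, out_features):
    # A fully connected layer has one weight per (input, output) pair plus one bias per output.
    return in_features * out_features + out_features

# Layer widths taken from the posted model: 200*200 -> 40000 -> 10000 -> 75 -> 5 -> 1
layer_sizes = [200 * 200, 40000, 10000, 75, 5, 1]
total = sum(linear_params(i, o) for i, o in zip(layer_sizes, layer_sizes[1:]))

# float32 parameters take 4 bytes each.
gib = total * 4 / 2**30
print(f"{total:,} parameters, about {gib:.2f} GiB as float32")
```

That figure covers the weights only. Training also stores a gradient for every parameter, so the memory needed is roughly double, which is a lot for a laptop.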

Like Layers? How many would be the best to start off?

The number of layers is reasonable, maybe try 2-3. The number of units in the first 2 layers is way too high. It should be in the tens to hundreds at most. You have to try out different values. This is called hyperparameter tuning, which, in essence, is tuning the parameters that the model does not learn from the data. Other examples include the learning rate, the number of layers, etc.
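For comparison, here is a sketch of a much smaller network with hidden sizes in the suggested range. The hidden widths (128, 64), the 3x224x224 input and `num_classes = 2` are assumptions picked for illustration, so adjust them to your data.

```python
import torch.nn as nn

num_classes = 2  # assumption: set this to the number of class folders in your dataset

small_model = nn.Sequential(
    nn.Flatten(),                  # [batch, 3, 224, 224] -> [batch, 150528]
    nn.Linear(3 * 224 * 224, 128),
    nn.ReLU(),
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, num_classes),
)

# Count trainable parameters directly from the model.
n_params = sum(p.numel() for p in small_model.parameters() if p.requires_grad)
print(f"{n_params:,} trainable parameters")
```

The hidden widths are the knobs to experiment with; the first layer still dominates the count because the flattened image is large.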

I also suggest maybe going for a smaller dataset so you can train faster :slight_smile:

After that, I suggest looking at an architecture called convolutional neural networks. These have been the go-to for computer vision for quite a while, although recent research suggests that may change soon…

Lastly, I suggest using a platform called Google Colab. This is a resource that gives you a free GPU to play around with, which will substantially speed up your experiments. It's in a notebook format, where you can type and run your code piece-by-piece.
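If you do try a GPU, the model and each batch have to be moved to it explicitly. A minimal sketch, assuming a tiny stand-in model and a random batch (both made up for illustration); it falls back to the CPU when no GPU is available.

```python
import torch
import torch.nn as nn

# Use the GPU if one is visible to PyTorch, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in model and batch (assumptions), just to show the .to(device) calls.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 80 * 80, 16)).to(device)
batch = torch.rand(4, 3, 80, 80).to(device)

out = model(batch)
print(device, out.shape)
```

In a training loop the same `.to(device)` call would go on `data` and `target` inside the loop.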

Okie. I have tried Colab, it's a pain to upload all my data though, but I will try it again. Secondly, with this code:


start = time.time()
transform = transforms.Compose([transforms.Resize(200*200), transforms.CenterCrop(224), transforms.ToTensor(),]) 
# This gives us some transforms to apply later on

training = torchvision.datasets.ImageFolder(root = "training_set", transform = transform)
print(training)

train_dataloader = DataLoader(dataset = training, shuffle = True, batch_size = 32)
print(train_dataloader)




print("Before NN creation",start-time.time())
# Creating the model
class NeuralNetwork (nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.Linear_Stack = nn.Sequential(
            nn.Linear(200*200, 40000), 
            nn.ReLU(),
            nn.Linear(40000, 5),
            nn.ReLU(),
            nn.Linear(5, 1),
            nn.LogSoftmax(dim=1)
        )
    def forward (self, x):
        logits = self.Linear_Stack(x)
        return logits

model = NeuralNetwork()
print(model)

epochs = 1
check = 0
lr = 0.001
loss = nn.CrossEntropyLoss()
optim = torch.optim.SGD(model.parameters(), lr = lr)
print(loss, optim)  

#creating the main loop
def main(loader, model, loss, optim):
    size = len(loader.dataset)
    print(size)
    for batch_idx, (data, target) in enumerate(loader):
        # Prediction
        check = check + 1
        print(check)
        pred = model.forward(data)
        loss = loss(pred, target)
        #backprop
        optim.zero_grad()
        loss.backward()
        optim.step()
        print(loss)

How would I go about changing the values? Am I right in changing the transforms.Resize() value to exactly what I put into the model, in this case 200*200? Admittedly, I blindly put that transform in, assuming it did what I thought it would do.

General advice: look at the docs! This will answer most of your questions. For example, the docs for transforms.Resize say that if you pass in a single number, it will resize the shortest side of the image to that (and maintain the aspect ratio).

First thing I would do though is to make sure you understand the basics of what you’re trying to do. Do you know what a fully connected neural net is? I see that your bio says you’re in high school and you’re just starting. That’s when I started! A common starting point for many (including myself) is Andrew Ng’s free machine learning course.

Yes I am. Yeet. Haha, 69. I’m in high school. Really childish and foolish.
Thanks @ptrmcl. How do I change the parameters so the full image is 224 by 224?
BTW, does Andrew Ng's course require calculus?

Also, after doing some work after school,
this is my new code:

import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader

import numpy as np
from PIL import Image
import time

start = time.time()
transform = transforms.Compose([transforms.Resize((225,225)),transforms.CenterCrop(224),transforms.ToTensor()])
# This gives us some transforms to apply later on
# transforms.Resize(225), 

training = torchvision.datasets.ImageFolder(root = "training_set", transform = transform)
print(training)

train_dataloader = DataLoader(dataset = training, shuffle = True, batch_size = 32)
print(train_dataloader)




print("Before NN creation",start-time.time())
# Creating the model
class NeuralNetwork (nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.Linear_Stack = nn.Sequential(
            nn.Linear(50176, 1000), 
            nn.ReLU(),
            nn.Linear(1000, 5),
            nn.ReLU(),
            nn.Linear(5, 1),
            nn.LogSoftmax(dim=1)
        )
    def forward (self, x):
        logits = self.Linear_Stack(x)
        return logits

model = NeuralNetwork()
print(model)

epochs = 1

lr = 0.001
loss = nn.CrossEntropyLoss()
optim = torch.optim.SGD(model.parameters(), lr = lr)
print(loss, optim)  

#creating the main loop
def main(loader, model, loss, optim):
    size = len(loader.dataset)
    print(size)
    for batch_idx, (data, target) in enumerate(loader):
        # Prediction
        
        pred = model.forward(data)
        loss = loss(pred, target)
        #backprop
        optim.zero_grad()
        loss.backward()
        optim.step()
        print(loss)

for x in range(epochs):
    main(train_dataloader, model, loss, optim)
    print(start-time.time())

I get this error

RuntimeError: mat1 and mat2 shapes cannot be multiplied (21504x224 and 50176x1000)

Can anyone point me out on what I’m doing wrong?

Resize((224,224))

Yeah you need some basic calculus + linear algebra. I think he gives a refresher?? If not there are lots of resources to learn that.

I just remembered this series by a well-known YouTuber, 3Blue1Brown, which introduces the kind of neural nets you're using on images. He has very nice visualizations.

You wanted to resize to 224x224, right? Look at what you're passing in to Resize.

I get this

RuntimeError: mat1 and mat2 shapes cannot be multiplied (21504x224 and 50176x1000)

What does this message mean and can you help me fix it?

First, you need to resize all your inputs to a fixed size, then flatten your inputs. The flattened input size and the first linear layer's input size must match. Here is some sample code. Also, I recommend you read the docs, as @ptrmcl suggested.

Also, here is a sample for you. You can read what view does in the docs. It is a simple function.

>>> import torch
>>> input_tensor = torch.rand((1, 3, 224, 224)) # Input images resized to 224x224 in transform
>>> _, c, h, w = input_tensor.shape
>>> input_size_of_first_linear_layer = c * h * w
>>> flat_input_tensor = input_tensor.view(-1, input_size_of_first_linear_layer)
>>> input_tensor.shape
torch.Size([1, 3, 224, 224])
>>> flat_input_tensor.shape
torch.Size([1, 150528])
>>> layers = torch.nn.Sequential(
...     torch.nn.Linear(input_size_of_first_linear_layer, 1000),
...     torch.nn.Linear(1000, 2))
>>>
>>> layers
Sequential(
  (0): Linear(in_features=150528, out_features=1000, bias=True)
  (1): Linear(in_features=1000, out_features=2, bias=True)
)
>>> layers(flat_input_tensor)
tensor([[-0.0779,  0.3057]], grad_fn=<AddmmBac
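To connect this back to the error message: the 21504 in "21504x224" is most likely 32 * 3 * 224, because nn.Linear only treats the last dimension as features and folds everything else into rows. The sketch below reproduces the mismatch on a random batch and then avoids it with nn.Flatten. The tiny layer widths (8, 2) are assumptions chosen to keep it light, not a recommendation.

```python
import torch
import torch.nn as nn

# What the DataLoader yields for a batch of 32 RGB images: [batch, channels, height, width].
batch = torch.rand(32, 3, 224, 224)

# Feeding the 4D batch straight into a Linear layer sized for 224*224 inputs fails.
unflattened = nn.Linear(224 * 224, 8)
try:
    unflattened(batch)
    mismatch = False
except RuntimeError as err:
    mismatch = True
    print(err)

# The leading dimensions are folded into rows, leaving 224 features per row.
rows = batch.shape[0] * batch.shape[1] * batch.shape[2]
print(rows)

# Flattening each image first gives [32, 3*224*224], which the first layer must match.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 224 * 224, 8),
    nn.ReLU(),
    nn.Linear(8, 2),
)
out = model(batch)
print(out.shape)
```

Note that the first layer needs 3 * 224 * 224 inputs, not 224 * 224, because ToTensor keeps the three color channels.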

Thank you @m3tobom_M. After flattening, I get this tensor size: torch.Size([64, 19200])
Is that OK? Can I change it to be 1, 19200?
Secondly, I also get this weird error:

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

For reference, this is my current code:

import torch
from torch import tensor
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader

import numpy as np
from PIL import Image
import time

start = time.time()
transform = transforms.Compose([transforms.Resize((200,200)),transforms.CenterCrop(80),transforms.ToTensor()]) 
training = torchvision.datasets.ImageFolder(root = "training_set", transform = transform)
print(training)

train_dataloader = DataLoader(dataset = training, shuffle = True, batch_size = 64)
print(train_dataloader)




print("Before NN creation",start-time.time())
# Creating the model
class NeuralNetwork (nn.Module):
    def __init__(self,input_dims):
        super(NeuralNetwork, self).__init__()
        self.Linear_Stack = nn.Sequential(
            nn.Linear(input_dims, 1000), 
            nn.ReLU(),
            nn.Linear(1000, 5),
            nn.ReLU(),
            nn.Linear(5, 1),
            nn.LogSoftmax(dim=1)
        )
    def forward (self, x):
        logits = self.Linear_Stack(x)
        return logits

model = NeuralNetwork(19200)
print(model)


epochs = 1

lr = 0.001
loss = nn.CrossEntropyLoss()
optim = torch.optim.SGD(model.parameters(), lr = lr)
print(loss, optim)  

out = 0

def process(tensorin):
    global out
    _, c,h,w = tensorin.shape
    first_layer = c * h * w
    out = tensorin.view(-1, first_layer)
    return out
#creating the main loop
def main(loader, model, loss, optim):
    size = len(loader.dataset)
    print(size)
    for batch_idx, (data, target) in enumerate(loader):
        print(data.shape)
        process(data)
        print(target)
        print(out.shape)
        pred = model(out)
        loss = loss(out, target)
        #backprop
        optim.zero_grad()
        loss.backward()
        optim.step()
        print(loss)

for x in range(epochs):
    main(train_dataloader, model, loss, optim)
    print(start-time.time())

Thank you for any help (and help given - @m3tobom_M , @eqy, @ptrblck and @ptrmcl)

It seems you are defining out = 0 in the global scope and then passing this integer to the model in:

print(out.shape)
pred = model(out)

which I would assume should yield errors, since out.shape is not defined for an int and the model would not accept an int as its input. Later you also calculate the loss using out instead of pred.
Besides that, you are also overriding loss (the criterion) with the loss value, so you might also want to fix this.
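Putting those points together, a training loop could look roughly like the sketch below. It is one possible arrangement, not the only one: the random TensorDataset, the hidden width of 16 and `num_classes = 2` are stand-in assumptions, and the names `criterion` and `train_one_epoch` are made up here to keep the loss function separate from the loss value.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data (an assumption): 8 fake 3x80x80 images with labels from 2 classes.
data = torch.rand(8, 3, 80, 80)
targets = torch.randint(0, 2, (8,))
loader = DataLoader(TensorDataset(data, targets), batch_size=4)

num_classes = 2  # assumption: one output per class folder
model = nn.Sequential(
    nn.Flatten(),                   # flattening lives in the model, so no global `out` is needed
    nn.Linear(3 * 80 * 80, 16),
    nn.ReLU(),
    nn.Linear(16, num_classes),     # raw scores; CrossEntropyLoss applies log-softmax itself
)
criterion = nn.CrossEntropyLoss()   # the loss function keeps its own name
optimizer = torch.optim.SGD(model.parameters(), lr=0.001)

def train_one_epoch(loader, model, criterion, optimizer):
    losses = []
    for batch_idx, (data, target) in enumerate(loader):
        pred = model(data)              # the prediction comes from the model
        loss = criterion(pred, target)  # the loss is computed on pred, not on the input
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
    return losses

losses = train_one_epoch(loader, model, criterion, optimizer)
print(losses)
```

Computing the loss on the flattened input rather than on `pred` is what triggers "does not require grad", since the input batch is not connected to any model parameters.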