img, labels = data
optimizer.zero_grad()
output = model(img)
By target, do you mean what I have passed to the model here, or something in train_loss = criterion(output)?
Also, what is the significance of nb_classes? I mean, how do I determine its value?
nb_classes is the number of classes in your use case.
You would usually use an output and target to calculate the loss in a criterion (at least for a supervised classification use case).
We would need to get the shapes and some more information to be able to help further.
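To illustrate the "output and target" point above, here is a minimal sketch of a supervised classification step. The shapes, the placeholder linear model, and the feature size of 20 are assumptions for illustration; the key point is that the criterion takes both the model output and the target, and that nb_classes determines the size of the output layer.

```python
# Minimal sketch (assumed shapes/model) of output + target -> criterion
# for a supervised classification use case.
import torch
import torch.nn as nn

nb_classes = 10                       # number of classes in your dataset
batch_size = 4

model = nn.Linear(20, nb_classes)     # placeholder model: 20 input features
criterion = nn.CrossEntropyLoss()

img = torch.randn(batch_size, 20)                     # dummy input batch
labels = torch.randint(0, nb_classes, (batch_size,))  # target class indices

output = model(img)                       # shape: [batch_size, nb_classes]
train_loss = criterion(output, labels)    # criterion needs output AND target
train_loss.backward()
```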
for epoch in range(num_epochs):
    for data in traindataloader:
        img, _ = data
        recon = model(img)
        loss = criterion(recon, img)
So, in this case, recon is the output and img is the original image from the trainloader. If we consider the classes being used, is the img variable correct here? I have a dataset of random images, so how do I determine the classes for it, or should I just choose arbitrarily?
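For an autoencoder loop like the one above, the input image itself is the reconstruction target, so no class labels (and thus no nb_classes) are needed. A self-contained sketch with dummy data and an assumed toy architecture:

```python
# Minimal autoencoder training sketch (dummy data, assumed toy architecture):
# the target is the input itself, so labels are ignored entirely.
import torch
import torch.nn as nn

model = nn.Sequential(                # toy autoencoder on flattened 8x8 images
    nn.Linear(64, 16), nn.ReLU(),     # encoder
    nn.Linear(16, 64),                # decoder
)
criterion = nn.MSELoss()              # reconstruction loss, not classification
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# dummy loader: (images, unused labels) pairs
traindataloader = [(torch.randn(10, 64), torch.zeros(10)) for _ in range(3)]

num_epochs = 2
for epoch in range(num_epochs):
    for data in traindataloader:
        img, _ = data                 # labels are ignored
        optimizer.zero_grad()
        recon = model(img)
        loss = criterion(recon, img)  # compare reconstruction to the input
        loss.backward()
        optimizer.step()
```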
outputs = []
for epoch in range(num_epochs):
    for data in traindataloader:
        img, labels = data
        optimizer.zero_grad()
        print(img.size())
        img = Variable(img)
        print(img.size())
        labels = Variable(labels)
        print(labels.size())
Size of data:[10,1,240,340]
Size of img :[ 10,1,240,340]
Does your target tensor contain class indices, or does it contain floating point values in a specific range, e.g. [0, 1]? In the latter case you won't be able to use nn.CrossEntropyLoss without converting the target to class indices.
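Converting such floating point targets to class indices could look like this sketch. The shapes and the one-hot encoding are assumptions; if each row encodes exactly one class, argmax along the class dimension recovers the indices nn.CrossEntropyLoss expects.

```python
# Sketch: nn.CrossEntropyLoss expects class indices of shape [batch_size].
# If the target holds one-hot floats (assumption), argmax recovers indices.
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
output = torch.randn(4, 3)                  # logits: [batch_size, nb_classes]

float_target = torch.tensor([[0., 0., 1.],
                             [1., 0., 0.],
                             [0., 1., 0.],
                             [0., 0., 1.]]) # one-hot float targets
target = float_target.argmax(dim=1)         # -> tensor([2, 0, 1, 2])
loss = criterion(output, target)            # now works with class indices
```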
There are two problems that have bothered me for a long time:
1. Whenever I use a linear layer in a network model, do I need to use a flatten layer first?
2. If 1 is true, can I use view() or reshape() instead of a flatten layer?
1. If you were using e.g. nn.Conv2d layers before the nn.Linear layer, then you would usually flatten the activation from the shape [batch_size, channels, height, width] to [batch_size, in_features]. While this is the "standard" use case, note that a linear layer can accept inputs with multiple additional dimensions as [batch_size, *, in_features]. In this use case the linear layer would be applied to all additional dimensions (*) as if you would pass them in a loop. That being said, it's a special use case and not the common one.
2. I would recommend using view, as it would not trigger a copy of the data, but instead change the metadata of the tensor. If you get an error claiming your tensor is not contiguous, you would have to trigger the copy via tensor.contiguous().view(...) or via tensor.reshape(...). Personally, I'm used to using view instead of flatten, but that doesn't mean it's the recommended way, so you can pick whatever works for you.
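The "standard" flatten between nn.Conv2d and nn.Linear described above could be sketched like this; the layer sizes and input shape are assumptions for illustration.

```python
# Sketch (assumed shapes): flattening a conv activation with view before
# a linear layer, as described above.
import torch
import torch.nn as nn

conv = nn.Conv2d(1, 8, kernel_size=3, padding=1)
fc = nn.Linear(8 * 24 * 24, 10)     # in_features = channels * height * width

x = torch.randn(2, 1, 24, 24)       # [batch_size, channels, height, width]
act = conv(x)                       # padding=1 keeps the 24x24 spatial size
flat = act.view(act.size(0), -1)    # [2, 8*24*24]; no copy if contiguous
out = fc(flat)                      # [2, 10]
```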
Thank you very much for your answer. I learned a lot. I may also need some practice to deepen my understanding. Thanks again.