Real Numbered Analog Classification for Neural Networks

Hi everyone,

I am fairly new to Pytorch and I’m currently working on a project that needs to perform classification on images. However, it’s not a binary classification.

The outputs of the neural network are real numbers. For instance, the behavior I’m looking for from the neural network is as follows:

  1. Reads an image.
  2. Says that the image has attribute A at a value of 1200 and another attribute B at a value of 8.

The images fed into this neural net usually have attribute A in the range 1200 - 1800 and attribute B in the range 4 - 20. The main goal is to train the network to give an estimated analog value based on the image fed into it.

The classes provided by Pytorch seem to favor binary classification, such as the DataLoader and ImageFolder classes. However, I cannot provide the class folder structure that is often used since I do not have set binary classes.


I want to discuss what possible neural network structures can support a problem like this, including activation functions and what Pytorch might have to offer.

Are there implementations out there that I haven’t found on the web yet? What type of loss functions and optimizers in Pytorch may best serve this type of problem? Should I even be considering neural networks to perform this task in the first place?

To be clear I’m not asking for code or a complete solution. Just a discussion on developing a neural net that can work with this analog system.

Also if I need to clarify anything I will be happy to.

Hi Andrew!

A quick note on terminology: To me, at least, the term
“classification” has the connotation of assigning input
samples to discrete classes, that is, labelling them with
integer class labels. So I might call what you’re talking
about something like continuous value estimation, rather
than “classification.”

This seems perfectly reasonable to attempt with a pytorch
neural network (depending on the details of your problem).

Saying that Pytorch's classes favor binary classification
is an overstatement. Pytorch certainly supports binary
classification, but it supports lots of other things, too.

Yes, while you can certainly encode binary class labels (and
multi-class labels) in a directory structure, you can’t very
well encode real-number attributes that way.

So the straightforward approach would be to have a data file
that has the real-number attributes in it. It could be as
simple as containing a sequence of (pairs of) real numbers,
with the understanding that the order in which you read in
your images will (somehow) match the order of the attribute
values in your file. Or your data file could have a character
string for the filename of each image file, followed by two
numbers for the values of the two attributes.
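
As a rough sketch of the second layout (filename followed by the
two attribute values), something like the following could work.
The class name AttributeImageDataset, the load_image callable, and
the sample rows are all made up for illustration; a real loader
would read and decode the image files.

```python
import csv
import io

import torch
from torch.utils.data import Dataset


class AttributeImageDataset(Dataset):
    """Pairs each image with its two real-valued attributes (A, B).

    Assumes one row per image: filename, attribute A, attribute B.
    """

    def __init__(self, label_file, load_image):
        # load_image is a hypothetical callable: filename -> image tensor
        self.load_image = load_image
        self.records = [
            (name, float(a), float(b))
            for name, a, b in csv.reader(label_file)
        ]

    def __len__(self):
        return len(self.records)

    def __getitem__(self, index):
        name, a, b = self.records[index]
        image = self.load_image(name)
        target = torch.tensor([a, b], dtype=torch.float32)
        return image, target


# Stand-in label file and loader so the sketch runs without real images.
labels = io.StringIO("img_0001.png,1200,8\nimg_0002.png,1750,16.5\n")


def fake_loader(filename):
    # Placeholder only: returns a blank single-channel 64x64 "image".
    return torch.zeros(1, 64, 64)


dataset = AttributeImageDataset(labels, fake_loader)
image, target = dataset[1]
print(len(dataset), image.shape, target)
```

Keeping the filename in each row avoids relying on the read
order of the images matching the order of the attribute values.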

Since you’re analyzing images your network would probably
start with a couple of convolutional layers, and then switch
over to a couple of fully-connected layers, with the last
layer having two (floating-point number) outputs as the
predictions for your two attributes. (There are many
additional techniques that could improve the performance
of your network – things like nn.MaxPool2d or
nn.Dropout2d – but I would recommend starting simple
and adding refinements as the need arises. Or start with
a prepackaged network.)

Rectified linear units (nn.ReLU) are commonly used as
(non-linear) activation functions.
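
A minimal sketch of such a network might look like the following.
The layer sizes, the single input channel, the 64x64 image size,
and the name TwoAttributeNet are arbitrary assumptions, and the
nn.AdaptiveAvgPool2d layer is only there so that the sketch does
not depend on the image size.

```python
import torch
from torch import nn


class TwoAttributeNet(nn.Module):
    def __init__(self, in_channels=1):
        super().__init__()
        # A couple of convolutional layers with ReLU activations
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(8, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),  # fixed 4x4 map for any input size
        )
        # A couple of fully-connected layers ending in two outputs
        self.regressor = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 4 * 4, 32),
            nn.ReLU(),
            nn.Linear(32, 2),  # no final activation: unbounded real outputs
        )

    def forward(self, x):
        return self.regressor(self.features(x))


model = TwoAttributeNet(in_channels=1)
batch = torch.randn(10, 1, 64, 64)  # 10 made-up grayscale images
predictions = model(batch)
print(predictions.shape)
```

The last layer has no activation function, so each of the two
outputs can take any real value.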

For your loss function, you would probably simply use the squared
error between your predictions and actuals (nn.MSELoss).

That is:

loss = (predicted_A - actual_A)^2 + (predicted_B - actual_B)^2.

You might want to weight the two terms so they have comparable
size lest you preferentially optimize for accuracy in A at the
cost of B.
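
To see how lopsided the unweighted loss can be, here is a small
sketch with made-up numbers in roughly your ranges, where the
predictions are off by 5% in A and by 50% in B. Note that
nn.MSELoss averages the squared errors by default rather than
summing them.

```python
import torch
from torch import nn

# Hypothetical batch of 3 samples; columns are attributes A and B.
actual = torch.tensor([[1200.0, 8.0], [1500.0, 12.0], [1800.0, 20.0]])
# Predictions that are 5% too high in A and 50% too high in B.
predicted = actual * torch.tensor([1.05, 1.5])

mse = nn.MSELoss()  # default reduction="mean" averages over all elements
loss = mse(predicted, actual)

# Mean squared error of each attribute separately (shape [2]).
per_attribute = ((predicted - actual) ** 2).mean(dim=0)
print(loss, per_attribute)
```

Even though B is relatively much further off, the A column
contributes by far the larger share of the loss.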

I always like to start simple, so I would recommend starting
with plain-vanilla (stochastic) gradient descent (optim.SGD).
You can also add momentum to gradient descent (supported by
optim.SGD), and pytorch offers more sophisticated optimizers,
such as the commonly-used optim.Adam (“adaptive moments”).
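
For reference, a bare-bones training loop with optim.SGD could
be sketched as follows. The linear model, random data, learning
rate, and step count are toy stand-ins chosen only so that the
example runs on its own.

```python
import torch
from torch import nn, optim

torch.manual_seed(0)

# Toy stand-ins: a linear model and random data with two target columns.
model = nn.Linear(4, 2)
inputs = torch.randn(32, 4)
targets = torch.randn(32, 2)
loss_fn = nn.MSELoss()

# Plain gradient descent. Pass momentum=0.9 to add momentum, or swap in
# optim.Adam(model.parameters(), lr=1e-3) to try Adam instead.
optimizer = optim.SGD(model.parameters(), lr=0.05)

losses = []
for step in range(50):
    optimizer.zero_grad()  # clear gradients left over from the last step
    loss = loss_fn(model(inputs), targets)
    loss.backward()  # autograd computes the gradients
    optimizer.step()  # update the parameters
    losses.append(loss.item())

print(losses[0], losses[-1])
```

Switching optimizers only changes the line that constructs
optimizer; the rest of the loop stays the same.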

Existing implementations? Undoubtedly! There are lots of
(semi-) prepackaged architectures
out there for image processing, but I don’t know much about them
and don’t have any concrete suggestions. But if you give us more
detail about your specific problem (and post some sample images!),
some of the experts here will likely have good advice.

Should you be using a neural network for this?
Likely yes, but it really depends on your problem. (So post
details and sample images!) For example, if the two attributes
you are trying to “predict” are the intensity and saturation
averaged over the pixels in a color image, you’d be much better
off just calculating them directly.


K. Frank

Hi K. Frank,

Thank you for taking the time to answer my questions! I appreciate it!

I think that’s a better name for this project. I’ll use that from now on.

Yes, I have this set up nicely. I can parse a png file name for the attributes and make them into labels.

I agree, after looking through the loss functions this seems like a good choice for real value estimation.

This is very true. Since the value ranges for A and B are very different I would need to go through this. I haven’t looked into how to weight the MSE loss function yet, but is it possible that you could give some guidance on this?

Lastly, I wanted to make batched sets of data. I found that Pytorch does this automatically through the DataLoader class via a parameter. I am confused about whether the DataLoader still needs its dataset parameter structured in the hierarchical fashion previously mentioned. If that’s the case, is there anything else that can quickly batch data, or should I code my own function?

Thanks again,

Hello Andrew!

First, a general comment:

We will be able to give you advice that is more
likely to be useful to you if you give us some
concrete detail about the problem you are working
on.

How big are your images? How many will you be
training on? What do they look like? What is
the typical distribution of your two attributes?
What is the conceptual meaning of your attributes?

I don’t believe that nn.MSELoss has a built-in
way to include these relative weights. There are
a number of straightforward approaches to including
such weights.

Myself, I would just write my own loss function,
something like this:

import torch

# define weighted loss function

def wtSqErr (pred, targ, wts):
    return (wts * (pred - targ)**2).mean()

# construct some sample data
# use a batch size of 10
# y_targ are the actuals, y_pred are the predictions
# which, for this example, are the actuals plus noise

y_targ = torch.tensor ([1000.0, 2.0]) * torch.randn (10, 2) + torch.tensor ([2000.0, 3.0])
y_pred = y_targ + torch.tensor ([100.0, 0.15]) * torch.randn (10, 2)
y_pred.requires_grad = True

# set up the weights for the loss

wtA = 1.0 / 1000.0**2
wtB = 1.0 / 2.0**2

wtAB = torch.tensor ([wtA, wtB])

# calculate loss

loss = wtSqErr (y_pred, y_targ, wtAB)

# show that autograd works: the gradient is None before
# backward() is called and populated afterwards

print (y_pred.grad)
loss.backward()
print (y_pred.grad)

Here is the output of the above script:

>>> import torch
>>> # define weighted loss function
>>> def wtSqErr (pred, targ, wts):
...     return (wts * (pred - targ)**2).mean()
>>> # construct some sample data
... # use a batch size of 10
... # y_targ are the actuals, y_pred are the predictions
... # which, for this example, are the actuals plus noise
>>> y_targ = torch.tensor ([1000.0, 2.0]) * torch.randn (10, 2) + torch.tensor ([2000.0, 3.0])
>>> y_targ
tensor([[2.3612e+03, 2.4401e+00],
        [2.2880e+03, 7.0144e+00],
        [1.2435e+02, 4.6300e+00],
        [3.7007e+03, 1.4845e+00],
        [1.7911e+03, 2.0490e+00],
        [2.6058e+03, 2.2381e+00],
        [6.1270e+02, 2.1648e+00],
        [6.9680e+02, 1.4656e+00],
        [1.2903e+03, 2.8559e+00],
        [1.6696e+03, 5.5197e+00]])
>>> y_pred = y_targ + torch.tensor ([100.0, 0.15]) * torch.randn (10, 2)
>>> y_pred.requires_grad = True
>>> y_pred
tensor([[2.6065e+03, 2.5329e+00],
        [2.3034e+03, 7.2111e+00],
        [2.9170e+02, 4.4378e+00],
        [3.7426e+03, 1.4848e+00],
        [1.8188e+03, 2.2676e+00],
        [2.8676e+03, 2.3148e+00],
        [5.6415e+02, 2.1441e+00],
        [7.7348e+02, 1.4650e+00],
        [1.2437e+03, 2.9639e+00],
        [1.5545e+03, 5.5731e+00]], requires_grad=True)
>>> # set up the weights for the loss
>>> wtA = 1.0 / 1000.0**2
>>> wtB = 1.0 / 2.0**2
>>> wtAB = torch.tensor ([wtA, wtB])
>>> wtAB
tensor([1.0000e-06, 2.5000e-01])
>>> # calculate loss
>>> loss = wtSqErr (y_pred, y_targ, wtAB)
>>> loss
tensor(0.0111, grad_fn=<MeanBackward1>)
>>> # show that autograd works
>>> print (y_pred.grad)
>>> loss.backward()
>>> print (y_pred.grad)
tensor([[ 2.4533e-05,  2.3200e-03],
        [ 1.5451e-06,  4.9169e-03],
        [ 1.6735e-05, -4.8059e-03],
        [ 4.1961e-06,  8.9884e-06],
        [ 2.7656e-06,  5.4653e-03],
        [ 2.6174e-05,  1.9158e-03],
        [-4.8546e-06, -5.1618e-04],
        [ 7.6680e-06, -1.5900e-05],
        [-4.6640e-06,  2.7002e-03],
        [-1.1507e-05,  1.3345e-03]])

Note that if you use pytorch tensors to do your
calculations, autograd will work for you without
your having to do anything special.
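
For comparison, here is a sketch of two other ways to include
the weights while still using nn.MSELoss: ask it for the
per-element errors with reduction="none" and weight those, or
rescale the predictions and targets by the square root of the
weights before a plain nn.MSELoss. The seed and variable names
are only for illustration; the data mimics the example above.

```python
import torch
from torch import nn


def wtSqErr(pred, targ, wts):
    return (wts * (pred - targ) ** 2).mean()


torch.manual_seed(0)
y_targ = torch.tensor([1000.0, 2.0]) * torch.randn(10, 2) + torch.tensor([2000.0, 3.0])
y_pred = y_targ + torch.tensor([100.0, 0.15]) * torch.randn(10, 2)
wtAB = torch.tensor([1.0 / 1000.0**2, 1.0 / 2.0**2])

# reduction="none" keeps the per-element squared errors, shape [10, 2]
elementwise = nn.MSELoss(reduction="none")(y_pred, y_targ)
loss_builtin = (wtAB * elementwise).mean()

# Rescale predictions and targets by sqrt(weight), then use plain MSE
scale = wtAB.sqrt()
loss_scaled = nn.MSELoss()(y_pred * scale, y_targ * scale)

print(loss_builtin, loss_scaled, wtSqErr(y_pred, y_targ, wtAB))
```

All three values should agree up to floating-point round-off.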

Pytorch naturally works with batches. The first
index of your input data, predictions, and target
data tensors is the index that indexes over samples
in the batch.

In the above example, you can understand the
generated data to be a batch of 10 samples.

(In fact, pytorch loss functions require batches,
even if the batch size is only 1. Following the
above example, for batch-size = 1, a “batch” of,
say, predictions would then have a shape of
y_pred.shape = torch.Size ([1, 2]).)

If your training data fits in memory (We don’t
know – you’ve told us nothing concrete about
your problem.), you can read it all into one
tensor, and then use “indexing” or “slicing”
to get your batches.

import torch

all_data = torch.ones (10, 3)
first_batch_of_two = all_data[0:2]
second_batch_of_two = all_data[2:4]

Doing this does not create new tensors with their
own storage – it just sets up a view into the
existing all_data tensor.
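
On the DataLoader question: as far as I know, DataLoader will
batch any dataset object, and the class-folder layout is only
a convention of ImageFolder. A sketch using TensorDataset on
in-memory tensors (the shapes and batch size here are made up):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical in-memory data: 10 single-channel 8x8 images and
# a [10, 2] tensor of (A, B) targets.
all_images = torch.randn(10, 1, 8, 8)
all_targets = torch.randn(10, 2)

# Slicing returns a view that shares storage with the original tensor.
first_batch = all_images[0:2]
print(first_batch.data_ptr() == all_images.data_ptr())

# DataLoader batches the dataset; no class-folder layout is involved.
loader = DataLoader(TensorDataset(all_images, all_targets),
                    batch_size=4, shuffle=True)

batch_shapes = [(x.shape, y.shape) for x, y in loader]
print(batch_shapes)
```

With 10 samples and batch_size=4 the last batch holds the 2
leftover samples; DataLoader's drop_last=True discards it
instead.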

Good luck.

K. Frank

Hi K. Frank,

Thank you so much for the detailed response! I have a prototype somewhat working now thanks to your help!
I am going to work on getting the loss down faster and get better prediction values sometime soon.

Thank you again,
Andrew Smith