1D Sparse Network - Using Conv1d

Hello,

I am trying to implement and train a sparse network that looks like the following:

[diagram "conv1d_flat-3": a Conv1d kernel sliding over the input, whose outputs feed a fully connected layer producing a single output]

My understanding was that it is very similar to a 1D convolutional network with a single channel. So this is how I implemented it:

import torch.nn as nn

# input shape: (N, 1, 8); kernel_size=3 with no padding gives a
# conv output of length 8 - 3 + 1 = 6, hence the Linear(6, 1)
model = nn.Sequential(
    nn.Conv1d(in_channels=1, out_channels=1, kernel_size=3, bias=False),
    nn.Sigmoid(),
    nn.Linear(6, 1, bias=False),
    nn.Sigmoid()
)

The data looks like the following:

X = [
       [1, 0, 1, 1, 0, 1, 0, 0],
       [1, 1, 1, 1, 0, 0, 1, 1],
       ...
    ]

y = [
       [1],
       [0],
       ...
    ]
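
And this is how I feed it in (a minimal sketch, assuming the model above; Conv1d expects a channel dimension, so X becomes shape (N, 1, 8)):

import torch

X = torch.tensor([
    [1, 0, 1, 1, 0, 1, 0, 0],
    [1, 1, 1, 1, 0, 0, 1, 1],
], dtype=torch.float32).unsqueeze(1)   # (N, 8) -> (N, 1, 8)
y = torch.tensor([[1.], [0.]])

out = model(X)                 # conv: (N, 1, 6) -> linear: (N, 1, 1)
print(out.squeeze(1).shape)    # torch.Size([2, 1]) -- matches y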

However, when I train it on simple inputs (like those shown above) and examine the weights, they don’t look the way I expected. So I figured my architecture might be incorrect. Can someone please let me know if this looks right?

PS: What I actually have is a hidden layer with the same number of dimensions as the input layer, so I have set padding=1 in my actual code. The padding=0 version was easier to show in a diagram.
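
Here is a quick sanity check of that padded version (a sketch; with padding=1 the conv output keeps the input length, so the linear layer needs in_features=8):

import torch
import torch.nn as nn

conv = nn.Conv1d(in_channels=1, out_channels=1, kernel_size=3, padding=1, bias=False)
x = torch.randn(1, 1, 8)   # (N, C, L)
print(conv(x).shape)       # torch.Size([1, 1, 8]) -- length preserved
# so the following layer becomes nn.Linear(8, 1, bias=False)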

The code looks alright.
Note that you would have to add nn.Flatten before the linear layer in case your conv layer returns more than a single output channel (see the sketch below).
In the current setup it should be working. :wink:
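
For example, a hypothetical variant with 4 output channels would look like this (the 4 * 6 just follows from 4 channels times the length-6 conv output):

import torch.nn as nn

model_multi = nn.Sequential(
    nn.Conv1d(in_channels=1, out_channels=4, kernel_size=3, bias=False),
    nn.Sigmoid(),
    nn.Flatten(),                     # (N, 4, 6) -> (N, 24)
    nn.Linear(4 * 6, 1, bias=False),
    nn.Sigmoid()
)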

What do you mean by: “examine the weights, they don’t look the way I expected”?


Thanks! I’ll try that.

What I meant is:

Suppose I want my network to learn a pattern such as 1101 (say the kernel size is now 4), and my inputs look like:

x_1 = 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0; y_1 = 1
...
x_k = 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0; y_k = 0

My understanding was that, once training reached an acceptable level of accuracy, the learned weights would look something like [0.987, 0.99, -0.0008, 0.959]. Is that understanding incorrect?
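
(For reference, this is how I examine the learned kernel — a sketch assuming a kernel_size=4 variant of the model above:)

# the Conv1d is the first module in the Sequential
kernel = model[0].weight.detach().squeeze()   # shape: (4,)
print(kernel)   # I expected something close to [1, 1, 0, 1]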

Not necessarily. Since you are using a nonlinear activation function (sigmoid), the learned parameters might end up with lower or higher values.


The reason I use sigmoid is to emulate a step function: I was hoping to learn an OR (hidden -> output) of ANDs (input -> hidden) with the synthetic dataset I made up. I’ll tinker around a bit more and get back, thanks!
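
To illustrate what I mean (a toy sketch, not my actual code): with large weights a sigmoid saturates and behaves like a step function, so sigmoid(w . x + b) can implement AND/OR gates:

import torch

x = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])

# steep sigmoid ~= step: AND fires only when both inputs are 1
and_out = torch.sigmoid(10 * (x[:, 0] + x[:, 1] - 1.5))
# OR fires when at least one input is 1
or_out = torch.sigmoid(10 * (x[:, 0] + x[:, 1] - 0.5))

print(and_out.round())   # tensor([0., 0., 0., 1.])
print(or_out.round())    # tensor([0., 1., 1., 1.])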