Is it possible to easily build a CNN like the following in PyTorch?
X_train[0:99] -> Conv1 -> Conv2 -> MaxPool / + X[99:105] -> Linear1 -> Linear2 -> Output
That is, we feed the fully connected layers more information than just what the conv layers produce. For example, imagine doing NLP on movie reviews: you also know the type of movie, which actors were in it, etc. Could you add that information to the fully connected layers while the conv layers analyze the actual sentences of the review?
Is this possible? Any examples I could look at? Is it worth trying out this technique?
How would you like to split your input data?
Since it seems you would like to use a one-dimensional conv layer, your input should be of shape
[batch_size, channels, length].
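To illustrate that expected shape, here is a quick sketch (the sizes are just illustrative, not taken from your problem):

```python
import torch
import torch.nn as nn

# A batch of 4 samples, 5 input channels, sequence length 105
x = torch.randn(4, 5, 105)

# kernel_size=3 with padding=1 and stride=1 preserves the temporal length
conv = nn.Conv1d(in_channels=5, out_channels=10, kernel_size=3, padding=1)
out = conv(x)
print(out.shape)  # torch.Size([4, 10, 105])
```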
Are you splitting X_train based on the length? Also, do you want to concatenate the split inputs?
If so, this could be a starter code:
import torch
import torch.nn as nn
import torch.nn.functional as F

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv1d(5, 10, 3, 1, 1)
        self.conv2 = nn.Conv1d(10, 10, 3, 1, 1)
        self.pool2 = nn.MaxPool1d(2)
        # 10 channels * 50 pooled steps from the conv path + 5 * 5 extra values
        self.fc1 = nn.Linear(10 * 50 + 5 * 5, 100)
        self.fc2 = nn.Linear(100, 2)

    def forward(self, x):
        x1 = x[:, :, :100]                                    # conv path
        x2 = x[:, :, 100:].contiguous().view(x.size(0), -1)   # extra features
        x1 = F.relu(self.conv1(x1))
        x1 = F.relu(self.conv2(x1))
        x1 = self.pool2(x1)
        x1 = x1.view(x1.size(0), -1)
        x = torch.cat((x1, x2), 1)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x
batch_size = 1
channels = 5
length = 105
x = torch.randn(batch_size, channels, length)
model = MyModel()
output = model(x)
Note that you might want to check the value ranges of both tensors before the concatenation, as they might be on quite different scales, which could lead to training difficulties.
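One common way to handle that is to standardize the extra features before concatenating them with the flattened conv activations. This is only a sketch: `x1_flat` and `x2` here are stand-ins for the corresponding tensors inside `forward`, with made-up sizes.

```python
import torch

x1_flat = torch.randn(8, 500)       # stand-in for the flattened conv output
x2 = torch.rand(8, 25) * 100.0      # extra features on a much larger scale

# Standardize each extra feature to zero mean and unit std across the batch
x2 = (x2 - x2.mean(dim=0)) / (x2.std(dim=0) + 1e-8)

x = torch.cat((x1_flat, x2), dim=1)
print(x.shape)  # torch.Size([8, 525])
```

In practice you would compute the normalization statistics on the training set (or use something like `nn.BatchNorm1d` on the extra features) rather than per batch.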
This is incredible! Thank you so much.
@ptrblck works perfectly! By any chance, do you know anyone who does similar things, so I can look at the hyperparameters and architecture they are using? I read all of CS231n; it was super interesting. Thank you for that advice last week.
We had quite a long discussion about a similar topic in this thread.
Maybe you can get some ideas for your approach.
Sorry to bother you. It looks like I have the opposite problem of that thread: my network won't overfit. Below is the loss curve. The downward jumps are where I decreased the learning rate (learning rate annealing). I'm using Adam as my optimizer. The y axis is loss and the x axis is the number of epochs.
(plot of the last 100 epochs)
Any idea how I can force it to overfit?
I would scale the problem down to a single input and try to overfit your model on it.
If that's not possible, your architecture, hyperparameters, or training routine might have a bug or might not be suitable for the problem.
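The single-input sanity check above can be sketched like this, using a toy model and made-up shapes (substitute your own model and one real batch from your data):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in for your model; replace with your actual architecture
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))

x = torch.randn(4, 10)         # one small, fixed batch
y = torch.randint(0, 2, (4,))  # fixed targets for that batch

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

# Train repeatedly on the same batch; a healthy setup should drive the
# loss close to zero. If it plateaus instead, something is broken.
for epoch in range(500):
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()

print(loss.item())  # should be close to 0.0
```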