Bidirectional NN with two inputs and one output


*Disclaimer: my first post to Stack Overflow comes to mind, where I got flamed for poor formatting, so I apologize ahead of time if I have broken a formatting rule.

I am creating a neural network model to mimic three tasks used for clinical assessments in stroke survivors. The inputs to the model are word embeddings from GloVe's wiki-gigaword vectors, and the outputs are series of numbers reflecting the letters needed to spell the word that corresponds to each vector.

The three tasks I am modeling are a meaning-to-word task, a word repetition task, and a task matching a word to its meaning (the opposite of the first task). Note that I am NOT using audio files or images. I have previously created this model with LENS (the Light, Efficient Network Simulator), but I am transitioning to Python for the ease of data prep, implementation, etc.

I am providing a picture of the network I am trying to create (image: pytorch example). In it, arpabet_in/out represents the letter/number sequences that are used to represent each word, and embeddings are the word embeddings from GloVe.

The code I am providing reflects the part of the model that is colored in red in the image. It works! My question is: can I add the remaining structure of my intended network (colored in blue) when I define the bidirectional network? If so, can anyone provide guidance on how to do this, or point me toward some code resources that would help?

import torch
from torch import nn

class BiDir(torch.nn.Module):
    def __init__(self, weights, emb_dim, hid_dim, rnn_num_layers=2):
        super().__init__()
        #Embedding layer using GloVe as the pretrained weights
        self.embedding = nn.Embedding.from_pretrained(weights)
        #Bidirectional GRU module with 2 stacked layers
        self.rnn = torch.nn.GRU(emb_dim, hid_dim, bidirectional=True, num_layers=rnn_num_layers)
        self.l1 = torch.nn.Linear(hid_dim * 2 * rnn_num_layers, 256)
        self.l2 = torch.nn.Linear(256, 2)

    def forward(self, samples):
        #Forward pass
        embedded = self.embedding(samples)
        _, last_hidden = self.rnn(embedded)
        #One final hidden state per layer and direction
        hidden_list = [last_hidden[i, :, :] for i in range(last_hidden.shape[0])]
        #Concatenating the final hidden states along the feature dimension
        encoded = torch.cat(hidden_list, dim=1)
        #ReLU and sigmoid activation functions
        encoded = torch.nn.functional.relu(self.l1(encoded))
        encoded = torch.sigmoid(self.l2(encoded))

        return encoded
#weights = pretrained embeddings of length 300
#1392 words in current model
model = BiDir(weights, 300, 1392, rnn_num_layers=2)
criterion = torch.nn.MultiLabelSoftMarginLoss()
optimizer = torch.optim.Adam(model.parameters())
#Number of passes over the data is set by range()
for epoch in range(1):
    for batch in dataset_iter:
        optimizer.zero_grad()
        output = model(batch.text)
        loss = criterion(output, batch.label)
        #Backpropagating the loss and updating the weights
        loss.backward()
        optimizer.step()
    with torch.no_grad():
        #Calculating the accuracy on the last batch as 100 * (1 - mean absolute error)
        acc = torch.abs(output - batch.label).view(-1)
        #acc = acc.sum() / acc.size()[0] * 100.
        acc = (1. - acc.sum() / acc.size()[0]) * 100.
        print(f'Epoch({epoch+1}) loss: {loss.item()}, accuracy: {acc:.1f}%')
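
As a sanity check on why `l1` takes `hid_dim * 2 * rnn_num_layers` inputs, here is a minimal sketch of the shapes involved when the final hidden states of a bidirectional GRU are concatenated. The toy sizes and the random input are assumptions chosen only for illustration; they are not the sizes used in the model above.

```python
import torch

# Toy sizes, chosen only for illustration (not the sizes used in the post).
emb_dim, hid_dim, num_layers, seq_len, batch = 8, 5, 2, 3, 4

rnn = torch.nn.GRU(emb_dim, hid_dim, bidirectional=True, num_layers=num_layers)

# With the default batch_first=False the GRU expects (seq_len, batch, emb_dim).
embedded = torch.randn(seq_len, batch, emb_dim)
_, last_hidden = rnn(embedded)

# One final hidden state per layer and direction: (num_layers * 2, batch, hid_dim).
hidden_list = [last_hidden[i, :, :] for i in range(last_hidden.shape[0])]

# Concatenating along the feature dimension gives (batch, hid_dim * 2 * num_layers).
encoded = torch.cat(hidden_list, dim=1)

print(last_hidden.shape)  # torch.Size([4, 4, 5])
print(encoded.shape)      # torch.Size([4, 20])
```

The second dimension of `encoded` is what the first linear layer has to accept.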

Hi everyone, somewhat of a shameless plug here to bump the thread.

I have been looking for assistance with this problem throughout the forum and on Google. I did find this post, which does seem promising; however, I am still unsure how to implement it in my code. I also found this link, which does seem simple and effective.

Given the lack of responses, I think I may not have provided enough code for assistance. So I would like to update that as well.

First, I load the weights from GloVe:

import torch
from torch import nn
weights = torch.FloatTensor(glove_model300.vectors)
# import necessary libraries
import torchtext.vocab as vocab
import torchtext.data as data
from torchtext.data import Field, Dataset, Example
#create dataframe set
class DataFrameDataset(Dataset):
    """Class for using pandas DataFrames as a datasource"""
    def __init__(self, examples, fields, filter_pred=None):
        self.examples = examples.apply(SeriesExample.fromSeries, args=(fields,), axis=1).tolist()
        if filter_pred is not None:
            self.examples = filter(filter_pred, self.examples)
        self.fields = dict(fields)
        # Unpack field tuples
        for n, f in list(self.fields.items()):
            if isinstance(n, tuple):
                self.fields.update(zip(n, f))
                del self.fields[n]
class SeriesExample(Example):
    """Class to convert a pandas Series to an Example"""
    @classmethod
    def fromSeries(cls, data, fields):
        return cls.fromdict(data.to_dict(), fields)

    @classmethod
    def fromdict(cls, data, fields):
        ex = cls()
        for key, field in fields.items():
            if key not in data:
                raise ValueError("Specified key {} was not found in "
                "the input data".format(key))
            if field is not None:
                setattr(ex, key, field.preprocess(data[key]))
            else:
                setattr(ex, key, data[key])
        return ex
#Code found and forked from
class TextMultiLabelDataset(Dataset):
    '''Data Set Class for Multilabel classification'''
    def __init__(self, df, tt_text_field, tt_label_field, txt_col, lbl_cols, **kwargs):
        # torchtext Field objects
        #Extracting the Text in the torch module
        fields = [('text', tt_text_field)]
        #Extracting all the labels 
        for l in lbl_cols: fields.append(('label', tt_label_field))
        is_test = False if lbl_cols[0] in df.columns else True
        n_labels = len(lbl_cols)
        examples = []
        #Iterating through all the labels
        for idx, row in df.iterrows():
            if not is_test:
                lbls = [ row[l] for l in lbl_cols ]
            else:
                lbls = [0.0] * n_labels
            txt = str(row[txt_col])
            examples.append(data.Example.fromlist([txt]+lbls, fields))
        super().__init__(examples, fields, **kwargs)

    @staticmethod
    def sort_key(example):
        return len(example.text)

    @classmethod
    def splits(cls, text_field, label_field, train_df, txt_col, lbl_cols, val_df=None, test_df=None, **kwargs):
        # build train, val, and test data
        train_data, val_data, test_data = (None, None, None)
        if train_df is not None: 
            train_data = cls(train_df.copy(), text_field, label_field, txt_col, lbl_cols, **kwargs)
        if val_df is not None: 
            val_data = cls(val_df.copy(), text_field, label_field, txt_col, lbl_cols, **kwargs)
        if test_df is not None: 
            test_data = cls(test_df.copy(), text_field, label_field, txt_col, lbl_cols, **kwargs)

        return tuple(d for d in (train_data, val_data, test_data) if d is not None)
#build multilabel dataset from variable "phon_dup"
#Loading the Dataset
X_train_dataset = TextMultiLabelDataset(phon_dup,TEXT,LABEL, ['Word'], ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12','13', '14'])
#load pretrained embeddings to build text vocabulary
embedding = nn.Embedding.from_pretrained(weights)
#Building the Train Dataset with the Vocab
#numericalizes each word
#Marking the Labels with the Vocab
# DataIterator for iterating through the Dataset that we have created.
dataset_iter = data.Iterator(X_train_dataset, 1)
for batch in dataset_iter:

After this block of newly added code I run the code initially provided.

Let me restate my question, in the hope of giving a clearer picture of the advice/help I am seeking.

The code I have provided creates a functional recurrent neural network. It takes word embeddings as input and produces number strings as output, where each number in the tensor reflects a letter needed to spell the word that corresponds to the embedding.

What I would now like to add is what is outlined in blue in the figure above. That is, a hidden/recurrent layer structure identical to that of the current model, but whose input and output are both the tensor “dataset_iter.” This is reflected in the figure as Arpabet in <-> Hidden <-> Recurrent 2 <-> Arpabet out.
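
A minimal sketch of what this second pathway might look like, assuming each Arpabet code is an integer index and that the sequence is fed in with shape (seq_len, batch). The class name `ArpabetRepetition`, the symbol embedding layer, and the sizes `n_symbols=40` and `sym_dim=16` are all hypothetical and only serve the illustration; the recurrent and linear layers mirror the structure of `BiDir`.

```python
import torch
from torch import nn

class ArpabetRepetition(nn.Module):
    """Hypothetical arpabet_in -> hidden -> recurrent -> arpabet_out pathway."""

    def __init__(self, n_symbols, sym_dim, hid_dim, seq_len, rnn_num_layers=2):
        super().__init__()
        # Assumption: every Arpabet code is an integer index below n_symbols.
        self.symbol_embedding = nn.Embedding(n_symbols, sym_dim)
        # Same recurrent structure as the existing model.
        self.rnn = nn.GRU(sym_dim, hid_dim, bidirectional=True,
                          num_layers=rnn_num_layers)
        self.l1 = nn.Linear(hid_dim * 2 * rnn_num_layers, 256)
        # One output value per position of the Arpabet sequence.
        self.l2 = nn.Linear(256, seq_len)

    def forward(self, arpabet_in):
        # arpabet_in: (seq_len, batch) tensor of integer codes
        embedded = self.symbol_embedding(arpabet_in)
        _, last_hidden = self.rnn(embedded)
        # Concatenate the final state of every layer and direction.
        encoded = torch.cat(last_hidden.unbind(0), dim=1)
        encoded = torch.relu(self.l1(encoded))
        return self.l2(encoded)  # (batch, seq_len), raw scores

# Toy usage with made-up sizes: 14 positions per word, batch of 3 words.
model = ArpabetRepetition(n_symbols=40, sym_dim=16, hid_dim=32, seq_len=14)
arpabet_in = torch.randint(0, 40, (14, 3))
arpabet_out = model(arpabet_in)
print(arpabet_out.shape)  # torch.Size([3, 14])
```

Which loss fits the output depends on how the Arpabet targets are encoded, so the sketch returns raw scores and leaves that choice open.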

The second portion of the model that I would like to add is a single hidden layer that connects Arpabet in to Embeddings.
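
A minimal sketch of such a pathway, assuming the Arpabet input is a float vector with 14 entries (one per label column) and the target is a 300-dimensional GloVe vector. The class name `ArpabetToEmbedding`, the hidden size of 256, and the choice of mean squared error as the loss are assumptions for illustration, not part of the existing model; the random tensors stand in for real data.

```python
import torch
from torch import nn

class ArpabetToEmbedding(nn.Module):
    """Hypothetical single-hidden-layer map from Arpabet codes to embeddings."""

    def __init__(self, seq_len=14, hid_dim=256, emb_dim=300):
        super().__init__()
        self.hidden = nn.Linear(seq_len, hid_dim)
        self.out = nn.Linear(hid_dim, emb_dim)

    def forward(self, arpabet_in):
        # arpabet_in: (batch, seq_len) float tensor
        return self.out(torch.relu(self.hidden(arpabet_in)))

model = ArpabetToEmbedding()
arpabet_in = torch.rand(4, 14)   # stand-in for four encoded words
target = torch.randn(4, 300)     # stand-in for the matching GloVe vectors

pred = model(arpabet_in)
# Assumption: regress onto the embedding with mean squared error.
loss = nn.functional.mse_loss(pred, target)
loss.backward()
print(pred.shape)  # torch.Size([4, 300])
```

A cosine-based loss would be another reasonable option if only the direction of the embedding matters.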

I am fully aware that this model achieves tasks that don’t at first seem relevant to a crowd doing computer vision (awesome!) or other really advanced NN modeling. However, my goal in producing this model is to inform a line of clinical research that may aid in improving the treatment and assessment of stroke survivors.

I hope this catches someone’s eye, and I am happy to provide more details/code if I have not provided enough.