Extending a Faster R-CNN model to use image + metadata input

I want to extend Faster R-CNN to use both image and metadata input (currently I have 3 metadata features per image). Each item of my VisionDataset is a tuple of (image, metadata) together with a tensor target, i.e. ((tensor_image, torch_meta), torch_target).
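For illustration, a dataset with this layout could be sketched roughly as below. The names `ImageMetaDataset` and `collate_image_meta` and the random placeholder data are hypothetical. The dict target with "boxes" and "labels" is an assumption based on what torchvision detection models expect during training, and is not necessarily the target format used here.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ImageMetaDataset(Dataset):
    """Hypothetical dataset yielding ((image, meta), target) items."""

    def __init__(self, num_samples=4, num_meta=3):
        self.num_samples = num_samples
        self.num_meta = num_meta

    def __len__(self):
        return self.num_samples

    def __getitem__(self, idx):
        image = torch.rand(3, 64, 64)     # placeholder image, C x H x W
        meta = torch.rand(self.num_meta)  # placeholder metadata features
        # Assumed target format: one box in (x1, y1, x2, y2) plus its label.
        target = {
            "boxes": torch.tensor([[8.0, 8.0, 32.0, 32.0]]),
            "labels": torch.tensor([1]),
        }
        return (image, meta), target


def collate_image_meta(batch):
    """Keep images and targets as lists; stack metadata into a (B, num_meta) tensor."""
    inputs, targets = zip(*batch)
    images, metas = zip(*inputs)
    return list(images), torch.stack(metas), list(targets)


loader = DataLoader(ImageMetaDataset(), batch_size=2, collate_fn=collate_image_meta)
images, metas, targets = next(iter(loader))
```

Images and targets stay as lists because detection batches usually hold images of different sizes and different numbers of boxes. The metadata has a fixed length per image, so it can be stacked into a single tensor.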

I define the model as:
import torchvision.models.detection as models
# Load the pretrained detector, then replace the box predictor for my classes
model = models.fasterrcnn_mobilenet_v3_large_320_fpn(weights="DEFAULT")
model.roi_heads.box_predictor = models.faster_rcnn.FastRCNNPredictor(model.roi_heads.box_predictor.cls_score.in_features, len(classes))

I tried adding a sequential module for the metadata and an output module using:
import collections
import torch
# MetaNet: maps the metadata features to a 250-dimensional embedding
model.add_module("MetaNet", torch.nn.Sequential(collections.OrderedDict([
    ("lin1", torch.nn.Linear(len(meta_features), 500)),
    ("batch1", torch.nn.BatchNorm1d(500)),
    ("relu1", torch.nn.ReLU()),
    ("dropout1", torch.nn.Dropout(p=0.2)),
    ("lin2", torch.nn.Linear(500, 250)),
    ("batch2", torch.nn.BatchNorm1d(250)),
    ("relu2", torch.nn.ReLU()),
    ("dropout2", torch.nn.Dropout(p=0.2))])))
# OutNet: takes 4 box values concatenated with the 250 metadata features
model.add_module("OutNet", torch.nn.Linear(4+250, 4))

So now, apart from the standard Faster R-CNN modules, I get (at the end of the printed model):

(MetaNet): Sequential(
  (lin1): Linear(in_features=3, out_features=500, bias=True)
  (batch1): BatchNorm1d(500, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu1): ReLU()
  (dropout1): Dropout(p=0.2, inplace=False)
  (lin2): Linear(in_features=500, out_features=250, bias=True)
  (batch2): BatchNorm1d(250, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu2): ReLU()
  (dropout2): Dropout(p=0.2, inplace=False)
)
(OutNet): Linear(in_features=254, out_features=4, bias=True)

I want to concatenate the output of the Faster R-CNN with the output of the Sequential model that processes the metadata. How should I then update forward (and backward) so that my classifier is trained using the metadata?
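The concatenation step itself could look roughly like the sketch below. It is written as a standalone module rather than a patch to torchvision. The class `MetaFusionHead`, its argument names, and the random stand-in boxes are all hypothetical. It assumes per-image boxes of shape (N_i, 4) are already available. Note that torchvision detection models return a dict of losses in training mode and a list of detections only in eval mode, so feeding real boxes into such a head during training would likely need changes inside `roi_heads`, which this sketch does not attempt. No custom backward is needed: autograd derives it from whatever `forward` computes.

```python
import collections

import torch
from torch import nn


class MetaFusionHead(nn.Module):
    """Hypothetical head that fuses per-box values with per-image metadata."""

    def __init__(self, num_meta=3, box_dim=4, hidden=250):
        super().__init__()
        # Same layer layout as the MetaNet defined in the question.
        self.meta_net = nn.Sequential(collections.OrderedDict([
            ("lin1", nn.Linear(num_meta, 500)),
            ("batch1", nn.BatchNorm1d(500)),
            ("relu1", nn.ReLU()),
            ("dropout1", nn.Dropout(p=0.2)),
            ("lin2", nn.Linear(500, hidden)),
            ("batch2", nn.BatchNorm1d(hidden)),
            ("relu2", nn.ReLU()),
            ("dropout2", nn.Dropout(p=0.2)),
        ]))
        self.out_net = nn.Linear(box_dim + hidden, box_dim)

    def forward(self, boxes_per_image, meta):
        # boxes_per_image: list of B tensors, each of shape (N_i, 4)
        # meta: tensor of shape (B, num_meta)
        meta_emb = self.meta_net(meta)  # (B, hidden)
        counts = torch.tensor(
            [b.shape[0] for b in boxes_per_image], device=meta.device
        )
        # Repeat each image's embedding once per box of that image.
        meta_per_box = torch.repeat_interleave(meta_emb, counts, dim=0)
        all_boxes = torch.cat(boxes_per_image, dim=0)        # (sum N_i, 4)
        fused = torch.cat([all_boxes, meta_per_box], dim=1)  # (sum N_i, 4 + hidden)
        return self.out_net(fused)


head = MetaFusionHead()
boxes = [torch.rand(5, 4), torch.rand(3, 4)]  # stand-in boxes for 2 images
meta = torch.rand(2, 3)                       # 3 metadata features per image
refined = head(boxes, meta)
loss = refined.pow(2).mean()                  # placeholder loss, illustration only
loss.backward()                               # autograd fills in the gradients
```

Because `BatchNorm1d` needs more than one sample in training mode, the metadata batch must contain at least two images while training.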