Which NLP model to use to handle long context?

I’m trying to process product data for an e-commerce platform. The goal is to understand products’ size.

Here are some examples of how messy the product dimension descriptions are:

Overall Dimensions: 66 in W x 41 in D x 36 in H
Overall: 59 in W x 28.75 in D x 30.75 in H
92w 37d 32h",
86.6 in W x 33.9 in D x 24 in H
W: 95.75\" D: 36.5\" H: 28.75\"",
W: 96\" D: 39.25\" H: 32\"",
"118\"W x 35\"D x 33\"T.",
"28 L x 95 W x 41 H"
"95\" W x 26.5\" H x 34.75\" D"
"98\"W x 39\"D x 29\"H"
"28\" High x 80\" Wide x 32\" Deep"

When the dimension description is short (< 60 characters), this is a solved problem for me: I trained a two-layer bidirectional LSTM that handles the task perfectly.

The problem is that the dimension description is usually embedded in a much longer text (it is part of the full product description). How can I extract the useful information from that long context and parse it? My LSTM only accepts a context size of 60 (the general idea is the same as https://towardsdatascience.com/addressnet-how-to-build-a-robust-street-address-parser-using-a-recurrent-neural-network-518d97b9aebd).

What language model is more suitable for this?

Just a suggestion: I don’t think your data looks that messy. Writing one or more regexes that extract every dimension correctly might be a challenge, but it shouldn’t be too difficult to come up with a regex that finds a relevant substring of size <60 in a longer text, which you can then pass to your existing LSTM.

The regex doesn’t even have to be that specific, since it’s just a preprocessing step. For example: find all substrings containing 3 numbers, include the 3 words before and after each substring, and check that the result is under 60 tokens. That should do it.
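A minimal sketch of that preprocessing step, under a few assumptions of mine: the names `THREE_NUMBERS` and `candidate_spans` are made up for illustration, "3 numbers" is taken to mean three numbers separated by at most 15 non-digit characters, and the length limit is measured in characters to match the 60-character LSTM input mentioned in the question. If the span with 3 words of context is too long, the sketch shrinks the context instead of discarding the match.

```python
import re

NUMBER = r"\d+(?:\.\d+)?"
# Three numbers separated by short runs of non-digit text (units, "x", letters).
# The 15-character gap is an arbitrary assumption, not a tuned value.
THREE_NUMBERS = re.compile(rf"{NUMBER}(?:\D{{1,15}}{NUMBER}){{2}}")


def candidate_spans(text, context_words=3, max_chars=60):
    """Return substrings holding 3 numbers plus up to `context_words` words
    of context on each side, each at most `max_chars` characters long."""
    spans = []
    for m in THREE_NUMBERS.finditer(text):
        before = text[:m.start()].split()
        after = text[m.end():].split()
        # Try the widest context first, then shrink until the span fits.
        for k in range(context_words, -1, -1):
            left = before[max(0, len(before) - k):]
            span = " ".join(left + [m.group()] + after[:k])
            if len(span) <= max_chars:
                spans.append(span)
                break
    return spans


description = (
    "This sofa is very comfy. Overall: 59 in W x 28.75 in D x 30.75 in H. "
    "Ships in two boxes."
)
for span in candidate_spans(description):
    print(span)
```

Each returned span could then be fed to the existing short-context model. Matches whose three-number core alone exceeds the limit are silently skipped in this sketch.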

I only showed the tip of the iceberg. Trust me, it is very, very messy.

Just to give you an example where the 3 numbers idea will fail:

Seat height: 32, arm height: 12, back height: 50, seat length: 72, depth: 89

This could be the dimensions of a sofa.
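To make the failure concrete, here is a small sketch (the function names are hypothetical, and the label pattern is only an assumption that fits this one example): taking the first three numbers picks up three heights and drops the length and depth, while the labels are what actually carry the meaning.

```python
import re

NUMBER = re.compile(r"\d+(?:\.\d+)?")


def first_three_numbers(text):
    """What a naive "3 numbers" window would pick up."""
    return NUMBER.findall(text)[:3]


def labeled_values(text):
    """Pair each 'label: number' occurrence; only illustrates that labels matter."""
    pairs = re.findall(r"([A-Za-z ]+):\s*(\d+(?:\.\d+)?)", text)
    return {label.strip().lower(): float(value) for label, value in pairs}


sofa = "Seat height: 32, arm height: 12, back height: 50, seat length: 72, depth: 89"
print(first_three_numbers(sofa))  # three heights, no length or depth
print(labeled_values(sofa))
```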

Plus, for learning purposes, I still want to know how to handle long context with an NLP model.