Hello!
I wish to understand which lines and vertices in different 2D orthographic views of a 3D object correspond to each other. This information would also later be used to construct a 3D model from the 2D orthographic views.
Blue shows matched edges/lines. Orange shows matched nodes/vertices. Circular objects seem especially difficult.
So far, a graph neural network seems like a sensible way to solve this task. My initial ideas for the structure and features (the more general, more certain ones) are as follows:
- Each vertex is a node in the graph
- Node feature vector would include the x-y coordinates relative to the view
- Each line on the drawing is an edge between nodes in the graph
- Edge feature vector would include:
  - Edge type (in addition to straight lines there are also circles and arcs)
  - Edge length
  - Whether dimension text is defined next to the edge (this is a property specific to mechanical engineering drawings; it matters because equivalent edges in a mechanical engineering drawing should have their length defined only once)
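A minimal sketch of the per-view encoding described above, in plain Python. The input format (a list of `(x, y)` vertices plus edge dicts with `src`, `dst`, `type` and `has_dimension` keys), the names `ViewGraph` and `build_view_graph`, and the one-hot edge-type encoding are assumptions made for illustration only. Full circles have no natural endpoints and are not handled here.

```python
import math
from dataclasses import dataclass

# Assumed set of edge types; the one-hot order follows this tuple.
EDGE_TYPES = ("line", "arc", "circle")


@dataclass
class ViewGraph:
    node_xy: list     # per-node [x, y], relative to the view
    edge_index: list  # (src, dst) pairs, both directions stored
    edge_attr: list   # per-edge feature vector, aligned with edge_index


def build_view_graph(vertices, edges):
    """Encode one orthographic view as a graph (hypothetical input format)."""
    edge_index, edge_attr = [], []
    for e in edges:
        (x0, y0), (x1, y1) = vertices[e["src"]], vertices[e["dst"]]
        one_hot = [1.0 if e["type"] == t else 0.0 for t in EDGE_TYPES]
        # Straight-line distance between endpoints; for arcs this is the
        # chord, so a true arc length would need the radius as well.
        length = math.hypot(x1 - x0, y1 - y0)
        has_dim = 1.0 if e["has_dimension"] else 0.0
        feat = one_hot + [length, has_dim]
        # Drawing lines are undirected, so add the edge in both directions.
        for s, d in ((e["src"], e["dst"]), (e["dst"], e["src"])):
            edge_index.append((s, d))
            edge_attr.append(feat)
    return ViewGraph([list(v) for v in vertices], edge_index, edge_attr)


# Usage: a 4 x 3 rectangle where two of the sides carry dimension text.
vertices = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
edges = [
    {"src": 0, "dst": 1, "type": "line", "has_dimension": True},
    {"src": 1, "dst": 2, "type": "line", "has_dimension": True},
    {"src": 2, "dst": 3, "type": "line", "has_dimension": False},
    {"src": 3, "dst": 0, "type": "line", "has_dimension": False},
]
g = build_view_graph(vertices, edges)
```

The three lists map directly onto the node-feature, edge-index and edge-feature inputs that most GNN libraries expect, so converting them to tensors should be straightforward.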
Do you have any suggestions for the following:
- What network architecture(s) would be worth a try?
- Should a hierarchical graph structure (and GNN) be used?
  - A hypernode representing the entire view, which is connected to all other nodes in the view
  - A global node connected to all hypernodes, in order to capture the relation between different views
  - See this image from a blog to understand what I mean: https://distill.pub/2021/gnn-intro/multigraphs.1bb84306.png
- Any thoughts about other relevant edge, node and potentially global features?
- How would you define this task? Is it link prediction, node classification, graph matching, or something else?
  - This task can probably be approached in many different ways; which seems most logical to you?
- Mechanical engineering drawings often also contain an isometric view. Could this be relevant somehow?
  - A solution that depends entirely on the isometric view would not work for drawings that lack one. It could still be relevant if it works with high accuracy or does not require too much “side-tracking”.
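To make the hierarchical structure from the second question concrete, here is a rough sketch of the extra edges it would add. It assumes the nodes of all views are numbered consecutively in one combined graph, with the hypernodes and the global node appended at the end. The function name `hierarchy_edges` and this numbering scheme are illustrative choices, not an established convention.

```python
def hierarchy_edges(view_sizes):
    """Return hypernode ids, the global node id, and the extra edges.

    view_sizes: number of ordinary nodes in each view, in order.
    Ordinary nodes are assumed to occupy ids 0 .. sum(view_sizes) - 1.
    """
    total = sum(view_sizes)
    hyper_ids = [total + v for v in range(len(view_sizes))]
    global_id = total + len(view_sizes)

    extra_edges = []
    offset = 0
    for hyper, n in zip(hyper_ids, view_sizes):
        # Connect every node of this view to the view's hypernode.
        for node in range(offset, offset + n):
            extra_edges.append((node, hyper))
            extra_edges.append((hyper, node))
        # Connect the hypernode to the single global node.
        extra_edges.append((hyper, global_id))
        extra_edges.append((global_id, hyper))
        offset += n
    return hyper_ids, global_id, extra_edges


# Usage: three views (e.g. front, top, side) with 4, 6 and 4 vertices.
hyper_ids, global_id, extra_edges = hierarchy_edges([4, 6, 4])
```

With this layout, information can only travel between views through the hypernodes and the global node, which is the bottleneck the question is asking about.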
Feel free to ask any additional questions or engage in discussion (some more uncertain ideas left out to not cause unnecessary confusion / make the post too long).
Thanks for any help!