How to find 3D coordinates from the 2D coordinates in video feeds from two different views?

I have used deep learning to predict the 2D keypoints of a deformable object in two video feeds, each showing a different view of the object.

How can I retrieve the 3D keypoints if I have 2D keypoints for each corresponding frame? Is there a framework that already does that? Looking for some expert advice here.

This sounds like a general triangulation use case. If I'm not mistaken, OpenCV should have some algorithms for it, but I'm unsure what their requirements are (besides calibrated cameras).
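
To make the triangulation idea concrete, here is a minimal sketch of linear (DLT) triangulation for one keypoint seen in two views, written with plain NumPy. It assumes both cameras are calibrated, so that a 3x4 projection matrix is known for each, and that the keypoints have already been matched across views and frames. The intrinsics `K`, the 0.5 unit baseline, and the helper names `triangulate_dlt` and `project` are made up for illustration and are not from the original post. OpenCV's `cv2.triangulatePoints` does the same job in batch: it takes the two 3x4 projection matrices plus 2xN point arrays and returns 4xN homogeneous points.

```python
import numpy as np

def triangulate_dlt(P1, P2, x1, x2):
    """Triangulate one 3D point from two views with the linear DLT method.

    P1, P2: 3x4 camera projection matrices (from calibration).
    x1, x2: (u, v) pixel coordinates of the same keypoint in each view.
    """
    # Each view contributes two linear constraints on the homogeneous point X.
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The least-squares solution is the right singular vector
    # belonging to the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]  # homogeneous -> Euclidean

def project(P, X):
    """Project a 3D point to pixel coordinates (used here to fake detections)."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Hypothetical calibration: shared intrinsics, second camera shifted along x.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.5], [0.0], [0.0]])])

# Synthetic check: project a known point, then recover it from the two views.
X_true = np.array([0.2, -0.1, 4.0])
x1 = project(P1, X_true)
x2 = project(P2, X_true)
X_est = triangulate_dlt(P1, P2, x1, x2)
print(X_est)
```

For video, the same call would be repeated per keypoint and per frame pair. With noisy predicted keypoints the result is only a least-squares estimate, so the recovered 3D points will not reproject exactly onto the detections.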
