Cuboid and Observation Angle

Hi.
I’m writing a class Object to help me work with the [3D SIDOD Dataset](https://research.nvidia.com/publication/2019-06_SIDOD%3A-A-Synthetic).

Each object of the dataset has:

  • class
  • visibility
  • location (3 * 1), representing the translation
  • quaternion (4 * 1)
  • pose transform (3 * 3 rotation matrix + 3 * 1 translation)
  • cuboid_centroid (3 * 1)
  • projected_cuboid_centroid (2 * 1)
  • bounding box (4 * 1)
  • cuboid (8 * 3)
  • projected cuboid (8 * 2)

Training a model to recognize the 9 keypoints of each object (the 8 projected cuboid corners plus the centroid) would take a long time (3 or 4 days on my GPU until convergence), so I would like to use another technique for the 3D task: specifically, the method that trains a model to predict the orientation and the dimensions of the object.

I can easily recover the dimensions of the object from the cuboid (8 * 3).

    # Requires `import numpy as np` at module level.
    def getDimensionFrom3D(self):
        ''' Return [l, w, h] as the lengths of three cuboid edges.

            corners: (8,3) no assumption on axis direction
                1 -------- 0
               /|         /|
              2 -------- 3 .
              | |        | |
              . 5 -------- 4
              |/         |/
              6 -------- 7
         '''

        # Each dimension is the Euclidean distance between two adjacent corners.
        l = np.linalg.norm(self.cuboid[0, :] - self.cuboid[1, :])  # edge 0-1
        w = np.linalg.norm(self.cuboid[1, :] - self.cuboid[2, :])  # edge 1-2
        h = np.linalg.norm(self.cuboid[0, :] - self.cuboid[4, :])  # edge 0-4

        return [l, w, h]

However, I’m struggling to understand how to recover the observation angle from the information I have (the same angle that the KITTI dataset provides).
Can anyone help me?

Do you have information on the camera calibration intrinsics? If so, you can project the points in the image space into 3D space relative to the camera and calculate the viewing angle there using trigonometry. The OpenCV library should have most of what you need to do this.
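
To make the trigonometry concrete, here is a minimal sketch of one way to get a KITTI-style observation angle, using the KITTI definition alpha = rotation_y - arctan2(x, z). It rests on assumptions that should be checked against SIDOD before use: that the object pose is expressed in a KITTI-like camera frame (x right, y down, z forward), and that the first column of the rotation matrix is the object's heading axis. The function names and the intrinsics parameters `fx` and `cx` are hypothetical, not part of the dataset or of any library.

```python
import numpy as np


def ray_angle_from_location(location):
    """Angle of the camera-to-object ray in the x-z plane, in radians.

    Assumes a KITTI-like camera frame: x right, y down, z forward.
    """
    x, _, z = location
    return np.arctan2(x, z)


def ray_angle_from_pixel(u, fx, cx):
    """Same ray angle, computed from the projected centroid column `u`.

    `fx` (focal length in pixels) and `cx` (principal point column)
    are assumed to come from the camera intrinsics.
    """
    return np.arctan2(u - cx, fx)


def yaw_from_rotation(R):
    """Yaw about the camera y-axis.

    Assumes the first column of R is the object's heading axis
    expressed in the camera frame.
    """
    forward = R[:, 0]
    return np.arctan2(-forward[2], forward[0])


def observation_angle(R, location):
    """KITTI-style alpha = rotation_y - ray angle, wrapped to [-pi, pi)."""
    alpha = yaw_from_rotation(R) - ray_angle_from_location(location)
    return (alpha + np.pi) % (2 * np.pi) - np.pi
```

If the 3D location is available, as it is here, the intrinsics are only needed for the pixel-based variant. If SIDOD uses a different axis convention, the axes picked in `yaw_from_rotation` and `ray_angle_from_location` would need to be swapped accordingly.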