Hi.
I’m writing an `Object` class to help me work with the [3D SIDOD Dataset](https://research.nvidia.com/publication/2019-06_SIDOD%3A-A-Synthetic).
Each object of the dataset has:
- class
- visibility
- location (3 × 1), representing the translation
- quaternion (4 × 1)
- pose transform (3 × 3 rotation matrix + 3 × 1 translation)
- cuboid_centroid (3 × 1)
- projected_cuboid_centroid (2 × 1)
- bounding box (4 × 1)
- cuboid (8 × 3)
- projected cuboid (8 × 2)
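To make the layout concrete, here is a minimal sketch of how I hold these fields in Python. The field names are my own choices, not necessarily the dataset's JSON keys:

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class SidodObject:
    """Hypothetical container mirroring the fields listed above.

    Field names and the exact pose_transform shape (4x4 homogeneous
    matrix combining the 3x3 rotation and 3x1 translation) are
    assumptions on my part.
    """
    cls: str                               # object class
    visibility: float                      # visibility score
    location: np.ndarray                   # (3,) translation
    quaternion: np.ndarray                 # (4,) orientation
    pose_transform: np.ndarray             # (4, 4) rotation + translation
    cuboid_centroid: np.ndarray            # (3,)
    projected_cuboid_centroid: np.ndarray  # (2,)
    bounding_box: np.ndarray               # (4,)
    cuboid: np.ndarray                     # (8, 3) 3D corners
    projected_cuboid: np.ndarray           # (8, 2) 2D projections
```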
As training a model to recognize the 9 keypoints of each object (the 8 projected corners plus the centroid) would take a long time to converge on my GPU (3 or 4 days), I would like to use another technique for the 3D task: training a model to predict the orientation and the dimensions of the object.
I can easily recover the dimensions of the object from the cuboid (8 × 3):
```python
import numpy as np

def getDimensionFrom3D(self):
    '''corners: (8, 3), no assumption on axis direction

        1 -------- 0
       /|         /|
      2 -------- 3 .
      | |        | |
      . 5 -------- 4
      |/         |/
      6 -------- 7
    '''
    # edge lengths between adjacent corners give the box dimensions
    l = np.linalg.norm(self.cuboid[0, :] - self.cuboid[1, :])
    w = np.linalg.norm(self.cuboid[1, :] - self.cuboid[2, :])
    h = np.linalg.norm(self.cuboid[0, :] - self.cuboid[4, :])
    return [l, w, h]
```
But I’m struggling to understand how I can recover the observation angle (the same one the KITTI dataset provides) from the information I have.
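To show what I mean: KITTI defines the observation angle alpha as the rotation around the camera’s vertical axis minus the viewing-ray angle `arctan2(x, z)`. Here is my attempt, assuming the quaternion is stored as (w, x, y, z) and expresses the object’s rotation in camera coordinates (x right, y down, z forward); both orderings and the frame are assumptions I haven’t verified against the dataset:

```python
import numpy as np

def quat_to_rot(q):
    """Quaternion (w, x, y, z) -> 3x3 rotation matrix.

    The (w, x, y, z) ordering is an assumption; SIDOD may store
    (x, y, z, w) instead, so this needs checking.
    """
    w, x, y, z = q
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

def observation_angle(q, location):
    """KITTI-style alpha = rotation_y - ray angle, wrapped to [-pi, pi].

    Assumes a camera frame with x right, y down, z forward, so the
    yaw is the rotation about the camera y axis.
    """
    R = quat_to_rot(q)
    # for a pure rotation about y: R[0, 0] = cos(ry), R[0, 2] = sin(ry)
    ry = np.arctan2(R[0, 2], R[0, 0])
    # angle of the ray from the camera to the object center
    ray = np.arctan2(location[0], location[2])
    alpha = ry - ray
    # wrap into [-pi, pi]
    return (alpha + np.pi) % (2 * np.pi) - np.pi
```

With the identity quaternion and an object straight ahead, this gives alpha = 0, which matches what I would expect, but I’m not sure the frame assumptions hold for SIDOD.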
Can anyone help me?