Hi,
I’m working on an end-to-end robotic grasp planning method where the inputs to the network are an image and a grasp G (defined as a vector x, y, z), and the output is a value v in the range 0-1 that indicates how good the corresponding grasp is. The problem I’m facing is how to create a custom data-set that handles the case where one image has multiple possible grasps, each with its own output v.
For instance, during training the network should take only one image and one grasp at a time, but still be able to train on all, or at least a subset, of the grasps for that image. If I only load images as batches, however, I have no control over how many of the grasps corresponding to each image are loaded.
One option might be to create two separate data-sets, one for the image and one for the grasps and outputs, and load the data from these. I do, however, hope for a more elegant solution where I only need one custom data-set.
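To make the single data-set idea concrete, here is a rough sketch of the kind of thing I have in mind. It assumes a framework that uses map-style data-sets, i.e. objects exposing `__len__` and `__getitem__` (for example PyTorch's `torch.utils.data.Dataset`), and it indexes (image, grasp) pairs rather than images. The class name `GraspPairDataset`, its arguments, and the placeholder strings standing in for images are all hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate


class GraspPairDataset:
    """Map-style data-set where one item is (image, grasp, v), not one image."""

    def __init__(self, images, grasps_per_image):
        # images: list of image objects (placeholders in this sketch)
        # grasps_per_image[i]: list of ((x, y, z), v) pairs belonging to images[i]
        assert len(images) == len(grasps_per_image)
        self.images = images
        self.grasps_per_image = grasps_per_image
        # cumulative[i] = total number of grasps in images 0..i
        self.cumulative = list(accumulate(len(g) for g in grasps_per_image))

    def __len__(self):
        # Total number of (image, grasp) pairs across all images
        return self.cumulative[-1] if self.cumulative else 0

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        # Find the image whose block of grasps contains this flat index
        image_idx = bisect_right(self.cumulative, index)
        first = self.cumulative[image_idx - 1] if image_idx > 0 else 0
        grasp, v = self.grasps_per_image[image_idx][index - first]
        return self.images[image_idx], grasp, v


# Toy data: the first image has two grasps, the second has one
images = ["img_a", "img_b"]
grasps = [
    [((0.1, 0.2, 0.3), 0.9), ((0.4, 0.5, 0.6), 0.2)],
    [((0.7, 0.8, 0.9), 0.5)],
]
dataset = GraspPairDataset(images, grasps)
```

With this layout, a batch would be a set of (image, grasp, v) triples, so the same image can appear several times with different grasps. Training on only a subset of the grasps per image would then be a matter of which indices get sampled, not of how the data is stored.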