JSON list of variable size

Hi! I’ve recently delved into the world of ML. I’m sure this must be a very simple or foolish question, but I’ve not found a good answer to it. :’(

I have a piece of code that generates JSON representing maps for a game.

  [
    { "x1": 0, "y1": 0, "x2": 10, "y2": 3, "type": "GRASS" },
    { "x1": 3, "y1": 2, "x2": 8, "y2": 3, "type": "ROCK" },
    { "x1": 0, "y1": 0, "x2": 10, "y2": 3, "type": "WATER" }
  ]

Each map representation is a list of elements, each with its own bounding box and the type of element that occupies that area. The map is not tile-based, so I cannot use a 2-D matrix of type values as a representation, and the number of elements in a map’s JSON is arbitrary (although there is an upper limit to it). I have a ton of these JSONs and I’ve manually classified them into groups based on similarity.

I was wondering if it was possible to devise an ML model that would be capable of processing this sort of input data and computing similarity scores between two maps. Most tutorials I’ve gone through either talk about simple tensors (XOR) or images, with nothing in between.

As a last resort, I can use the JSON to generate an image of the map itself, where the type might be encoded as a different color or texture. But I think (please correct me if I’m wrong) that the JSON represents the map more simply, and that a NN trained on the JSON might perform better than one trained on generated images. With the JSON, the NN knows there are only 5 parameters per element to deal with. With an image, on the other hand, does it consider the shape of the tiles, the color, the texture, the positioning, the number of corners, or a whole host of other parameters?

I believe a NN trained on JSON should perform better than one trained on images. But is there a way to transform this JSON into a format that is compatible with a NN? The main problem I’m facing is that even though each element in the JSON has exactly 5 fields, the number of elements is not constant.

I can try padding, but that would impose a rigid upper limit and also force the input layer to have way more neurons than would be required in most cases.
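For concreteness, here is how I picture the padding approach (a rough sketch, not tested on real data — the upper limit of 50 and the type-to-integer mapping are arbitrary choices of mine):

```python
# Sketch: encode each element as a 5-value row [x1, y1, x2, y2, type_id]
# and pad every map to a fixed number of slots. Type id 0 is reserved
# for the padding slots.
import torch

TYPE_IDS = {'UNKNOWN': 0, 'GRASS': 1, 'ROCK': 2, 'WATER': 3}
MAX_ELEMENTS = 50  # assumed upper limit on elements per map

def encode_map(elements, max_elements=MAX_ELEMENTS):
    rows = [[e['x1'], e['y1'], e['x2'], e['y2'], TYPE_IDS[e['type']]]
            for e in elements]
    # pad the remaining slots with all-zero rows
    rows += [[0, 0, 0, 0, TYPE_IDS['UNKNOWN']]] * (max_elements - len(rows))
    return torch.tensor(rows, dtype=torch.float32)  # shape: (max_elements, 5)

x = encode_map([
    {'x1': 0, 'y1': 0, 'x2': 10, 'y2': 3, 'type': 'GRASS'},
    {'x1': 3, 'y1': 2, 'x2': 8, 'y2': 3, 'type': 'ROCK'},
])
print(x.shape)  # torch.Size([50, 5])
```

The tensor could then be flattened to 50 × 5 = 250 inputs, which is exactly the rigid input layer I was worried about.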

Newbie here. Please be gentle. :slight_smile:


I’m not fully understanding the use case and don’t know what the target would be.
Given the x and y values, would you like to train a model to predict the type?
If so, wouldn’t each row be a new sample and you thus wouldn’t need any padding?

Thank you for responding @ptrblck :slight_smile: . I have labelled my maps like so:

  {
    "maps-with-water": [ mapJson1, mapJson2, mapJson3 ],
    "cave-maps": [ mapJson4, mapJson5, mapJson6 ],
    "desert-maps": [ mapJson7, mapJson8 ]
  }

Each mapJson consists of:

  {
    "elements": [
      { "x1": ..., "y1": ..., "x2": ..., "y2": ..., "type": ... },
      { "x1": ..., "y1": ..., "x2": ..., "y2": ..., "type": ... },
      { "x1": ..., "y1": ..., "x2": ..., "y2": ..., "type": ... }
    ]
  }

I want my model to accept a whole mapJson with all its elements and predict the class it belongs to. So if I test it with mapJson4, it should predict cave-maps.
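In other words, I imagine turning the labelled groups into (mapJson, class-index) pairs, something like this (the mapJson values here are empty placeholders, just to show the shape of the data):

```python
# Sketch: map the group names from above to integer class labels and
# flatten the labelled dict into (mapJson, label) training pairs.
labelled = {
    'maps-with-water': [{'elements': []}, {'elements': []}],
    'cave-maps': [{'elements': []}],
    'desert-maps': [{'elements': []}],
}
class_to_idx = {name: i for i, name in enumerate(labelled)}  # e.g. 'cave-maps' -> 1
samples = [(map_json, class_to_idx[name])
           for name, maps in labelled.items()
           for map_json in maps]
# samples now holds (mapJson, label) pairs, ready for a Dataset/DataLoader
```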

I can think of two workarounds at the moment, neither of which is pretty:

  1. I can pad each mapJson so that it always has (say) 50 element slots; if a particular mapJson has only 3 elements, I’ll fill the remaining 47 slots with something like { x1: 0, y1: 0, x2: 0, y2: 0, type: 'UNKNOWN' }.
  2. I can convert a mapJson into an image where the entire image is black except for the areas covered by elements, colored according to the element type (green for grass, brown for rock, blue for water, etc.). The image can then be up/down-sampled to fit the input layer of a NN, and there is no need to feed a variable-length JSON to the model.
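A minimal sketch of workaround 2 with NumPy, assuming integer map coordinates that fit on a fixed canvas (the canvas size and the RGB colors are arbitrary choices of mine):

```python
# Sketch: rasterize a map's elements onto a black RGB canvas.
# Later elements paint over earlier ones, matching their order in the list.
import numpy as np

COLORS = {'GRASS': (0, 255, 0), 'ROCK': (139, 69, 19), 'WATER': (0, 0, 255)}

def render_map(elements, width=64, height=64):
    img = np.zeros((height, width, 3), dtype=np.uint8)  # black background
    for e in elements:
        img[e['y1']:e['y2'], e['x1']:e['x2']] = COLORS[e['type']]
    return img

img = render_map([
    {'x1': 0, 'y1': 0, 'x2': 10, 'y2': 3, 'type': 'GRASS'},
    {'x1': 3, 'y1': 2, 'x2': 8, 'y2': 3, 'type': 'ROCK'},
])
```

The resulting array could be fed straight to a CNN, which sidesteps the variable-length problem entirely.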

I hope that clarifies what I’m after. :). I don’t know if this is possible, but I’d appreciate a push in the right direction on how to approach such a problem.

Thanks for clarifying the use case as it sounds interesting.
I wouldn’t know which approach might work best as both seem valid. Let’s see if others have some experience with similar use cases. :slight_smile: