Estimate bounding box coordinates from pictures containing data and noise

A newbie question: I have just started learning PyTorch. The left side of the picture is generated by electrical devices. To the human eye, two bounding boxes correspond to the actual data; that part of the picture has higher altitude and is denser. The noise, in comparison, corresponds to the part with lower altitude.

I would like to extract the coordinates of the two bounding boxes. Is this doable in PyTorch? Any code pointers or similar examples would be much appreciated. Thanks a lot!
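
To make the question concrete, here is a minimal, non-learning baseline sketch of what I have in mind, not a definitive solution. It assumes the picture is available as a 2D float tensor whose pixel values encode the altitude, and that the data regions can be separated from noise by a value threshold plus a local density check. The function name `estimate_boxes` and all thresholds (`value_thresh`, `density_thresh`, `window`, `min_area`) are hypothetical and would need tuning on real pictures.

```python
import torch
import torch.nn.functional as F


def estimate_boxes(img, value_thresh, density_thresh=0.5, window=5, min_area=20):
    """Return a list of (x_min, y_min, x_max, y_max) boxes for a 2D tensor.

    Assumption: data pixels are higher-valued and denser than noise pixels.
    `window` should be odd so the density map keeps the input shape.
    """
    # 1. Keep only high-valued pixels.
    high = (img > value_thresh).float()

    # 2. Local density: fraction of high pixels in a window x window patch.
    density = F.avg_pool2d(high[None, None], window, stride=1,
                           padding=window // 2)[0, 0]
    mask = density > density_thresh

    # 3. Connected components via label propagation: every masked pixel
    #    starts with a unique label and repeatedly takes the largest label
    #    in its 3x3 neighbourhood until nothing changes.
    #    (float32 labels are exact only up to about 16 million pixels.)
    h, w = mask.shape
    labels = torch.arange(1, h * w + 1, dtype=torch.float32).reshape(h, w) * mask
    while True:
        spread = F.max_pool2d(labels[None, None], 3, stride=1,
                              padding=1)[0, 0] * mask
        if torch.equal(spread, labels):
            break
        labels = spread

    # 4. One bounding box per sufficiently large component.
    boxes = []
    for lab in labels.unique():
        if lab == 0:  # background
            continue
        ys, xs = torch.nonzero(labels == lab, as_tuple=True)
        if ys.numel() < min_area:
            continue
        boxes.append((xs.min().item(), ys.min().item(),
                      xs.max().item(), ys.max().item()))
    return boxes


# Synthetic stand-in for the real picture: low-valued noise plus two
# high-valued rectangular regions (made-up data for illustration only).
torch.manual_seed(0)
img = torch.rand(40, 60) * 0.3
img[5:15, 5:20] = 1.0
img[20:35, 35:55] = 1.0
boxes = estimate_boxes(img, value_thresh=0.5)
print(boxes)
```

If a fixed threshold like this turns out to be too brittle on the real pictures, I assume the next step would be a trained detection model, but I would like to know whether this simpler route is reasonable first.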