I am trying to follow the approach of a paper whose authors
- use librosa to create a mel spectrogram plot, and
- use this picture as input for their PyTorch pipeline.
Nowadays torchaudio can create the data underlying the plot, and matplotlib.imshow()
generates the picture I would like to use as the input (here you can see my related question with a plot).
I can generate a greyscale image from the data using
def scale_minmax(X, XMIN, XMAX, min=0.0, max=1.0):
    X_std = (X - XMIN) / (XMAX - XMIN)
    X_scaled = X_std * (max - min) + min
    return X_scaled
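For context, this is roughly how I apply that scaling to get 8-bit grey values (a sketch; `mels` is a random stand-in for the actual mel spectrogram in dB):

```python
import numpy as np

def scale_minmax(X, XMIN, XMAX, min=0.0, max=1.0):
    X_std = (X - XMIN) / (XMAX - XMIN)
    X_scaled = X_std * (max - min) + min
    return X_scaled

# hypothetical mel spectrogram in dB (stand-in for librosa/torchaudio output)
mels = np.random.uniform(-80.0, 0.0, size=(128, 256))

# scale to 0..255 and quantize to 8-bit grey values
img = scale_minmax(mels, mels.min(), mels.max(), 0, 255).astype(np.uint8)
img = np.flip(img, axis=0)  # put low frequencies at the bottom of the image
print(img.shape, img.min(), img.max())
```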
But I have the impression that I am losing some detail here, as I cannot reproduce the results of the paper.
So I would like to follow in the paper's footsteps, process the colour image instead, and let PyTorch do its magic.
By this I mean that I would like to get a 3-D array representing the RGB channels, which will later be used as input for a pretrained ResNet model (which expects 3-channel input).
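To make the target shape concrete: as far as I understand, a torchvision-style pretrained ResNet expects a float tensor of shape (batch, 3, H, W), so any H x W x 3 RGB array would need a HWC-to-CHW permute roughly like this (a sketch with dummy data):

```python
import numpy as np
import torch

# hypothetical RGB image as an H x W x 3 uint8 array
rgb = np.zeros((128, 256, 3), dtype=np.uint8)

# ResNet-style models expect float input shaped (N, 3, H, W)
x = torch.from_numpy(rgb).float().div(255.0)  # H x W x 3, values in [0, 1]
x = x.permute(2, 0, 1).unsqueeze(0)           # 1 x 3 x H x W
print(x.shape)  # torch.Size([1, 3, 128, 256])
```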
One very tedious way is something like
import numpy
import torch
from PIL import Image
import matplotlib.pyplot as plt

# Generate a sample plot
x = [1, 2, 3, 4]
y = [1, 4, 9, 16]
plt.plot(x, y)

# Save the plot as an image
plt.savefig('plot.png')

# Open the image using PIL
img = Image.open('plot.png')

# Convert the image to a PyTorch tensor
tensor_img = torch.from_numpy(numpy.array(img))
but this requires the intermediate save of a picture I don't actually need.
Thus, my question: is there any way to bypass this save?
Either some RGB extraction from my initial array (instead of the greyscale I got), or at least a way to pass the output of matplotlib.imshow() directly to torch.
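One direction I am considering (a sketch, not verified against the paper): apply a matplotlib colormap to the min-max-scaled 2-D array directly, which yields an RGBA array without creating any figure or file; the colormap name `viridis` here is just an assumption, since I don't know which one the authors used:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # no display needed
import matplotlib.pyplot as plt
import torch

# stand-in for a min-max-scaled spectrogram, values in [0, 1]
X_scaled = np.random.rand(128, 256)

# a colormap maps each scalar to an RGBA tuple; this skips the figure entirely
cmap = plt.get_cmap("viridis")
rgba = cmap(X_scaled)   # H x W x 4 float array in [0, 1]
rgb = rgba[..., :3]     # drop the alpha channel

tensor_img = torch.from_numpy(rgb).permute(2, 0, 1).float()  # 3 x H x W
print(tensor_img.shape)
```

An alternative that keeps the full imshow rendering would be to savefig into an io.BytesIO buffer instead of a file, which avoids the disk round trip but still rasterizes the whole figure (axes, margins and all).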