Load text data from file

slavavs · June 17, 2019, 10:55am

I have a file with text data (a dataset for prediction).
For example (many rows, three columns):
34, 56, 76
44, 55, 79
45, 79, 87
…
The file is large, about 700 MB.
What is the best way to load this file into PyTorch?

Pranavan_Theivendira · June 17, 2019, 1:32pm

Hi,

You can do the following. Replace file.csv with your file name.

from numpy import genfromtxt
my_data = genfromtxt('file.csv', delimiter=',')

You will get a NumPy array of shape (N, 3). There may be other ways to do this in PyTorch, but this might help.

Thanks
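
For a file of this size, memory use may matter. The following is a minimal sketch, not a definitive recipe: it assumes the file is purely numeric and comma-separated, and it uses an in-memory `io.StringIO` stand-in (an assumption made for illustration) in place of the real file. Passing `dtype=np.float32` stores each value in 4 bytes instead of the default 8.

```python
import io
import numpy as np

# Stand-in for the real file; with a file on disk, pass its path instead.
sample = io.StringIO("34, 56, 76\n44, 55, 79\n45, 79, 87\n")

# float32 needs half the memory of the default float64.
data = np.genfromtxt(sample, delimiter=",", dtype=np.float32)

print(data.shape)  # (3, 3)
print(data.dtype)  # float32
```

If the values need more than about seven significant digits, keeping the default float64 is the safer choice.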

slavavs · June 17, 2019, 1:58pm

Thanks, I did just that.
But I was confused by the output.

If I do this:

b = "1.0;0.0;512.50;507.00;508.25;122.0;20.0;0.0;20.0"
c = b.split(';')
r = [float(item) for item in c]
print(r)

[1.0, 0.0, 512.5, 507.0, 508.25, 122.0, 20.0, 0.0, 20.0]

But if I do this:

import numpy as np
st = np.genfromtxt('123.txt',delimiter=';')
print(st)

[1.00000e+00 0.00000e+00 5.12500e+02 … 2.00000e+01 0.00000e+00
2.00000e+01]

The commas between the numbers are missing.
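
The difference here is most likely only in how the values are displayed: printing a NumPy array uses NumPy's own format (space-separated, sometimes in scientific notation), while printing a Python list uses commas. A small sketch, with the array built by hand rather than read from `123.txt`:

```python
import numpy as np

# Same values as in the split() example, built by hand for illustration.
st = np.array([1.0, 0.0, 512.50, 507.00, 508.25, 122.0, 20.0, 0.0, 20.0])

# NumPy's display format: no commas, and possibly scientific notation.
print(st)

# Converting to a list shows the same values in ordinary Python formatting.
as_list = st.tolist()
print(as_list)  # [1.0, 0.0, 512.5, 507.0, 508.25, 122.0, 20.0, 0.0, 20.0]

# suppress=True asks NumPy to avoid scientific notation when displaying.
np.set_printoptions(suppress=True)
print(st)
```

The stored numbers are the same in every case; only the printed form changes.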

slavavs · June 17, 2019, 2:28pm

I did this:
t = torch.as_tensor(st, dtype=torch.float32)
Now everything is fine.
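
For reference, a short sketch of that conversion with a hand-made array standing in for the loaded data (the values are only an illustration). `np.genfromtxt` returns float64 by default, so asking for `torch.float32` makes a converted copy rather than sharing memory with the NumPy array.

```python
import numpy as np
import torch

# Hand-made stand-in for the array returned by np.genfromtxt (float64 by default).
st = np.array([[34.0, 56.0, 76.0], [44.0, 55.0, 79.0]])

# Convert to a float32 tensor, the dtype most PyTorch layers expect.
t = torch.as_tensor(st, dtype=torch.float32)

print(t.dtype)   # torch.float32
print(t.shape)   # torch.Size([2, 3])
```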

slavavs · June 18, 2019, 3:48pm

If I rescale a single number like this:
st = np.interp(3.4, [0,10], [-1,1])
I get -0.31999999999999995

And if I do it on the output of np.genfromtxt:
st = np.interp(np.genfromtxt('123.txt',delimiter=';',dtype='d'), [0,10], [-1,1])
I get
[-0.6 -1. -0.2 -0.6 0.4 0. 0.4 -1. 0.4 0. 0.2 0.9
0.4 0. 0.2 0.95 -1. -1. 0.2 -1. -1. -1. -1. -1.
-1. -1. -0.6 0.6 -1. -1. -0.6 -1. ]
The function seems to round the numbers. But for normalization I need the full values. How can I get the unrounded values back?
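
A likely explanation is that nothing is being rounded: when an array is printed, NumPy shows at most 8 digits after the decimal point by default, while the stored values keep full float64 precision. A minimal sketch with made-up input values (the contents of `123.txt` are not known here):

```python
import numpy as np

# Made-up inputs for illustration; the real ones would come from the file.
values = np.array([3.4, 2.0, 9.75])
scaled = np.interp(values, [0, 10], [-1, 1])

# Printing the whole array uses the shortened display format.
print(scaled)

# A single element prints with its full precision.
print(scaled[0])

# Raising the display precision shows all digits for the whole array.
np.set_printoptions(precision=17)
print(scaled)
```

So the values can be used for normalization as they are; only the display needs adjusting.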