I’m stumped by the simplest part of the most basic “Titanic Basic” captum tutorial: converting the data into tensors?!
After getting the data and performing the first preprocessing steps,
converting to numpy arrays and separating out train and test sets works fine:
data = titanic_data.to_numpy()
train_indices = np.random.choice(len(labels), int(0.7*len(labels)), replace=False)
test_indices = list(set(range(len(labels))) - set(train_indices))
train_features = data[train_indices]
train_labels = labels[train_indices]
test_features = data[test_indices]
test_labels = labels[test_indices]
but converting to tensors doesn’t work:
File ".../Titanic_Basic_Interpret.py", line 139, in <module>
input_tensor = torch.as_tensor(train_features,dtype=torch.float32)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: can't convert np.ndarray of type numpy.object_. The only supported types are: float64, float32, float16, complex64, complex128, int64, int32, int16, int8, uint8, and bool.
Maybe the problem is “no more magic, convert_objects has been
deprecated in pandas 0.17” ? It seems the tutorial was added back in 2019, but other issues seem to have used it more recently?
I’ve tried some of the suggestions there (building a separate dictionary of data types and then data = data.astype(dtype=dtypeDict)
, converting each column separately:
for c in titanic_data.columns:
titanic_data[c] = pd.to_numeric(titanic_data[c])
but these don’t go thru either. What could the issue be?!