I am looking for a way to handle high-dimensional sparse tensors.

I downloaded this data from the UCI Machine Learning Repository. It is a tabular dataset of size 67557 × 42. Each cell takes one of three categorical values (x, o, b).

```
import numpy as np
import torch
import pandas as pd
from ucimlrepo import fetch_ucirepo
dataset = fetch_ucirepo(id=26)
X = dataset.data.features
X
```

I want to convert this data into a sparse tensor as follows. First, I replace the three categorical values with integer codes (0, 1, 2).

```
X = X.astype('category')
cat_columns = X.select_dtypes(['category']).columns
X[cat_columns] = X[cat_columns].apply(lambda x: x.cat.codes)
X
```
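Note that `cat.codes` assigns integers by the sorted order of the categories, not by any order in which they appear; a minimal sketch with a hypothetical toy column of the same three symbols:

```
import pandas as pd

# Toy column with the same three symbols as the cells of the table.
s = pd.Series(['o', 'x', 'b', 'b', 'o']).astype('category')

# Categories are sorted alphabetically, so the codes are b -> 0, o -> 1, x -> 2.
print({k: int(v) for k, v in zip(s, s.cat.codes)})  # {'o': 1, 'x': 2, 'b': 0}
```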

Now `X` is an integer matrix. Next, I try to convert it into a 42-dimensional sparse tensor.

Each row of the table can be regarded as a 42-dimensional coordinate at which the tensor holds the value 1; the tensor is 0 at every other index.
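At a small scale this construction works as expected; a minimal sketch using a hypothetical 4 × 3 table of codes instead of the real 67557 × 42 one:

```
import torch

# Hypothetical 4 x 3 table of integer codes in {0, 1, 2}.
rows = torch.tensor([[0, 1, 2],
                     [2, 0, 1],
                     [1, 1, 0],
                     [0, 2, 2]])

# Each row is a coordinate in a 3-dimensional tensor of shape (3, 3, 3).
indices = rows.t()                      # sparse_coo_tensor wants shape (ndim, nnz)
values = torch.ones(indices.shape[1])   # one 1.0 per coordinate
t = torch.sparse_coo_tensor(indices, values, size=(3, 3, 3))

print(t.to_dense()[0, 1, 2])  # tensor(1.)
```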

```
# Transpose so that each column is one 42-dim coordinate: shape (42, N)
X_np = np.transpose(X.to_numpy())
(D, N) = np.shape(X_np)
X_np = X_np.astype(np.int32)
X_th = torch.from_numpy(X_np)
# One nonzero value (1.0) per row of the original table
X_torch = torch.sparse_coo_tensor(X_th, np.ones(N))
```

However, I got the following error:

```
RuntimeError                              Traceback (most recent call last)
----> 1 X_torch = torch.sparse_coo_tensor(X_th, np.ones(N) )

RuntimeError: numel: integer multiplication overflow
```

When the table has fewer features, I can use `torch.sparse_coo_tensor` in this way. For example, just changing `fetch_ucirepo(id=26)` to `fetch_ucirepo(id=101)` (the Tic-Tac-Toe dataset) makes it work.

I am confused because the actual amount of data stored in a sparse tensor is small, even when the tensor is high-dimensional. So why does this method fail, and what is the alternative for handling high-dimensional sparse tensors?

**Note**: The UCI dataset above is used for illustration only; my goal is not to turn tabular data into a tensor, but to find a way to handle high-dimensional sparse tensors in PyTorch.