We are excited to introduce a Beta version of the Universal Sparse Tensor (UST) in NVIDIA nvmath-python.
Besides the online documentation, we also published two blog posts on the UST:
- Establishing a Scalable Sparse Ecosystem with the Universal Sparse Tensor
- Simplify Sparse Deep Learning with Universal Sparse Tensor in nvmath-python
The UST uses a tensor format DSL (domain-specific language) to describe how a sparse tensor should be represented, and it uses type polymorphism on a small set of base operations to define the vast space of instances for these operations. Developers only need to focus on the sparsity of a tensor. The implementation in nvmath-python provides (1) zero-cost interoperability with PyTorch and (2) transparent injection into PyTorch models.
Regarding (1), a PyTorch tensor can be converted to a UST tensor and back without data movement or copying. Once converted, all the UST features become available to the original PyTorch tensor, as illustrated below.
import torch
from nvmath.sparse.ust import Tensor
a = torch.eye(10, 10, dtype=torch.float64).to_sparse_csr()
u = Tensor.from_package(a) # u is now a UST representation of a
print(u)
# ...
# format : [i, j] -> (i: <LevelFormat.DENSE>, j: <LevelFormat.COMPRESSED>)
# ...
# ... you can now use all the UST features, like printing, drawing, and
# polymorphic matmul with planning/execution phase...
a = u.to_package() # back in PyTorch land
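To make the printed format `[i, j] -> (i: DENSE, j: COMPRESSED)` more concrete, here is a minimal plain-Python sketch of the storage it describes, which corresponds to the familiar CSR layout. The function name `to_dense_compressed` and the variable names are made up for this illustration; this is not how nvmath-python implements the UST.

```python
# Illustrative only: plain-Python CSR construction, not the nvmath-python code.
def to_dense_compressed(rows):
    """Store a 2-D list with a dense i level and a compressed j level (CSR)."""
    positions = [0]    # positions[i]..positions[i + 1] delimit the entries of row i
    coordinates = []   # column index j of each stored entry
    values = []        # the stored nonzero values
    for row in rows:   # level i is dense: every row is visited
        for j, v in enumerate(row):
            if v != 0:  # level j is compressed: only nonzeros are kept
                coordinates.append(j)
                values.append(v)
        positions.append(len(coordinates))
    return positions, coordinates, values


n = 4
eye = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
positions, coordinates, values = to_dense_compressed(eye)
print(positions)    # [0, 1, 2, 3, 4]
print(coordinates)  # [0, 1, 2, 3]
print(values)       # [1.0, 1.0, 1.0, 1.0]
```

Changing which levels are dense or compressed changes the storage scheme, which is the kind of choice the format DSL is meant to express.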
However, to avoid having to deal with the UST explicitly, the implementation also provides (2) transparent injection into PyTorch through the TorchUST class, which behaves like a PyTorch tensor but transparently uses a UST implementation for the actual operations (where one is implemented). This is illustrated below:
import torch
from nvmath.sparse.ust.interfaces.torch_interface import TorchUST
# Construct two torch vectors.
x = (1.0 + torch.arange(32)).cuda()
y = (2.0 + torch.arange(32)).cuda()
# Perform a dot product.
z = torch.dot(x, y)
# Convert x to a UST wrapped as a TorchUST object
# (yes, the UST also includes dense vectors or tensors!)
x = TorchUST.from_torch(x)
# Now perform the same dot product operation.
# It transparently uses the UST implementation!
z = torch.dot(x, y)
A convenience method is provided that allows injecting the UST into all weight matrices of a model as follows.
weights = torchvision.models.get_model_weights(model_name).DEFAULT
model = torchvision.models.get_model(model_name, weights=weights)
model.to(device)
model.eval()
…
reformat_model(model, func=reformat)
…
with torch.inference_mode():
    prediction = model(batch)
Here the reformatting function is written by the user and has the form shown below. Pruning could even be done inside this function, but it is more common to use a pruning framework such as torch.nn.utils.prune, combined with fine-tuning for accuracy, before reformatting the model for inference. The UST also allows conversions into formats that sparse PyTorch does not support yet, such as diagonal formats or other novel formats. Hopefully this enables research into exploiting structure in weight matrices that was previously left unexplored.
def reformat(weight):
    # Inspect sparsity (as a percentage of zero entries).
    nel = weight.numel()
    nnz = torch.count_nonzero(weight)
    sparsity = (1.0 - float(nnz) / float(nel)) * 100.0
    # sparse_threshold: user-chosen cutoff in percent (defined elsewhere).
    if sparsity >= sparse_threshold:
        # … pick suitable format for weight …
        return TorchUST.from_torch(weight)
    return None
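As a worked example of the sparsity arithmetic above, and of the kind of structure inspection that could guide a format choice, here is a small plain-Python sketch. Nested lists stand in for a weight tensor, and the helpers `sparsity_percent` and `occupied_diagonals` are hypothetical names for this illustration only; they are not part of nvmath-python or PyTorch.

```python
# Illustrative only: nested lists stand in for a weight tensor.
def sparsity_percent(matrix):
    """Percentage of zero entries, using the same formula as reformat()."""
    nel = sum(len(row) for row in matrix)
    nnz = sum(1 for row in matrix for v in row if v != 0)
    return (1.0 - nnz / nel) * 100.0


def occupied_diagonals(matrix):
    """Sorted offsets (j - i) of the diagonals holding at least one nonzero."""
    return sorted({j - i
                   for i, row in enumerate(matrix)
                   for j, v in enumerate(row) if v != 0})


# A 4x4 tridiagonal matrix: 10 nonzeros out of 16 entries.
tridiagonal = [
    [2, 1, 0, 0],
    [1, 2, 1, 0],
    [0, 1, 2, 1],
    [0, 0, 1, 2],
]
print(sparsity_percent(tridiagonal))    # 37.5
print(occupied_diagonals(tridiagonal))  # [-1, 0, 1]
```

A matrix whose nonzeros fall on only a few diagonals, like this one, is the sort of case where a diagonal format might be worth considering, even when the overall sparsity percentage is modest.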
Note that the UST is still in Beta, so a lot of functionality is still missing. But hopefully this is a first step towards a long-standing objective of mine: to treat sparsity as a property of tensors, and not as a tedious implementation detail. Let us know your thoughts and please send us ideas for improvements!



