## Summary
We are the XuanTie team from Alibaba DAMO Academy, and we would like to propose a roadmap for enabling native RISC-V architecture support in PyTorch. This RFC aims to:
- Express our commitment to contributing RISC-V support to PyTorch
- Share our proposed roadmap for implementation
- Gather community feedback and identify potential blockers
- Collaborate with the PyTorch community to make RISC-V a first-class citizen
## Motivation

### Why Now?
RISC-V is an open, royalty-free instruction set architecture (ISA) that has been gaining significant momentum across the industry. Historically, RISC-V's AI efforts have focused primarily on edge and embedded scenarios, where lightweight inference runtimes were sufficient and native PyTorch support was not a priority.
However, the landscape is rapidly changing:
- High-Performance RISC-V SoCs Are Emerging: The RISC-V ecosystem has made remarkable progress in high-performance computing. Multi-core, high-frequency RISC-V processors with advanced vector extensions (RVV 1.0) and matrix extensions (RVM) are now becoming available.
- Server-Class RISC-V CPUs Are on the Horizon: Several companies are developing server-grade RISC-V processors targeting data center workloads. This opens up new possibilities for running full AI/ML software stacks directly on RISC-V hardware.
- The Timing Is Right: With these hardware advancements, the conditions for running PyTorch natively on RISC-V are now being met. Establishing PyTorch support now will ensure the ecosystem is ready when server-class RISC-V hardware becomes widely available.
### The Need for Native Support
Currently, PyTorch has limited support for RISC-V, which creates barriers for:
- Researchers and developers working on RISC-V-based AI hardware
- Companies developing RISC-V AI accelerators
- The broader RISC-V ecosystem that wants to leverage PyTorch’s capabilities
By enabling comprehensive RISC-V support, we can:
- Expand PyTorch’s reach to the growing RISC-V ecosystem
- Enable AI workloads on RISC-V-based edge devices and servers
- Foster innovation in open-source hardware and software for AI
## Proposed Roadmap

### Phase 1: CI Infrastructure and Validation

**Goal:** Establish a robust CI pipeline for RISC-V builds and testing
- Complete CI Execution Flow Validation
  - Ensure PyTorch can be built and tested on the RISC-V architecture
  - Validate that all existing test suites pass on RISC-V targets
  - Identify and fix architecture-specific issues
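As one illustration of how architecture-specific validation might be wired into the test suites, a detection helper could gate RISC-V-only tests so the same suite runs everywhere. This is a minimal sketch; `IS_RISCV` and `skip_unless_riscv` are hypothetical names, not existing PyTorch APIs.

```python
import platform
import unittest

# Hypothetical helper: detect a RISC-V host so architecture-specific
# tests (e.g. for RVV code paths) can be gated in a shared CI suite.
IS_RISCV = platform.machine().lower().startswith("riscv")

def skip_unless_riscv(test_fn):
    """Skip a test unless running on a riscv32/riscv64 host."""
    return unittest.skipUnless(IS_RISCV, "requires a RISC-V host")(test_fn)

class TestVectorKernels(unittest.TestCase):
    @skip_unless_riscv
    def test_rvv_path(self):
        # A real test would exercise an RVV-accelerated code path here.
        self.assertTrue(IS_RISCV)
```

On non-RISC-V CI runners the gated tests report as skipped rather than failing, which keeps a single test suite valid across architectures.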
- RISC-V Development Board Contribution
  - We are prepared to provide high-performance RISC-V development boards for PyTorch’s CI infrastructure
  - This will enable continuous testing and validation of RISC-V support
  - We will work with the PyTorch infrastructure team to integrate these boards
### Phase 2: High-Performance Micro-Kernel Library

**Goal:** Deliver optimized implementations for core ATen operators
- Develop a RISC-V Optimized uKernel Library
  - Similar to KleidiAI for the Arm architecture
  - Leverage the RISC-V Vector Extension (RVV) and Matrix Extension (RVM) for SIMD operations
  - Target core ATen operators first (matmul, convolution, etc.)
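To make the shape of such a uKernel concrete, here is a pure-Python reference for a blocked matmul. The tiling over output blocks mirrors how an RVV micro-kernel would keep a C tile in vector registers while streaming A and B through it; the scalar inner loop is exactly what RVV intrinsics (or RVM for whole tiles) would replace. All names here are illustrative, not part of any existing library.

```python
def matmul_blocked(A, B, M, K, N, tile=4):
    """Reference blocked matmul: C[M, N] = A[M, K] @ B[K, N].

    A and B are flat row-major lists of floats. The i0/j0 loops walk
    output tiles, the structure an optimized micro-kernel exploits to
    reuse data held in registers; the innermost k loop is the part a
    vectorized implementation would strip-mine over the vector length.
    """
    C = [0.0] * (M * N)
    for i0 in range(0, M, tile):
        for j0 in range(0, N, tile):
            for i in range(i0, min(i0 + tile, M)):
                for j in range(j0, min(j0 + tile, N)):
                    acc = 0.0
                    for k in range(K):
                        acc += A[i * K + k] * B[k * N + j]
                    C[i * N + j] = acc
    return C
```

A reference like this is also useful for Phase 1: architecture-specific kernels can be validated bit-for-bit (or within tolerance) against a simple scalar implementation.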
- Incremental Operator Coverage
  - Start with the most commonly used operators in popular models
  - Gradually expand coverage to the remaining operators
- Performance Benchmarking
  - Establish baseline performance metrics
  - Compare with reference implementations
  - Continuous performance regression testing
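A minimal sketch of the benchmarking harness this implies, using only the standard library; the function names and the regression tolerance are illustrative assumptions, not an existing PyTorch benchmarking API.

```python
import timeit

def benchmark(fn, *args, repeat=5, number=100):
    """Return the best per-call time in seconds over `repeat` runs.

    Taking the minimum over repeats is a common way to reduce noise
    from scheduling and frequency scaling when establishing baselines.
    """
    times = timeit.repeat(lambda: fn(*args), repeat=repeat, number=number)
    return min(times) / number

def dot(xs, ys):
    """Toy stand-in for an operator under benchmark."""
    return sum(x * y for x, y in zip(xs, ys))

if __name__ == "__main__":
    xs = [1.0] * 1024
    baseline = benchmark(dot, xs, xs)
    # In regression CI, `baseline` would be compared against a stored
    # reference and the job failed if it slows beyond some tolerance
    # (e.g. 1.2x), which is the "continuous regression testing" step.
    print(f"dot(1024): {baseline * 1e6:.2f} us/call")
```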
### Phase 3: torch.compile Backend Extension

**Goal:** Enable high-performance graph compilation for RISC-V
- Develop a RISC-V Backend for torch.compile
  - Implement code generation targeting the RISC-V architecture
  - Support RVV/RVM vectorization in generated code
  - Enable operator fusion and other graph-level optimizations
- Integration with Inductor
  - Work with the PyTorch compiler team to integrate RISC-V support
  - Ensure compatibility with existing compilation pipelines
  - Support both eager and compiled execution modes
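As a toy illustration of the graph-level optimization mentioned above (not Inductor's actual IR or passes), a fusion pass can merge adjacent elementwise ops so the generated loop reads and writes memory once instead of once per op; this is the kind of transformation that makes RVV-vectorized generated code memory-efficient. All names here are invented for the sketch.

```python
# Toy op graph: a list of ("op", constant) stages applied elementwise.
ELEMENTWISE = {"add", "mul"}

def fuse(stages):
    """Group consecutive elementwise stages into single fused stages."""
    fused, group = [], []
    for op, arg in stages:
        if op in ELEMENTWISE:
            group.append((op, arg))
        else:
            if group:
                fused.append(("fused", group))
                group = []
            fused.append((op, arg))
    if group:
        fused.append(("fused", group))
    return fused

def _apply_group(group, v, ops):
    for op, arg in group:
        v = ops[op](v, arg)
    return v

def run(stages, xs):
    """Interpret a (possibly fused) graph over a list of floats."""
    ops = {"add": lambda v, a: v + a, "mul": lambda v, a: v * a}
    for op, arg in stages:
        if op == "fused":
            # One pass over the data for the whole fused group -- the
            # memory-traffic win that real codegen-level fusion targets.
            xs = [_apply_group(arg, v, ops) for v in xs]
        else:
            xs = [ops[op](v, arg) for v in xs]
    return xs
```

An unfused three-op graph makes three passes over the data; after `fuse` it makes one, while producing identical results.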
### Phase 4: Triton/TileLang Native Support

**Goal:** Enable native Triton kernel development for RISC-V
- Triton Backend for RISC-V
  - Develop a RISC-V code generation backend for Triton
  - Support RVV/RVM in Triton kernels
  - Enable custom kernel development for RISC-V targets
- TileLang Integration
  - Explore TileLang support for the RISC-V architecture
  - Enable a tile-based programming model for RISC-V AI accelerators
### Phase 5: LLM Inference Framework Support

**Goal:** Enable large language model deployment on RISC-V
- vLLM Support
  - Port vLLM to the RISC-V architecture
- SGLang Support
  - Enable the SGLang runtime on RISC-V
## Questions for the Community
We would like to gather feedback on the following:
- Technical Blockers
  - What known technical challenges exist for RISC-V support in PyTorch?
- CI Infrastructure
  - We are able to provide RISC-V development boards for community CI testing. What is the recommended approach to integrating these boards into PyTorch’s existing GitHub Actions CI pipeline?
  - What are the requirements for contributing RISC-V hardware to PyTorch CI?
  - What is the preferred approach for integrating new architecture support?
- Prioritization
  - Which operators/features should be prioritized for RISC-V optimization?
  - Are there specific use cases the community would like to see supported first?
- Collaboration
  - Are there other teams or companies working on RISC-V support for PyTorch?
  - How can we best coordinate efforts to avoid duplication?
- Testing and Validation
  - What test coverage is expected for new architecture support?
  - Are there specific benchmarks we should target?
## Our Commitment
We are committed to:
- Contributing high-quality, well-tested code
- Maintaining RISC-V support long-term
- Providing hardware resources for CI infrastructure
- Collaborating openly with the PyTorch community
- Following PyTorch’s contribution guidelines and code standards
## Call for Feedback
We welcome any feedback, suggestions, or concerns from the PyTorch community. Our goal is to work together to make RISC-V a well-supported architecture in PyTorch, benefiting both the PyTorch and RISC-V ecosystems.
Please share your thoughts on:
- The proposed roadmap
- Technical considerations we may have missed
- Potential collaboration opportunities
- Any blockers or concerns you foresee
We look forward to working with the community to bring native RISC-V support to PyTorch!