Apache TVM
Open-source ML compiler framework for diverse hardware.
Cross-platform compilation
Compile ML models from a single toolchain and deploy them to CPUs, GPUs, microcontrollers, FPGAs, specialised accelerators, and web browsers
Multi-framework support
Import models from PyTorch, TensorFlow, Keras, MXNet, ONNX, and other formats
Automatic optimisation
Generates and tunes tensor operators for target hardware without manual kernel writing
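To make the idea of tuning concrete, here is a minimal pure-Python sketch. It does not use TVM's API: it times one tiled matrix-multiply kernel under several candidate tile sizes and keeps the fastest. The names `tiled_matmul` and `tune`, and the candidate list, are hypothetical and chosen only for illustration; a real tuner searches a much larger space of operator schedules.

```python
import random
import time


def tiled_matmul(a, b, n, tile):
    """Multiply two n x n matrices (lists of rows) using square tiles of size `tile`."""
    c = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, tile):
        for k0 in range(0, n, tile):
            for j0 in range(0, n, tile):
                # Work on one tile at a time so the touched data stays small.
                for i in range(i0, min(i0 + tile, n)):
                    row_a, row_c = a[i], c[i]
                    for k in range(k0, min(k0 + tile, n)):
                        a_ik, row_b = row_a[k], b[k]
                        for j in range(j0, min(j0 + tile, n)):
                            row_c[j] += a_ik * row_b[j]
    return c


def tune(n, candidates, repeats=3):
    """Time each candidate tile size and return (best_tile, timings)."""
    rng = random.Random(0)  # fixed seed so every candidate sees the same input
    a = [[rng.random() for _ in range(n)] for _ in range(n)]
    b = [[rng.random() for _ in range(n)] for _ in range(n)]
    timings = {}
    for tile in candidates:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            tiled_matmul(a, b, n, tile)
            best = min(best, time.perf_counter() - start)
        timings[tile] = best  # keep the best of several runs to reduce noise
    return min(timings, key=timings.get), timings


best_tile, timings = tune(n=32, candidates=[4, 8, 16, 32])
print("fastest tile size on this machine:", best_tile)
```

The winning tile size depends on the machine that runs the search, which is why tuning is done per hardware target rather than once for all targets.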
Quantisation and sparsity
Built-in support for model compression techniques including block sparsity and quantisation
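As a rough illustration of what quantisation does, the sketch below maps floating-point weights to 8-bit integers using a scale and a zero point, then maps them back. It is plain Python following the standard affine scheme, not TVM's quantisation pass; `quantize` and `dequantize` are hypothetical helper names.

```python
def quantize(values, num_bits=8):
    """Affine-quantise floats to unsigned integers.

    Returns (q_values, scale, zero_point).
    """
    qmin, qmax = 0, (1 << num_bits) - 1
    # The representable range must include 0 so that zero maps exactly.
    lo, hi = min(min(values), 0.0), max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 for all-zero input
    zero_point = round(qmin - lo / scale)
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q_values, scale, zero_point):
    """Map quantised integers back to approximate floats."""
    return [(q - zero_point) * scale for q in q_values]


weights = [-1.5, -0.2, 0.0, 0.7, 2.25]
q, scale, zp = quantize(weights)
restored = dequantize(q, scale, zp)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, scale, zp, max_error)
```

Storing 8-bit integers instead of 32-bit floats cuts weight storage to about a quarter, at the cost of a rounding error of at most half the scale per value.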
Multiple language bindings
Use Python for research and prototyping, and C++, Rust, or Java for production deployment
Memory optimisation
Includes memory planning and allocation strategies for constrained devices
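One common planning strategy is to reuse a buffer once the tensor occupying it is no longer needed. The sketch below is a simplified greedy planner in plain Python, offered as an illustration of the idea rather than a description of TVM's own planner; `plan_buffers` and the example tensors are hypothetical.

```python
def plan_buffers(tensors):
    """Assign each tensor to a shared buffer.

    `tensors` is a list of (name, size_bytes, first_use, last_use) tuples,
    where first_use and last_use are step indices in execution order.
    Returns (assignment, buffer_sizes).
    """
    buffer_sizes = []  # size in bytes of each allocated buffer
    free_at = []       # last step at which each buffer is still occupied
    assignment = {}
    for name, size, first, last in sorted(tensors, key=lambda t: t[2]):
        # Buffers whose previous tensor is dead and which are large enough.
        candidates = [i for i, end in enumerate(free_at)
                      if end < first and buffer_sizes[i] >= size]
        if candidates:
            # Prefer the smallest buffer that fits, to keep big ones available.
            idx = min(candidates, key=lambda i: buffer_sizes[i])
        else:
            idx = len(buffer_sizes)
            buffer_sizes.append(size)
            free_at.append(-1)
        free_at[idx] = last
        assignment[name] = idx
    return assignment, buffer_sizes


tensors = [
    ("conv1_out", 4096, 0, 1),
    ("relu1_out", 4096, 1, 2),
    ("conv2_out", 2048, 2, 3),
    ("logits", 512, 3, 4),
]
assignment, buffer_sizes = plan_buffers(tensors)
naive_total = sum(size for _, size, _, _ in tensors)
print(assignment, sum(buffer_sizes), "bytes planned vs", naive_total, "naive")
```

On a constrained device, the difference between the planned total and the naive one-buffer-per-tensor total can decide whether a model fits in memory at all.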
Use cases
Deploying ML models to mobile phones and tablets where computational resources are limited
Running inference on edge hardware such as smart home devices, IoT sensors, or industrial equipment
Optimising models for specific hardware accelerators to achieve maximum performance
Creating cross-platform ML services that work consistently across different server architectures
Reducing model size and latency for real-time inference applications
Deploying ML models in web browsers for client-side inference without server calls