AI Compilers and Kernels
Leverage our full-stack AI expertise, spanning compilers to applications. We offer MLIR-based compilation to accelerate generative-AI and vision-model inference on custom hardware. Our team specializes in hardware-aware model compression and in developing custom kernels for low-latency execution. We also have deep experience extending LLVM to support custom RISC-V instructions, including implementing intrinsics that target bespoke SIMD and vector architectures.
Our Expertise
The Compiler Team at 10xE excels in enhancing and customizing compiler infrastructures to meet the unique demands of modern hardware. Our capabilities include:

ML Model Optimization
- Expertise in vision and LLM/VLM model compression techniques for efficient deployment, including hardware-aware and hardware-agnostic quantization, pruning, and knowledge distillation.
- Skilled in porting ML models, along with their application pipelines, to RISC-V Vector (RVV), custom accelerators, and other specialized hardware backends, and in maintaining model zoos for these targets.
- Development of optimized custom ML kernels in C, C++, CUDA, and Triton, tailored for non-standard or proprietary hardware.
- Development experience across ML frameworks (PyTorch, ONNX, TensorFlow), compilers (IREE), custom libraries (llama.cpp, ONNX Runtime execution providers), and bare-metal environments for inference optimization.

MLIR Progressive Lowering
- Development of custom MLIR passes and dialects, including integrating new operations tailored for unique hardware targets.
- In-depth knowledge of lowering ML workloads through progressive stages for custom hardware backends.
- Experience in the development of IREE-based compiler flows targeting custom hardware.
- Bare-metal development for custom hardware.

LLVM
- Implementation and integration of RISC-V custom instructions into the LLVM backend.
- Development of compiler intrinsics that expose custom instructions to application code.
- Design of custom analysis and optimization passes to tune performance for target architectures.
- Toolchain validation and performance profiling using industry-standard benchmarks, including SPEC CPU 2017.
Key Services
Custom Compiler Development for Specialized Hardware
ML compiler development using MLIR and LLVM for custom hardware targets
Bare-Metal & Kernel Development for ML Workloads
Hardware-specific bare-metal and ML kernel development
Efficient Deployment of Compressed Vision & GenAI Models
Custom vision and GenAI (LLM/VLM) model compression development and deployment on standard and custom hardware
End-to-End Compiler Optimization & Tuning
End-to-end compiler optimization including custom ops, builtins, and size/performance tuning passes
Client Testimonials
