MSCCL++: A GPU-driven communication stack for scalable AI applications
Updated 2026-04-24 06:48:25 +00:00
CUDA Templates and Python DSLs for High-Performance Linear Algebra
Updated 2026-04-21 16:32:40 +00:00
General-purpose GPU compute framework built on Vulkan to support thousands of cross-vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). Blazing fast, mobile-enabled, asynchronous, and optimized for advanced GPU data processing use cases. Backed by the Linux Foundation.
Updated 2025-02-20 20:54:13 +00:00