
Introducing Transformer Lab GPU Orchestration, a modern SLURM replacement for running workloads across GPU clusters.

The essential open source workspace for AI/ML teams.

From GPU orchestration to training, fine-tuning, and evaluating models across any infrastructure, Transformer Lab is the next-generation platform for AI/ML research.

Download Transformer Lab Local

A smarter way to train models

Whether you’re a research lab with racks of GPUs, a startup shipping AI products, or a solo AI/ML engineer, you face the same friction: outdated SLURM capabilities, compute bottlenecks, and fragmented, non-standardized tooling. Transformer Lab automates away the infrastructure headaches so you can iterate faster and focus on your research.

For Individuals

Transformer Lab Local for Single Node Workloads

Train, test, and evaluate models (LLMs, Diffusion, Audio) on a single node.

For Teams

Transformer Lab GPU Orchestration

Run workloads on multiple nodes with multi-cloud job scheduling, experiment management, and monitoring. A modern SLURM replacement built on SkyPilot.

Join the Beta
Cloud Architecture

NEW Deploy across Multiple Clouds

Connect to AWS, Azure, GCP, Runpod, or any of 20+ cloud providers alongside your on-premise clusters. Your teams see one unified compute pool. Behind the scenes, requests are automatically routed to the lowest-cost nodes that meet your requirements, using battle-tested SkyPilot technology.
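To illustrate the underlying mechanism (a generic SkyPilot task, not Transformer Lab's own configuration), a task file only declares the resources it needs; SkyPilot then picks the cheapest provider that can satisfy them:

```yaml
# Hypothetical SkyPilot task file (train.yaml). The accelerator
# requirement is declarative; SkyPilot compares prices across the
# clouds you have enabled and provisions the cheapest matching node.
resources:
  accelerators: A100:1   # any cloud with one A100 qualifies

setup: |
  pip install -r requirements.txt

run: |
  python train.py
```

Launching with `sky launch train.yaml` is then the same command whether the node ends up on AWS, GCP, Azure, or an on-premise cluster.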

Quota Management

NEW Role-based access & quota enforcement

Admins set GPU quotas, group priorities, and hardware access controls; see what's actually running; and track usage, idle nodes, and costs across the org.

Job Management

NEW Easily run a single training job across n nodes

Getting multiple GPUs to coordinate gradient sharing and parameter synchronization across nodes is a hassle. Transformer Lab automatically handles the Kubernetes networking, container orchestration, inter-node communication, checkpointing, and failover.
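As a sketch of what this coordination involves under the hood (again a plain SkyPilot task, not Transformer Lab's internal config), a multi-node job sets `num_nodes` and wires each node's rank and the head node's address into the launcher using environment variables SkyPilot provides:

```yaml
# Hypothetical 2-node SkyPilot task. SKYPILOT_NODE_RANK,
# SKYPILOT_NODE_IPS, SKYPILOT_NUM_NODES, and
# SKYPILOT_NUM_GPUS_PER_NODE are set by SkyPilot on every node.
num_nodes: 2

resources:
  accelerators: A100:8

run: |
  torchrun \
    --nnodes=$SKYPILOT_NUM_NODES \
    --node_rank=$SKYPILOT_NODE_RANK \
    --master_addr=$(echo "$SKYPILOT_NODE_IPS" | head -n1) \
    --nproc_per_node=$SKYPILOT_NUM_GPUS_PER_NODE \
    train.py
```

The same `run` script executes on every node; only the rank and master address differ, which is exactly the bookkeeping that is tedious to manage by hand.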

Generate and Train LLMs, Diffusion and Audio Models

Download and run the latest open-source models with one click. Diffusion model support covers inpainting, img2img, ControlNets, LoRAs, auto-captioning of images, batch image generation, and more. Audio model support includes text-to-speech, fine-tuning TTS models on your own dataset, and one-shot voice cloning from a single reference sample.

Pre-training, Finetuning, RLHF and Preference Optimization

Build advanced models from scratch or fine-tune existing ones with production-ready implementations of DPO, ORPO, SIMPO, and GRPO that work out of the box. No more wrestling with framework incompatibilities or debugging distributed training setups. Complete RLHF pipeline with reward modeling handles the complex orchestration automatically, from data processing to final model outputs.

Comprehensive Evals

Measure what matters with built-in evaluation tools. Run Eleuther Harness benchmarks, LLM-as-a-Judge comparisons and objective metrics in one place. Red-team your models, visualize results over time, and compare runs across experiments with clean, exportable dashboards.

Trusted by Innovative Teams

"Transformer Lab has made it easy for me to experiment and use LLMs in a completely private fashion."
— Ramanan Sivaranjan, Head of Engineering at Quantum Bridge

"The essential open-source stack for serious ML teams."
— Elena Yunusov, Executive Director at Human Feedback Foundation

"SLURM is outdated, and Transformer Lab is the future."
— Jash Mehta, Applied Research Scientist at ServiceNow Research