
Introducing Transformer Lab GPU Orchestration, a modern SLURM replacement for running workloads across GPU clusters.

The essential open source workspace for AI/ML teams.

From GPU orchestration to training, fine-tuning, and evaluating models across any infrastructure, Transformer Lab is the next-generation platform for AI/ML research.

Download Transformer Lab Local

A smarter way to train models

Whether you’re a research lab with racks of GPUs, a startup shipping AI products, or a solo AI/ML engineer, you face the same friction: outdated SLURM capabilities, compute bottlenecks, and fragmented, non-standardized tooling. Transformer Lab automates away the infrastructure headaches so you can iterate faster and focus on your research.

For Individuals

Transformer Lab Local for Single Node Workloads

Train, test, and evaluate models (LLMs, Diffusion, Audio) on a single node.

For Teams

Transformer Lab GPU Orchestration

Run workloads on multiple nodes with multi-cloud job scheduling, experiment management, and monitoring. A modern SLURM replacement built on SkyPilot.

Join the Beta
Cloud Architecture

NEW Deploy across Multiple Clouds

Connect to AWS, Azure, GCP, Runpod, or any of 20+ cloud providers alongside your on-premise clusters. Your teams see one unified compute pool. Behind the scenes, requests are automatically routed to the lowest-cost nodes that meet your requirements, using battle-tested SkyPilot technology.
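To illustrate the underlying mechanism (a generic SkyPilot task, not Transformer Lab's own configuration), a task file only declares the resources it needs; SkyPilot then picks the cheapest provider that can satisfy them:

```yaml
# Hypothetical SkyPilot task file (train.yaml). The accelerator
# requirement is declarative; SkyPilot compares prices across the
# clouds you have enabled and provisions the cheapest matching node.
resources:
  accelerators: A100:1   # any cloud with one A100 qualifies

setup: |
  pip install -r requirements.txt

run: |
  python train.py
```

Launching with `sky launch train.yaml` is then the same command whether the node ends up on AWS, GCP, Azure, or an on-premise cluster.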

Quota Management

NEW Role-based access & quota enforcement

Admins set GPU quotas, group priorities, and hardware access controls; see what's actually running; and track usage, idle nodes, and costs across the org.

Job Management

NEW Easily run a single training job across n nodes

Getting multiple GPUs to coordinate gradient sharing and parameter synchronization across nodes is a hassle. Transformer Lab automatically handles the Kubernetes networking, container orchestration, inter-node communication, checkpointing, and failover.
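As a sketch of what this coordination involves under the hood (again a plain SkyPilot task, not Transformer Lab's internal config), a multi-node job sets `num_nodes` and wires each node's rank and the head node's address into the launcher using environment variables SkyPilot provides:

```yaml
# Hypothetical 2-node SkyPilot task. SKYPILOT_NODE_RANK,
# SKYPILOT_NODE_IPS, SKYPILOT_NUM_NODES, and
# SKYPILOT_NUM_GPUS_PER_NODE are set by SkyPilot on every node.
num_nodes: 2

resources:
  accelerators: A100:8

run: |
  torchrun \
    --nnodes=$SKYPILOT_NUM_NODES \
    --node_rank=$SKYPILOT_NODE_RANK \
    --master_addr=$(echo "$SKYPILOT_NODE_IPS" | head -n1) \
    --nproc_per_node=$SKYPILOT_NUM_GPUS_PER_NODE \
    train.py
```

The same `run` script executes on every node; only the rank and master address differ, which is exactly the bookkeeping that is tedious to manage by hand.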

Generate and Train LLMs, Diffusion and Audio Models

Download and run the latest open-source models with one click. Diffusion model support covers inpainting, img2img, ControlNets, LoRAs, auto-captioning of images, batch image generation, and more. Audio model support includes text-to-speech, fine-tuning TTS models on your own dataset, and one-shot voice cloning from a single reference sample.

Pre-training, Finetuning, RLHF and Preference Optimization

Build advanced models from scratch or fine-tune existing ones with production-ready implementations of DPO, ORPO, SIMPO, and GRPO that work out of the box. No more wrestling with framework incompatibilities or debugging distributed training setups. Complete RLHF pipeline with reward modeling handles the complex orchestration automatically, from data processing to final model outputs.

Comprehensive Evals

Measure what matters with built-in evaluation tools. Run Eleuther Harness benchmarks, LLM-as-a-Judge comparisons and objective metrics in one place. Red-team your models, visualize results over time, and compare runs across experiments with clean, exportable dashboards.

Trusted by Innovative Teams

"Transformer Lab has made it easy for me to experiment and use LLMs in a completely private fashion."
— Ramanan Sivaranjan, Head of Engineering at Quantum Bridge

"The essential open-source stack for serious ML teams."
— Elena Yunusov, Executive Director at Human Feedback Foundation

"SLURM is outdated, and Transformer Lab is the future."
— Jash Mehta, Applied Research Scientist at ServiceNow Research