Distributed training using DeepSpeed

Important

This feature is in Beta.

This page provides notebook examples of distributed training with DeepSpeed on Serverless GPU compute.

Supervised fine-tuning using DeepSpeed and TRL

This notebook demonstrates how to use the Serverless GPU Python API to run supervised fine-tuning (SFT) using the Transformer Reinforcement Learning (TRL) library with DeepSpeed ZeRO Stage 3 optimization.
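To give a sense of what the DeepSpeed side of such a setup involves, the following is a minimal ZeRO Stage 3 configuration sketch. This is an illustrative assumption, not the configuration used by the notebook; the `"auto"` values rely on the Hugging Face Transformers integration, which fills them in from the trainer's arguments at launch time.

```json
{
  "zero_optimization": {
    "stage": 3,
    "overlap_comm": true,
    "contiguous_gradients": true,
    "stage3_gather_16bit_weights_on_model_save": true
  },
  "bf16": {
    "enabled": "auto"
  },
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto"
}
```

With Stage 3, optimizer states, gradients, and model parameters are all partitioned across GPUs, which is what makes fine-tuning larger models feasible on a fixed amount of GPU memory; a configuration like this is typically passed to the trainer via the `deepspeed` training argument.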

Notebook

Get notebook