Implements distributed training methods, including data parallelism and pipeline parallelism across multiple GPUs.
Oct 31, 2025