Implements distributed training methods, including data parallelism and pipeline parallelism across multiple GPUs.
May 31, 2025
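
The description does not name a framework, so as an illustration only, here is a minimal data-parallel training sketch assuming PyTorch's `DistributedDataParallel`; the model, hyperparameters, and file name are hypothetical and not taken from this project.

```python
# Minimal data-parallelism sketch (assumed PyTorch; not this project's actual code).
# Each process owns a full model replica; gradients are all-reduced during backward().
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE for each worker process.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Illustrative model; every rank holds an identical replica.
    model = nn.Linear(1024, 10).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])

    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    for step in range(10):
        # Dummy batch; in practice a DistributedSampler would give each rank
        # a distinct shard of the dataset.
        x = torch.randn(32, 1024, device=local_rank)
        y = torch.randint(0, 10, (32,), device=local_rank)

        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()   # DDP averages gradients across ranks here
        optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

A script like this would be launched with one process per GPU, e.g. `torchrun --nproc_per_node=<num_gpus> train_ddp.py` (file name hypothetical). Pipeline parallelism differs in that each GPU holds only a slice of the model's layers and micro-batches are streamed through the stages, rather than every GPU holding a full replica.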