Cherry-pick "Explicitly pass expert_tensor_parallel_size to initialize_model_parallel" (537) into r0.1.0
#557