Changes from 5 commits
2 changes: 1 addition & 1 deletion docs/source/ko/_toctree.yml
@@ -140,7 +140,7 @@
- local: perf_train_cpu
title: Training on CPU
- local: perf_train_special
title: PyTorch training on Apple silicon
title: Apple silicon
- local: in_translation
title: (translation in progress) Intel Gaudi
- local: perf_hardware
52 changes: 10 additions & 42 deletions docs/source/ko/perf_train_special.md
@@ -1,4 +1,4 @@
<!--Copyright 2022 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -13,51 +13,19 @@ rendered properly in your Markdown viewer.

-->

# PyTorch training on Apple silicon [[PyTorch training on Apple silicon]]
# Apple silicon[[apple-silicon]]

Previously, training models on a Mac was limited to the CPU. With the release of PyTorch v1.12, you can now train models on Apple's silicon GPUs for significantly faster performance. This became possible when PyTorch integrated Apple's Metal Performance Shaders (MPS) as a backend. The [MPS backend](https://pytorch.org/docs/stable/notes/mps.html) implements PyTorch operations as Metal shaders and lets these modules run on the mps device.
Apple silicon (M series) is built on a unified memory architecture and is designed so that large models can be trained efficiently on a local machine. It also reduces data access latency, which contributes to better overall performance. Thanks to the integration with [Metal Performance Shaders (MPS)](https://pytorch.org/docs/stable/notes/mps.html), you can take advantage of these hardware benefits as-is when training models with PyTorch.

<Tip warning={true}>
The `mps` backend requires macOS 12.3 or later.

일뢀 Pytorch 연산듀은 아직 MPSμ—μ„œ μ§€μ›λ˜μ§€ μ•Šμ•„ 였λ₯˜κ°€ λ°œμƒν•  수 μžˆμŠ΅λ‹ˆλ‹€. 이λ₯Ό λ°©μ§€ν•˜λ €λ©΄ ν™˜κ²½ λ³€μˆ˜ `PYTORCH_ENABLE_MPS_FALLBACK=1` λ₯Ό μ„€μ •ν•˜μ—¬ CPU 컀널을 λŒ€μ‹  μ‚¬μš©ν•˜λ„λ‘ ν•΄μ•Ό ν•©λ‹ˆλ‹€(μ΄λ•Œ `UserWarning`이 μ—¬μ „νžˆ ν‘œμ‹œλ  수 μžˆμŠ΅λ‹ˆλ‹€).
> [!WARNING]
> 일뢀 PyTorch 연산은 아직 MPSμ—μ„œ κ΅¬ν˜„λ˜μ§€ μ•Šμ•˜μŠ΅λ‹ˆλ‹€. 였λ₯˜λ₯Ό λ°©μ§€ν•˜λ €λ©΄ ν™˜κ²½ λ³€μˆ˜ `PYTORCH_ENABLE_MPS_FALLBACK=1`을 μ„€μ •ν•˜μ—¬ CPU μ»€λ„λ‘œ λŒ€μ²΄ μ‹€ν–‰λ˜λ„λ‘ ν•˜μ„Έμš”. λ‹€λ₯Έ λ¬Έμ œκ°€ λ°œμƒν•˜λ©΄ [PyTorch](https://github.com/pytorch/pytorch/issues) μ €μž₯μ†Œμ— 이슈λ₯Ό 등둝해 μ£Όμ„Έμš”.
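
The fallback variable can also be set from Python rather than the shell. The following is a minimal sketch, not part of the changed file; it assumes the variable needs to be in place before `torch` is first imported, so it sets it at the very top of the script. The guarded import is only there so the snippet also runs on a machine without PyTorch.

```python
import os

# Assumption: set the variable before torch is first imported,
# so that the CPU fallback is picked up when the library loads.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

try:
    import torch  # imported only after the variable is set
except ImportError:
    torch = None  # allows the sketch to run without PyTorch installed

print("PYTORCH_ENABLE_MPS_FALLBACK =", os.environ["PYTORCH_ENABLE_MPS_FALLBACK"])
```

Setting the variable in the shell before launching the training script achieves the same thing and avoids any import-order concerns.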

<br>
[`TrainingArguments`] and [`Trainer`] automatically set the backend device to `mps` when an Apple silicon device is detected, so you can train on it right away without any extra configuration.
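
To check ahead of time which device a run would likely land on, you can query PyTorch directly. This is a rough sketch and not how [`Trainer`] performs its own detection: `pick_device` is a hypothetical helper name, and the only PyTorch call it relies on is `torch.backends.mps.is_available()`.

```python
def pick_device() -> str:
    """Return "mps" when PyTorch reports a usable MPS backend, else "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is not installed

    # torch.backends.mps is absent in PyTorch releases older than 1.12
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


device = pick_device()
print(f"Training would run on: {device}")
```

If this prints `cpu` on an Apple silicon Mac, the installed PyTorch build or the macOS version is the first thing to check.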

If you encounter any other error, please open an issue in the [PyTorch](https://github.com/pytorch/pytorch/issues) repository. Currently, [`Trainer`] only integrates the MPS backend.
The `mps` backend does not support [distributed training](https://pytorch.org/docs/stable/distributed.html#backends).

</Tip>
## Resources[[resources]]

Using the `mps` device gives you the following benefits:

* Train larger networks or use larger batch sizes locally
* Lower data loading latency, because the GPU's unified memory architecture gives it direct access to memory
* Lower costs, because no cloud-based GPUs or additional GPUs are needed

Make sure PyTorch is installed before you start. MPS acceleration is supported on macOS 12.3 and later.

```bash
pip install torch torchvision torchaudio
```

[`TrainingArguments`] uses the `mps` device by default when it is available, so you don't need to set the device explicitly. For example, you can run the [run_glue.py](https://github.com/huggingface/transformers/blob/main/examples/pytorch/text-classification/run_glue.py) script without any changes, and the MPS backend is enabled automatically.

```diff
export TASK_NAME=mrpc

python examples/pytorch/text-classification/run_glue.py \
--model_name_or_path google-bert/bert-base-cased \
--task_name $TASK_NAME \
- --use_mps_device \
--do_train \
--do_eval \
--max_seq_length 128 \
--per_device_train_batch_size 32 \
--learning_rate 2e-5 \
--num_train_epochs 3 \
--output_dir /tmp/$TASK_NAME/ \
--overwrite_output_dir
```

[Distributed training backends](https://pytorch.org/docs/stable/distributed.html#backends) such as `gloo` and `nccl` are not supported on the `mps` device, so training on the MPS backend is limited to a single GPU.

For more details on accelerated PyTorch training on Mac, see the [Introducing Accelerated PyTorch Training on Mac](https://pytorch.org/blog/introducing-accelerated-pytorch-training-on-mac/) blog post.
For more details on the MPS backend, see the [Introducing Accelerated PyTorch Training on Mac](https://pytorch.org/blog/introducing-accelerated-pytorch-training-on-mac/) blog post.