
Commit 3742017

Add Raspberry Pi tutorial to deploy & infer Llama model
1 parent fca0f38 commit 3742017

File tree

3 files changed: +348 -1 lines changed

docs/source/edge-platforms-section.md

Lines changed: 1 addition & 0 deletions
@@ -12,6 +12,7 @@ Deploy ExecuTorch on Android devices with hardware acceleration support.
 **→ {doc}`android-section` — Complete Android deployment guide**
 
 Key features:
+
 - Hardware acceleration support (CPU, GPU, NPU)
 - Multiple backend options (XNNPACK, Vulkan, Qualcomm, MediaTek, ARM, Samsung)
 - Comprehensive examples and demos

docs/source/embedded-section.md

Lines changed: 2 additions & 1 deletion
@@ -25,7 +25,7 @@ Start here for C++ development with ExecuTorch runtime APIs and essential tutorials
 ## Tutorials
 
 - {doc}`tutorial-arm-ethos-u` — Export a simple PyTorch model for the ExecuTorch Ethos-U backend
-
+- {doc}`raspberry_pi_llama_tutorial` — Deploy a Llama model on a Raspberry Pi with ExecuTorch
 
 ```{toctree}
 :hidden:
@@ -37,3 +37,4 @@ using-executorch-cpp
 using-executorch-building-from-source
 embedded-backends
 tutorial-arm-ethos-u
+raspberry_pi_llama_tutorial
docs/source/raspberry_pi_llama_tutorial.md

Lines changed: 345 additions & 0 deletions
@@ -0,0 +1,345 @@
# ExecuTorch on Raspberry Pi

## TLDR

This tutorial demonstrates how to deploy **Llama models on Raspberry Pi 4/5 devices** using ExecuTorch:

- **Prerequisites**: Linux host machine, Python 3.10-3.12, conda environment, Raspberry Pi 4/5
- **Setup**: Automated cross-compilation using the `setup.sh` script for ARM toolchain installation
- **Export**: Convert Llama models to the optimized `.pte` format with quantization options
- **Deploy**: Transfer binaries to the Raspberry Pi and configure runtime libraries
- **Optimize**: Apply build and performance tuning techniques
- **Result**: Efficient on-device Llama inference
## Prerequisites and Hardware Requirements

### Host Machine Requirements

**Operating System**: Linux x86_64 (Ubuntu 20.04+ or CentOS Stream 9+)

**Software Dependencies**:

- **Python 3.10-3.12** (ExecuTorch requirement)
- **conda** or **venv** for environment management
- **CMake 3.29.6+** for cross-compilation
- **Git** for repository cloning

### Target Device Requirements

**Supported Devices**: **Raspberry Pi 4** and **Raspberry Pi 5** with a **64-bit OS**

**Memory and Storage Requirements**:

- **Minimum 4GB RAM** (8GB recommended for larger models)
- **8GB+ storage** for model files and binaries
- **64-bit Raspberry Pi OS** (Bullseye or newer)
### Verification Commands

Verify your host machine compatibility:

```bash
# Check OS and architecture
uname -s  # Should output: Linux
uname -m  # Should output: x86_64

# Check Python version
python3 --version  # Should be 3.10-3.12

# Check required tools
which cmake git md5sum
cmake --version  # Should be 3.29.6 or newer
```
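The same idea applies to the target device. A quick sketch using standard commands you can run on the Pi itself (exact memory and free-space figures will vary with your board and SD card):

```bash
# On the Raspberry Pi: confirm a 64-bit OS and sufficient resources
uname -m  # Should output: aarch64
free -h   # Total memory should be 4GB+ (8GB recommended)
df -h ~   # Free space should be 8GB+
```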
## Development Environment Setup

### Clone ExecuTorch Repository

First, clone the ExecuTorch repository with Raspberry Pi support:

```bash
# Create project directory
mkdir ~/executorch-rpi && cd ~/executorch-rpi

# Clone ExecuTorch repository
git clone -b release/1.0 https://github.com/pytorch/executorch.git
cd executorch
```

### Create Conda Environment

```bash
# Create conda environment
conda create -yn executorch python=3.10.0
conda activate executorch

# Upgrade pip
pip install --upgrade pip
```
### Alternative: Virtual Environment

If you prefer Python's built-in virtual environment:

```bash
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
```

Refer to → {doc}`getting-started` for more details.
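The model export step later in this tutorial runs ExecuTorch's Python tooling, so the ExecuTorch Python package must be available in the environment you just created. If it is not already installed, the repository's install script sets it up (run from the cloned repo root; shown here with default options):

```bash
# From ~/executorch-rpi/executorch: install the ExecuTorch Python package
./install_executorch.sh
```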
## Cross-Compilation Toolchain Setup

Run the automated cross-compilation script on your Linux host machine:

```bash
# Run the Raspberry Pi setup script for Pi 5
examples/raspberry_pi/setup.sh pi5
```

On success, the script ends with output like:

```
[100%] Linking CXX executable llama_main
[100%] Built target llama_main
[SUCCESS] LLaMA runner built successfully

==== Verifying Build Outputs ====
[SUCCESS] ✓ llama_main (6.1M)
[SUCCESS] ✓ libllama_runner.so (4.0M)
[SUCCESS] ✓ libextension_module.a (89K) - static library

✓ ExecuTorch cross-compilation setup completed successfully!
```
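Before moving on, you can confirm the cross-compiled artifacts landed where the transfer step below expects them. A quick host-side check (paths taken from the `scp` commands in the deployment section):

```bash
# Verify the build outputs exist
ls -lh cmake-out/examples/models/llama/llama_main
ls -lh cmake-out/examples/models/llama/runner/libllama_runner.so

# Confirm the executable targets ARM64, not the host architecture
file cmake-out/examples/models/llama/llama_main  # Should mention: ARM aarch64
```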
## Model Preparation and Export

### Download Llama Models

Download the Llama model from Hugging Face or another source, and make sure the following files exist:

- consolidated.00.pth (model weights)
- params.json (model config)
- tokenizer.model (tokenizer)
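A quick way to confirm all three files are in place before exporting (a sketch; substitute the directory you downloaded the model into):

```bash
# All three files should be listed and non-empty
ls -lh path/to/model/{consolidated.00.pth,params.json,tokenizer.model}
```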
### Export Llama to ExecuTorch Format

After downloading the Llama model, export it to ExecuTorch format using the provided script:

```bash
# Set these paths to point to the downloaded files.
# The following example exports a Llama 3.2 model.
LLAMA_QUANTIZED_CHECKPOINT=path/to/consolidated.00.pth
LLAMA_PARAMS=path/to/params.json

python -m extension.llm.export.export_llm \
    --config examples/models/llama/config/llama_xnnpack_spinquant.yaml \
    +base.model_class="llama3_2" \
    +base.checkpoint="${LLAMA_QUANTIZED_CHECKPOINT:?}" \
    +base.params="${LLAMA_PARAMS:?}"
```

The file `llama3_2.pte` will be generated in the directory where you run the command.
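As an optional sanity check, confirm the export produced the program file, and record a checksum so the copy to the Pi can be verified later (the `.md5` file is a convenience introduced here, not something the export script emits):

```bash
# Confirm the exported program exists and record its checksum
ls -lh llama3_2.pte
md5sum llama3_2.pte > llama3_2.pte.md5
```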
## Raspberry Pi Deployment

### Transfer Binaries to Raspberry Pi

After successful cross-compilation, transfer the required files:

```bash
# Set Raspberry Pi details
export RPI_UN="pi"  # Your Raspberry Pi username
export RPI_IP="your-rpi-ip-address"

# Create deployment directory on Raspberry Pi
ssh $RPI_UN@$RPI_IP 'mkdir -p ~/executorch-deployment'

# Copy main executable
scp cmake-out/examples/models/llama/llama_main $RPI_UN@$RPI_IP:~/executorch-deployment/

# Copy runtime library
scp cmake-out/examples/models/llama/runner/libllama_runner.so $RPI_UN@$RPI_IP:~/executorch-deployment/

# Copy model file and tokenizer
scp llama3_2.pte $RPI_UN@$RPI_IP:~/executorch-deployment/
scp ./tokenizer.model $RPI_UN@$RPI_IP:~/executorch-deployment/
```
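If you recorded a checksum after the export step, you can verify the model survived the transfer intact (optional; assumes the `llama3_2.pte.md5` file suggested earlier):

```bash
# Copy the checksum file and verify the model on the Pi
scp llama3_2.pte.md5 $RPI_UN@$RPI_IP:~/executorch-deployment/
ssh $RPI_UN@$RPI_IP 'cd ~/executorch-deployment && md5sum -c llama3_2.pte.md5'
```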
### Configure Runtime Libraries on Raspberry Pi

SSH into your Raspberry Pi and configure the runtime:

#### Set up library environment

```bash
cd ~/executorch-deployment
echo 'export LD_LIBRARY_PATH=$(pwd):$LD_LIBRARY_PATH' > setup_env.sh
chmod +x setup_env.sh

# Make the binary executable
chmod +x llama_main
```
## Dry Run

```bash
source setup_env.sh
./llama_main --help
```

Make sure the output does not contain any GLIBC or other library mismatch errors. If you see any, follow the troubleshooting steps below.
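If the dry run fails to start at all, `ldd` shows which shared libraries the loader resolves; any line reading "not found" points at the problem:

```bash
# Inspect the binary's shared-library dependencies
source setup_env.sh
ldd ./llama_main  # libllama_runner.so should resolve to the deployment directory
```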
## Troubleshooting

### Issue 1: GLIBC Version Mismatch

**Problem:** The binary was compiled against a newer GLIBC version (2.38) than what's available on your Raspberry Pi (2.36).

**Error Symptoms:**

```bash
./llama_main: /lib/aarch64-linux-gnu/libm.so.6: version `GLIBC_2.38' not found (required by ./llama_main)
./llama_main: /lib/aarch64-linux-gnu/libc.so.6: version `GLIBC_2.38' not found (required by ./llama_main)
./llama_main: /lib/aarch64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.15' not found (required by ./llama_main)
./llama_main: /lib/aarch64-linux-gnu/libc.so.6: version `GLIBC_2.38' not found (required by /lib/libllama_runner.so)
```
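To confirm the mismatch, compare the GLIBC versions the binary requires against what the Pi provides. A quick check with standard tools (assumes binutils is installed on the Pi):

```bash
# Highest GLIBC versions required by the binary
objdump -T ./llama_main | grep -o 'GLIBC_[0-9.]*' | sort -Vu | tail -n 3

# GLIBC version the system provides
ldd --version | head -n 1
```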
#### Solution A: Upgrade GLIBC on Raspberry Pi (Recommended)

1. **Check your current GLIBC version:**

   ```bash
   ldd --version
   # Output: ldd (Debian GLIBC 2.36-9+rpt2+deb12u12) 2.36
   ```

2. **Upgrade to a newer GLIBC:**

   ```bash
   # Add Debian unstable repository
   echo "deb http://deb.debian.org/debian sid main contrib non-free" | sudo tee -a /etc/apt/sources.list

   # Update package lists
   sudo apt update

   # Install newer GLIBC packages
   sudo apt-get -t sid install libc6 libstdc++6

   # Reboot system
   sudo reboot
   ```

3. **Test the fix:**

   ```bash
   cd ~/executorch-deployment
   source setup_env.sh
   ./llama_main --model_path ./llama3_2.pte --tokenizer_path ./tokenizer.model --seq_len 128 --prompt "Hello"
   ```

**Important Notes:**

- Select "Yes" when prompted to restart services
- Press Enter to keep the current version of configuration files
- Back up important data before upgrading
#### Solution B: Rebuild with Raspberry Pi's GLIBC (Advanced)

If you prefer not to upgrade your Raspberry Pi system:

1. **Copy the Pi's filesystem to the host machine:**

   ```bash
   # On Raspberry Pi - install rsync
   ssh pi@<your-rpi-ip>
   sudo apt update && sudo apt install rsync
   exit

   # On host machine - copy the Pi's filesystem
   mkdir -p ~/rpi5-sysroot
   rsync -aAXv --exclude={"/proc","/sys","/dev","/run","/tmp","/mnt","/media","/lost+found"} \
       pi@<your-rpi-ip>:/ ~/rpi5-sysroot
   ```

2. **Update the CMake toolchain file:**

   ```cmake
   # Edit arm-toolchain-pi5.cmake
   # Replace this line:
   # set(CMAKE_SYSROOT "${TOOLCHAIN_PATH}/aarch64-none-linux-gnu/libc")

   # With this:
   set(CMAKE_SYSROOT "/home/yourusername/rpi5-sysroot")
   set(CMAKE_FIND_ROOT_PATH "${CMAKE_SYSROOT}")
   ```

3. **Rebuild the binaries:**

   ```bash
   # Clean and rebuild
   rm -rf cmake-out
   ./examples/raspberry_pi/setup.sh pi5 --force-rebuild

   # Verify GLIBC version
   strings ./cmake-out/examples/models/llama/llama_main | grep GLIBC_
   # Should show at most GLIBC_2.36 (matching your Pi)
   ```

---
### Issue 2: Library Not Found

**Problem:** Required libraries are not found at runtime.

**Error Symptoms:**

```bash
./llama_main: error while loading shared libraries: libllama_runner.so: cannot open shared object file
```

**Solution:**

```bash
# Ensure you're in the correct directory and the environment is set
cd ~/executorch-deployment
source setup_env.sh
./llama_main --help
```

**Root Cause:** Either `LD_LIBRARY_PATH` is not set or you're not in the deployment directory.
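Because `setup_env.sh` only affects the current shell, the setting disappears with every new SSH session. As an optional convenience (assuming the default bash shell and the deployment path used in this tutorial), you can make it persistent:

```bash
# Set the library path for every new shell
echo 'export LD_LIBRARY_PATH=$HOME/executorch-deployment:$LD_LIBRARY_PATH' >> ~/.bashrc
```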
---
### Issue 3: Tokenizer JSON Parsing Warnings

**Problem:** Warning messages about JSON parsing errors appear after running the `llama_main` binary.

**Error Symptoms:**

```bash
E tokenizers:hf_tokenizer.cpp:60] Error parsing json file: [json.exception.parse_error.101]
```

**Solution:** These warnings can be safely ignored. They don't affect model inference.

---
## Quick Test Command

After resolving any issues, test with:

```bash
cd ~/executorch-deployment
source setup_env.sh
./llama_main --model_path ./llama3_2.pte --tokenizer_path ./tokenizer.model --seq_len 128 --prompt "What is the meaning of life?"
```
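For a rough sense of generation speed at different sequence lengths, you can time a few runs. This sketch uses only the flags shown above; absolute numbers depend on your Pi model, cooling, and quantization scheme:

```bash
# Time inference at increasing sequence lengths
for len in 64 128 256; do
  echo "--- seq_len=$len ---"
  time ./llama_main --model_path ./llama3_2.pte \
    --tokenizer_path ./tokenizer.model \
    --seq_len $len --prompt "Hello"
done
```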
## Debugging Tools

Enable ExecuTorch logging:

```bash
# Set log level for debugging
export ET_LOG_LEVEL=Info
./llama_main --model_path ./llama3_2.pte --tokenizer_path ./tokenizer.model --verbose
```
## Final Run Command

```bash
cd ~/executorch-deployment
source setup_env.sh
./llama_main --model_path ./llama3_2.pte --tokenizer_path ./tokenizer.model --seq_len 128 --prompt "What is the meaning of life?"
```

Happy Inferencing!
