This small board combines 512MB LPDDR2 RAM with the BCM2837 die known from the RPi 3B as a SiP (system-in-package). It is the much more powerful successor to the RPi Zero W and replaces the single ARM11 core with a quad-core Cortex-A53 clocked by default between 600 MHz and 1000 MHz. Wireless capabilities remain almost the same compared to the Zero W (minor BT revision update) and the chip is from the same family as before (Broadcom, then Cypress, now Synaptics), which is good news since RPi Trading Ltd., unlike almost all the other SBC makers out there, fixes wireless security flaws by pushing new firmware BLOBs to their distribution.
Board dimensions and connector placement are the same except for the location of most pogo pins on the backside. For pictures visit Jeff Geerling (but please ignore benchmark numbers, procedures and recommendations there). For a good comparison between both boards see CNX-Software (but please ignore the "2.5A power requirement" there).
Why look at this little SBC thrown out recently by RPi Trading Ltd.? Because it's one of the few SBCs that have a lower idle consumption than my laptop.
Setting up this thing is straightforward as always. You flash the latest Raspberry Pi OS (at the time of this writing that's the Bullseye release) to a quality A1 rated SD card, afterwards eject/re-insert the card to ensure the FAT partition will be mounted. In my example as /Volumes/boot/. It's then as easy as:
touch /Volumes/boot/ssh
echo -e "\ndtoverlay=dwc2" >>/Volumes/boot/config.txt
nano /Volumes/boot/cmdline.txt
There you carefully edit the single line and insert modules-load=dwc2,g_ether after rootwait. Then you insert the SD card into the board and use a USB cable/adapter to connect the board's OTG port (not the power port at the edge!) to one of your computer's USB3 ports (USB3 since it provides 900mA instead of 500mA).
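The manual cmdline.txt edit described above can also be scripted. The following is only a sketch, assuming the word rootwait appears exactly once in the file; the function name add_gadget_modules is made up for illustration and is not part of any Raspberry Pi tooling.

```shell
# Insert 'modules-load=dwc2,g_ether' right after 'rootwait' in a cmdline.txt.
# $1: path to the file, e.g. /Volumes/boot/cmdline.txt
add_gadget_modules() {
    file="$1"
    # Do nothing if the parameter is already present (keeps the edit idempotent).
    if grep -q 'modules-load=dwc2,g_ether' "$file"; then
        return 0
    fi
    # -i.bak is accepted by both BSD (macOS) and GNU sed and leaves a backup copy.
    sed -i.bak 's/rootwait/rootwait modules-load=dwc2,g_ether/' "$file"
}
```

Called as add_gadget_modules /Volumes/boot/cmdline.txt it leaves the rest of the single line untouched and keeps the original as cmdline.txt.bak next to it.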
The above procedure results in the Zero using USB Gadget mode to appear as a network adapter on the USB port:
RNDIS/Ethernet Gadget:
Product ID: 0xa4a2
Vendor ID: 0x0525 (PLX Technology, Inc.)
Version: 5.10
Speed: Up to 480 Mb/s
Manufacturer: Linux 5.10.63-v7+ with 3f980000.usb
Location ID: 0x14100000 / 2
Current Available (mA): 500
Current Required (mA): 2
Extra Operating Current (mA): 0
In case your computer's OS is not crappy you can now simply access the Zero as raspberrypi.local (no need for stupid assigning of static IP addresses or BS like that):
bash-3.2$ ping raspberrypi.local
PING raspberrypi.local (169.254.188.97): 56 data bytes
Request timeout for icmp_seq 0
64 bytes from 169.254.188.97: icmp_seq=1 ttl=64 time=0.471 ms
64 bytes from 169.254.188.97: icmp_seq=2 ttl=64 time=0.509 ms
64 bytes from 169.254.188.97: icmp_seq=3 ttl=64 time=0.647 ms
^C
--- raspberrypi.local ping statistics ---
4 packets transmitted, 3 packets received, 25.0% packet loss
round-trip min/avg/max/stddev = 0.471/0.542/0.647/0.076 ms
bash-3.2$ ssh pi@raspberrypi.local
The authenticity of host 'raspberrypi.local (fe80::6154:7798:c601:106d%en10)' can't be established.
ECDSA key fingerprint is SHA256:nDJzER2skvqSeWydqBH2FwvFYUfj3Wn3uVYFR49JjQk.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added 'raspberrypi.local,fe80::6154:7798:c601:106d%en10' (ECDSA) to the list of known hosts.
pi@raspberrypi.local's password:
Linux raspberrypi 5.10.63-v7+ #1459 SMP Wed Oct 6 16:41:10 BST 2021 armv7l
The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.
Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Fri Nov 5 14:10:41 2021 from 2001:a61:3a9f:1e01:8c09:8987:d565:9c4b
SSH is enabled and the default password for the 'pi' user has not been changed.
This is a security risk - please login as the 'pi' user and type 'passwd' to set a new password.
Wi-Fi is currently blocked by rfkill.
Use raspi-config to set the country before use.
pi@raspberrypi:~ $
All you need to do now is the following
passwd # set a secure password
sudo nano /etc/hostname # assign a unique name to the board, e.g. 'zero2'
sudo raspi-config # set country code and Wi-Fi details
sudo apt install zram-tools # avoid silly swapping to SD card
sudo shutdown -h now
You might revoke the changes to /boot/config.txt and /boot/cmdline.txt (might save you 20mW) but then you'd need to redo this again to flawlessly SSH into your board via the USB port. Otherwise just attach the Zero 2 now to a normal power source using the power Micro USB port and login through Wi-Fi via ssh pi@zero2 (given you assigned this hostname in the step before and you run a non-crappy DNS/DHCP server combo at home / in your lab).
BTW: A better approach than installing zram-tools will be discussed at the end of this article.
The Zero 2 is way more performant than its predecessor but also needs a little more juice. The official power requirement of 5V@2.5A (12.5W) is BS or most probably RPi Trading Ltd. trying to sell old Micro USB wall warts they still have in stock for RPi 3B/3B+. When the CPU cores are fully utilized the consumption does not exceed 2.5W (or 500mA), so even when you also utilize the VideoCore (e.g. to encode a video stream coming from the CSI camera or to decode video to be displayed) and have a USB consumer attached you'll have a hard time exceeding 5W (1000mA).
Aside from that, ordering a Micro USB PSU from RPi Trading Ltd. is a good idea if you want to attach USB consumers to the Zero 2. Not due to the high amperage rating but since this PSU with a fixed and thick cable really provides stable +5V to the board, in contrast to the majority of USB charging variants where huge voltage drops happen under load (and this – undervoltage – is the real issue in a 5V powered world).
Zero 2's single threaded performance is higher than the original Zero's even when comparing Zero 2 at 600 MHz with Zero at 1000 MHz. The latter shows a 7-ZIP MIPS score of 450 @ 1000 MHz while with Zero 2 it's as follows iterating through 600-1000 MHz with one CPU core fully utilized:
Sysfs/ThreadX/Tested: MIPS / Temp / Watt
600 / 600 / 600: 511 37.4°C 1280mW
700 / 700 / 680: 593 39.5°C 1390mW
800 / 800 / 800: 676 39.9°C 1510mW
900 / 900 / 900: 755 40.2°C 1570mW
1000 / 1000 / 970: 835 40.6°C 1610mW
When utilizing all 4 cores it looks like this (a fan blowing over the SiP's surface which explains the low temperatures – for temps with 'convection only' see below):
Sysfs/ThreadX/Tested: MIPS / Temp / Watt
600 / 600 / 600: 1870 42.7°C 1700mW
700 / 700 / 700: 2152 45.1°C 1940mW
800 / 800 / 800: 2465 46.9°C 2060mW
900 / 900 / 880: 2750 49.0°C 2170mW
1000 / 1000 / 1000: 3009 50.8°C 2330mW
So if you're concerned about max consumption of this thing and don't need highest single-threaded performance a good idea is to lock down CPU cores to 600 MHz: still same or better single-threaded performance than original Zero but multi-threaded almost 4 times faster.
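One way to lock the cores to 600 MHz at runtime is the generic Linux cpufreq sysfs interface. This is a minimal sketch and not necessarily how the measurements above were done; lock_cpufreq is a hypothetical helper name, the setting needs root and is lost on reboot.

```shell
# Cap the maximum CPU clock for every cpufreq policy.
# $1: frequency in kHz (600000 = 600 MHz)
# $2: cpufreq sysfs directory (optional, defaults to the standard location)
lock_cpufreq() {
    khz="$1"
    root="${2:-/sys/devices/system/cpu/cpufreq}"
    for policy in "$root"/policy*; do
        # Skip anything that is not a writable policy directory.
        [ -w "$policy/scaling_max_freq" ] || continue
        echo "$khz" > "$policy/scaling_max_freq"
    done
}
```

As root, lock_cpufreq 600000 caps the clock and lock_cpufreq 1000000 restores the default upper limit.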
BTW: Why 7-Zip's internal benchmark? Because it is a good representation of 'server workloads' and not so much a compiler benchmark like the majority of other kitchen-sink benchmarks that produce different numbers with every major OS/GCC release (full explanation).
Back to clockspeeds and consumption: if you're running with the older Buster image there's another benefit of staying at 600 MHz. At this clockspeed a lower VCore voltage is used to feed the ARM cores:
Cpufreq OPP: 1000 ThreadX: 1000 Measured: 1000 @ 1.2375V
Cpufreq OPP: 900 ThreadX: 900 Measured: 870 @ 1.2375V
Cpufreq OPP: 800 ThreadX: 800 Measured: 800 @ 1.2375V
Cpufreq OPP: 700 ThreadX: 700 Measured: 660 @ 1.2375V
Cpufreq OPP: 600 ThreadX: 600 Measured: 580 @ 1.2V
In idle that's below 39°C SoC temp (w/o heatsink/fan and at an ambient temp of 23°C) and less than 800mW idle consumption:
Different picture with the newly released Bullseye image: SoC temp +5°C and consumption almost 200mW higher:
Cpufreq OPP: 1000 ThreadX: 1000 Measured: 1000 @ 1.2438V
Cpufreq OPP: 900 ThreadX: 900 Measured: 900 @ 1.2438V
Cpufreq OPP: 800 ThreadX: 800 Measured: 785 @ 1.2438V
Cpufreq OPP: 700 ThreadX: 700 Measured: 700 @ 1.2438V
Cpufreq OPP: 600 ThreadX: 600 Measured: 535 @ 1.2438V
With the Bullseye image the ARM cores are always fed with a higher voltage, even when idle at 600 MHz. But a difference of less than 50mV is IMO not enough to explain almost 200mW higher idle consumption. Needs more research.
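Clockspeed, VCore voltage and SoC temperature as discussed above can be queried from ThreadX with vcgencmd. The sketch below assumes the usual output formats (volt=1.2375V, temp=39.2'C, frequency(48)=600000000); vc_value and vc_sample are illustrative names, and vc_sample only works on a Raspberry Pi.

```shell
# Strip the 'key=' prefix and the unit suffix from a line of vcgencmd output.
# 'volt=1.2375V' -> 1.2375, "temp=39.2'C" -> 39.2
vc_value() {
    printf '%s\n' "$1" | sed -e 's/^[^=]*=//' -e "s/[V'C]*\$//"
}

# Print one sample: ARM clock in MHz, core voltage and SoC temperature.
# Requires vcgencmd, so it is defined here but not called.
vc_sample() {
    hz=$(vc_value "$(vcgencmd measure_clock arm)")
    volts=$(vc_value "$(vcgencmd measure_volts core)")
    temp=$(vc_value "$(vcgencmd measure_temp)")
    echo "$((hz / 1000000)) MHz @ ${volts}V, ${temp}°C"
}
```

Running vc_sample once on an idle board and once under load on both Buster and Bullseye should show the idle voltage difference described above.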
BTW: If you want to use the old Buster image be prepared for 'hacks' (manually adding and renaming the device-tree files with bcm2710-rpi-zero-2 in their name from here) or you need an older RPi model to once boot and perform all updates there and only then the necessary support files for Zero 2 become part of the installation on this SD card. Otherwise the board simply won't boot.
Overclocking is not the best idea when you want low consumption (why buy a Zero then?). It's possible, but at the cost of stability or power efficiency. For higher clockspeeds to work stably the main OS needs to be told to increase the Vcore voltage for the ARM cores via the over_voltage parameter in config.txt (this is the major config file for ThreadX running on the VideoCore and fully controlling the ARM domain – details).
For e.g. 1200 MHz to work stably you might need to define over_voltage=2 which increases the Vcore voltage the ARM cores are fed with. A quick test through 7 different settings ends up with these voltage values (they differ slightly between reboots for reasons unknown to me):
- no adjustment: 1.2375V - 1.2500V
- over_voltage=1: 1.2688V
- over_voltage=2: 1.2938V
- over_voltage=3: 1.3188V
- over_voltage=4: 1.3438V
- over_voltage=5: 1.3625V
- over_voltage=6: 1.3875V - 1.3938V
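To see how much each over_voltage step adds, the differences between consecutive readings can be computed quickly. This sketch uses the values listed above (the lower bound where a range was observed); voltage_steps is just an illustrative helper.

```shell
# Print the difference in mV between consecutive voltage readings.
voltage_steps() {
    printf '%s\n' "$@" | awk 'NR > 1 { printf "%.1f\n", ($1 - prev) * 1000 } { prev = $1 }'
}

# Readings for over_voltage=1 through over_voltage=6.
voltage_steps 1.2688 1.2938 1.3188 1.3438 1.3625 1.3875
```

Most steps come out at 25 mV, which would fit a regulator step size of 25 mV per over_voltage increment. The smaller step between over_voltage=4 and 5 is probably within the reboot-to-reboot variation mentioned above.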
With the Buster image this only affects situations where the CPU is rather utilised and clockspeeds have been ramped up by the cpufreq driver while idle mode remains unaffected (1.2V at 600 MHz). But with the Bullseye image + overvolting the ARM cores are fried all the time and even when idling at 600 MHz:
Anyway: with a slight overclock to 1.2 GHz the Zero 2 performs almost at the same level as the unfortunate RPi 3B+ at 1.4 GHz. This is partially related to memory performance, which improved with Zero 2 (or maybe with all RPi models due to firmware/ThreadX optimisations within the last years? It's always somewhat stupid to compare benchmark results made in different years):
RPi 3B+ (old measurements with firmware/ThreadX version from 'Jun 7 2018'):
standard memcpy : 1050.1 MB/s
standard memset : 1505.7 MB/s (0.2%)
block size : single random read / dual random read
1024 : 0.0 ns / 0.0 ns
2048 : 0.0 ns / 0.0 ns
4096 : 0.0 ns / 0.0 ns
8192 : 0.0 ns / 0.0 ns
16384 : 0.0 ns / 0.0 ns
32768 : 0.0 ns / 0.0 ns
65536 : 5.4 ns / 9.2 ns
131072 : 8.2 ns / 13.1 ns
262144 : 9.7 ns / 14.8 ns
524288 : 16.6 ns / 26.3 ns
1048576 : 91.3 ns / 142.7 ns
2097152 : 131.6 ns / 183.6 ns
4194304 : 158.0 ns / 204.8 ns
8388608 : 171.7 ns / 214.3 ns
16777216 : 180.3 ns / 220.5 ns
33554432 : 185.5 ns / 224.7 ns
67108864 : 188.5 ns / 227.2 ns
Compared to Zero 2 W with recent software: lower latency and much better memcpy score.
standard memcpy : 1295.5 MB/s (2.6%)
standard memset : 1570.9 MB/s (0.9%)
block size : single random read / dual random read
1024 : 0.0 ns / 0.0 ns
2048 : 0.0 ns / 0.0 ns
4096 : 0.0 ns / 0.0 ns
8192 : 0.0 ns / 0.0 ns
16384 : 0.0 ns / 0.0 ns
32768 : 0.0 ns / 0.0 ns
65536 : 6.4 ns / 11.0 ns
131072 : 9.9 ns / 15.7 ns
262144 : 11.6 ns / 17.7 ns
524288 : 12.9 ns / 19.5 ns
1048576 : 81.1 ns / 125.8 ns
2097152 : 121.8 ns / 165.4 ns
4194304 : 145.0 ns / 185.8 ns
8388608 : 157.4 ns / 195.6 ns
16777216 : 165.6 ns / 201.2 ns
33554432 : 171.2 ns / 206.2 ns
67108864 : 175.3 ns / 209.3 ns
BTW: All power measurements above were made with a Netio PowerBOX 4K, which means 'wall wart included' and also all losses in the Micro USB cable. That's in contrast to most other consumption numbers on the net that are made with USB powermeters attached directly to the board. Jeff Geerling measured 620 mW idle consumption, my number is ~750 mW with active Wi-Fi connection but disabled HDMI (~770 mW with USB gadget mode and network on USB OTG port). Included in my numbers are a 7-year-old USB charger and an AWG20 rated 1.8m Micro USB cable.
The ARM cores in the RP3A0-AU SiP are Cortex-A53 (64-bit capable ARMv8 designs) while 'Raspberry Pi OS' is still built for 1st Gen RPi models like the original Zero (ARM11/ARMv6). Usually it's a good idea to build software with available CPU features enabled.
As an example: when running a 64-bit/ARMv8 userland on any of the ARMv8 RPi (3B/3B+/4B/Zero2) then the infamous 'sysbench cpu' benchmark reports scores at least 15 times better. That's probably the main reason why RPi fanboys want 64-bit, aside from other 'benchmarking gone wrong' adventures like the Phoronix test suite.
Another reason why running an arm64 userland is desired is massively improved AES crypto performance since almost all 64-bit ARM CPU designs licensed 'ARMv8 Crypto Extensions'. Only known exceptions: Amlogic S905 as used on ODROID-C2 / NanoPi K2 and every SoC/SiP RPi Trading Ltd. ever used (and they can't do anything about it since they have to take what Broadcom provides). AES performance of any RPi simply sucks compared to almost every other modern ARM SoC out there (check column 7 in sbc-bench results list).
That's an el cheapo RPi Zero 2 competitor called Radxa Zero with a quad core Cortex-A53 and ARMv8 Crypto Extensions:
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
aes-256-cbc 140278.26k 374453.87k 634580.22k 780679.17k 836960.26k 840592.04k
That's a RPi 4 (lacking ARMv8 Crypto Extensions) at 1.8GHz achieving not even 10% of the above performance when looking at relevant chunk sizes:
aes-256-cbc 60956.92k 72050.71k 76405.85k 77508.95k 77957.80k 77824.00k
And when switching to a 64-bit userland on RPi 4 it gets even worse:
aes-256-cbc 33526.93k 35271.89k 36004.01k 36201.13k 36263.25k 36257.79k
Now we're at less than 5% of the performance of another ARM SoC where the manufacturer spent the few cents (per chip) to license ARMv8 Crypto Extensions.
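The 'not even 10%' and 'less than 5%' figures can be checked with a line of awk. The snippet below takes the 16384-byte column of the aes-256-cbc results quoted above; percent_of is a made-up helper name.

```shell
# Express one throughput figure as a percentage of another.
percent_of() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b * 100 }'
}

percent_of 77824.00 840592.04   # RPi 4 with 32-bit userland vs. Radxa Zero
percent_of 36257.79 840592.04   # RPi 4 with 64-bit userland vs. Radxa Zero
```

This prints roughly 9.3 and 4.3 percent, matching the claims made in the text.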
But ruined AES crypto performance is not the only reason why a 64-bit userland sucks on RPi Zero 2. The device has only 512 MB RAM that is shared between the primary OS (ThreadX) and any secondary OS like Linux. Processes/services built for 64-bit have a much much larger memory footprint compared to the standard Raspberry Pi OS (which is not just 32-bit but specifically built for ARMv6! Please keep this in mind when you read somewhere on the Internet about '32-bit vs. 64-bit' and folks run their comparisons on an RPi).
As a rule of thumb any process needs almost twice as much memory in 64-bit mode compared to 32-bit. You can see your Zero 2 swapping all day long running a 64-bit userland while everything runs off RAM smoothly with 32-bit. Your 64-bit apps are handled by the oom-killer (a process killing other processes that need 'too much RAM') while they will happily do what they should when built for 32-bit. If you build a cluster out of Raspberries and your processes are memory constrained, then you'll need almost twice as many cluster nodes (RPi thingies to be bought and powered) when running 64-bit compared to the standard userland.
And no, 64-bit is not faster in general. Mostly some weird benchmarks show higher scores (sysbench or Phoronix stuff that benefits from totally different CFLAGS here or there).
Why no 64-bit numbers from Zero 2? Because that would be a waste of time and numbers already exist (see RPi 4 scores above from sbc-bench results list or this Github issue showing the horrible memory requirements when running 64-bit userland).
Ok, no 64-bit userland. What about using a 64-bit kernel? Sure, why not. Adding arm_64bit=1 to config.txt will do the job and after a reboot 50MB RAM are missing (in other words: 10% of available RAM on RPi Zero 2). Performance will remain the same though.
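To verify which combination of kernel and userland is actually running after such a change, the kernel's machine name and the userland's word size can be compared. A small sketch, with describe_bitness being a hypothetical helper:

```shell
# Describe the kernel/userland combination.
# $1: kernel machine name as printed by 'uname -m'
# $2: userland word size as printed by 'getconf LONG_BIT'
describe_bitness() {
    case "$1" in
        aarch64|arm64) kernel="64-bit" ;;
        armv*)         kernel="32-bit" ;;
        *)             kernel="unknown ($1)" ;;
    esac
    echo "${kernel} kernel, ${2}-bit userland"
}
```

On the board, describe_bitness "$(uname -m)" "$(getconf LONG_BIT)" should report a 64-bit kernel with a 32-bit userland once arm_64bit=1 is active on the standard Raspberry Pi OS.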
For whatever reasons the guy calling himself 'the face of Armbian' started to advertise 'Armbian for Raspberry Pi 4' (also suggesting this would be a great thing). At least there are OS images lying around at https://github.com/rpardini/armbian-release/releases so let's try it out.
What is inside these images? A mixture of ThreadX blobs from RPi Trading Ltd. on the FAT partition (that's the main OS), then kernel packages from Ubuntu for the RPi family and an Armbian userland debootstrapped from Ubuntu package sources. Both kernel and userland are 64-bit and as such we already know how this exercise will end.
I had to boot this 'Jammy Armbian' image on an RPi 4 first to
- update the primary operating system (copying over the ThreadX blobs from a Raspberry Pi OS install that has been updated to latest version using rpi-update)
- add the Zero 2 dtb files to the FAT partition
- add firmware blobs to the ext4 partition for Wi-Fi to work
- copy over kernel images and modules from the aforementioned Raspberry Pi OS install
- adjust config.txt to minimize ThreadX memory consumption, adjust the crazy overvolting/overclocking and also replace the memory hungry Ubuntu kernel with most recent from RPi guys (5.10.78-v8+ aarch64)
- add a wireless network with nmtui
Only afterwards the Zero 2 was happy (see at the end of armbianmonitor -u output). While for someone used to Armbian it's probably nice to have a familiar environment, using this image on anything with less than 2 GB makes no sense at all since... 64-bit and therefore everything needing much much more memory compared to a 32-bit userland.
Trying sbc-bench as an example: all 7-zip benchmarks have been killed due to out of memory: http://ix.io/3Fmf. And while accessing the CSI camera works in the meantime on 64-bit it still requires giving away 128MB RAM to ThreadX (camera details discussed later) and you end up with this little left on a Zero:
tk@zero-2:~$ free -h
total used free shared buff/cache available
Mem: 347Mi 91Mi 102Mi 1.0Mi 154Mi 246Mi
Swap: 173Mi 0B 173Mi
Given that each and every process needs almost twice as much memory compared to a 32-bit/armhf userland there's not that much memory left for other processes. On top of most processes performing worse on 64-bit, everything slows down even more as soon as swap/zram kicks in to fight 'out of memory' situations. As said before: while being able to use a 64-bit userland has some advantages for certain use cases it makes not that much sense on this limited hardware with that little memory.
The Zero 2 has a really small PCB size and as such not that much heat could be transferred from the SiP through the ball grid array into a copper ground plane (that's what the RPi guys started to do on the larger boards from RPi 3B+ on). And unfortunately the SoC die is made in an ancient 40nm process that is really not power efficient by today's standards.
Applying my 'standard heatsink' with appropriate fin spacing for passive cooling (letting convection help) does not provide that much benefit as long as there's enough radiation possible: at an ambient temperature of 23°C the idle temp is just ~1.5°C lower (40.8°C vs. 39.2°C with heatsink applied). This is the board lying flat on a table without any enclosure.
When walking through all available cpufreq OPP under load with sbc-bench -p 0-3 (execute 7-Zip's internal benchmark 3 times at each clockspeed and report averages back) we see the same performance but temperature differences between ~2.5°C at 600 MHz and ~5.5°C at 1000 MHz.
The situation changes when we cram the small board into a tiny enclosure like the official one since this functions somewhat like an oven. Idle temps are now at 44°C and of course other temperatures are also higher:
Left column is board w/o heatsink on a table, middle with heatsink, right is heatsink in official enclosure:
cpufreq 7-Zip-MIPS Temperature °C
600 1828 / 1829 / 1829 50.8 / 47.2 / 56.4
700 2143 / 2145 / 2132 57.6 / 53.0 / 60.9
800 2440 / 2446 / 2427 62.8 / 58.0 / 64.5
900 2746 / 2742 / 2745 66.9 / 62.1 / 67.7
1000 3027 / 3034 / 2992 71.8 / 66.1 / 71.4
After some load peak it also takes a long time to get temps back to normal/idle in a tiny enclosure (approx. half an hour with Buster, on Bullseye with different idle VCore behaviour it will both take longer and idle temps will be higher too):
With a small load peak we've seen temperatures going up above 70°C to decline afterwards slowly. But what about constant full load? With heatsink but inside the tiny enclosure it will throttle for sure. Of course Linux won't tell you and you need to query the primary operating system via vcgencmd (which is what sbc-bench -m is doing on Raspberries):
root@raspberrypi:~# sbc-bench -m
Time fake/real load %cpu %sys %usr %nice %io %irq Temp VCore
14:20:27: 1000/1000MHz 2.98 62% 1% 60% 0% 0% 0% 79.5°C 1.2375V
14:20:33: 1000/ 941MHz 3.06 79% 2% 76% 0% 0% 0% 80.6°C 1.2375V
14:20:38: 1000/ 941MHz 3.14 87% 1% 85% 0% 0% 0% 80.6°C 1.2375V
14:20:43: 1000/ 941MHz 3.04 75% 2% 71% 0% 0% 0% 81.7°C 1.2375V
14:20:48: 1000/ 834MHz 3.20 75% 2% 73% 0% 0% 0% 81.7°C 1.2375V
14:20:53: 1000/1000MHz 3.27 71% 1% 69% 0% 0% 0% 79.5°C 1.2375V
14:20:59: 1000/1000MHz 3.24 49% 1% 47% 0% 0% 0% 79.5°C 1.2375V
14:21:04: 1000/ 887MHz 3.14 80% 2% 77% 0% 0% 0% 80.6°C 1.2375V
14:21:09: 1000/ 941MHz 3.21 88% 1% 86% 0% 0% 0% 81.1°C 1.2375V
14:21:14: 1000/ 941MHz 3.28 78% 3% 74% 0% 0% 0% 80.6°C 1.2375V
14:21:19: 1000/ 834MHz 3.41 78% 1% 77% 0% 0% 0% 80.6°C 1.2375V
14:21:25: 1000/ 995MHz 3.46 72% 1% 70% 0% 0% 0% 79.5°C 1.2375V
14:21:30: 1000/ 941MHz 3.34 72% 2% 69% 0% 0% 0% 81.1°C 1.2375V
14:21:35: 1000/ 834MHz 3.40 89% 1% 87% 0% 0% 0% 81.7°C 1.2375V
14:21:40: 1000/ 887MHz 3.29 74% 2% 71% 0% 0% 0% 81.7°C 1.2375V
14:21:45: 1000/ 834MHz 3.18 83% 2% 79% 0% 0% 0% 81.7°C 1.2375V
14:21:50: 1000/ 834MHz 3.09 86% 1% 84% 0% 0% 0% 81.1°C 1.2375V
14:21:55: 1000/ 834MHz 2.92 59% 3% 56% 0% 0% 0% 81.7°C 1.2375V
14:22:01: 1000/ 834MHz 3.01 85% 1% 84% 0% 0% 0% 81.7°C 1.2375V
14:22:06: 1000/ 834MHz 3.09 78% 2% 75% 0% 0% 0% 81.7°C 1.2375V
14:22:11: 1000/ 834MHz 3.32 79% 2% 77% 0% 0% 0% 81.7°C 1.2375V
With full load it takes approximately 15 min. for the board to reach the thermal threshold of 80°C, then hidden throttling kicks in and maximum performance drops linearly with cpufreq. Even with a heatsink, cramming this little thing into a tiny enclosure without any airflow is not the best idea when you want to operate it at full load constantly. You will experience throttling for sure (and in parallel a drop in consumption – see Performance and consumption above for what to expect at which clockspeed).
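Besides watching the real clockspeed, the firmware also keeps a throttling bit mask that can be read with vcgencmd get_throttled. The decoder below is a sketch based on the bit meanings as documented for Raspberry Pi firmware, to the best of my knowledge; decode_throttled is an illustrative name.

```shell
# Decode the bit mask reported by 'vcgencmd get_throttled'.
# $1: value such as 0x50005 (the part after 'throttled=')
decode_throttled() {
    mask=$(($1))
    for entry in \
        "0:under-voltage detected" \
        "1:ARM frequency capped" \
        "2:currently throttled" \
        "3:soft temperature limit active" \
        "16:under-voltage has occurred" \
        "17:ARM frequency capping has occurred" \
        "18:throttling has occurred" \
        "19:soft temperature limit has occurred"
    do
        bit=${entry%%:*}    # part before the colon: bit position
        text=${entry#*:}    # part after the colon: description
        if [ $(( (mask >> bit) & 1 )) -eq 1 ]; then
            echo "$text"
        fi
    done
}
```

On the board this would be called as decode_throttled "$(vcgencmd get_throttled | cut -d= -f2)". The bits from 16 upwards stay set until the next reboot, so they also reveal throttling that happened while nobody was looking.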
The SD card interface is SDXC compliant and can as such cope with SD cards up to 2TB once available. Unfortunately a voltage switch from 3.3V to 1.8V has not been implemented so SD card access is limited to High Speed (HS) mode. Quick test via iozone -e -I -a -s 100M -r 4k -r 16k -r 512k -r 1024k -r 16384k -i 0 -i 1 -i 2 on a 64GB SanDisk Extreme Pro A2:
random random
kB reclen write rewrite read reread read write
102400 4 1864 1901 6688 6710 5331 2962
102400 16 5848 5961 14174 14197 14159 6833
102400 512 20194 20340 22777 22876 22870 18953
102400 1024 20388 20124 22847 22900 22904 18276
102400 16384 20351 19905 23036 23037 23036 19711
Sequential reads/writes max out at 23/20 MB/s, random IO performance courtesy of a more expensive A2 rated card made for the use case. Same SD card with same installation in a RPi 4 capable of SDR50 shows better sequential and slightly better random IO performance:
random random
kB reclen write rewrite read reread read write
102400 4 1932 1879 8612 8634 6572 3110
102400 16 6860 6958 21676 21697 21637 6678
102400 512 33134 32580 42697 42699 42688 18364
102400 1024 32545 33856 43555 43554 43536 32343
102400 16384 32885 34067 44351 44351 44350 33558
Wi-Fi performance: not able to test since I'm living in an urban area with lots of neighbours (250+ wireless networks visible).
Since the chip has no (RG)MII interface exposed our only options are SPI (horribly low performance) or USB2. When choosing a USB NIC it's important to get one with good features and driver support so the only real choice for Gigabit Ethernet today is a dongle with an RTL8153B inside.
Adding the dongle to the board with a short network cable and an established GbE link to an EEE enabled switch port nearby adds 950 mW to the board's consumption. This number will vary of course if a different dongle is used, a longer network cable or a switch port not supporting EEE.
A quick test with iperf3 over a direct connection between Zero 2 and MacBook (important to get link local addresses so measurements will not be negatively affected by a network stack trying to route packets through Wi-Fi since both devices are connected to the same wireless network) shows the following numbers:
- Incoming: stable 338 Mbits/sec utilising cpu0 at 55%-60% (at 1000 MHz) and ~2380mW consumption (+1620mW compared to idle w/o USB NIC)
- Outgoing: stable 311 Mbits/sec with a CPU utilization less than 10% (at 1000 MHz) and ~2200mW consumption (+1440mW compared to idle w/o USB NIC)
These are pretty good throughput numbers for USB2 attached GbE, at least faster than what you get with RPi 3B+ and its crippled network chip.
pi@raspberrypi:~ $ iperf3 -c mac-tk.local ; iperf3 -R -c mac-tk.local
Connecting to host mac-tk.local, port 5201
[ 5] local 169.254.53.178 port 44194 connected to 169.254.175.23 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 37.5 MBytes 314 Mbits/sec 0 174 KBytes
[ 5] 1.00-2.00 sec 37.0 MBytes 310 Mbits/sec 0 174 KBytes
[ 5] 2.00-3.00 sec 37.0 MBytes 311 Mbits/sec 0 174 KBytes
[ 5] 3.00-4.00 sec 37.0 MBytes 311 Mbits/sec 0 187 KBytes
[ 5] 4.00-5.00 sec 37.2 MBytes 312 Mbits/sec 0 187 KBytes
[ 5] 5.00-6.00 sec 37.0 MBytes 310 Mbits/sec 0 187 KBytes
[ 5] 6.00-7.00 sec 37.2 MBytes 312 Mbits/sec 0 187 KBytes
[ 5] 7.00-8.00 sec 37.0 MBytes 310 Mbits/sec 0 187 KBytes
[ 5] 8.00-9.00 sec 37.0 MBytes 310 Mbits/sec 0 187 KBytes
[ 5] 9.00-10.00 sec 37.1 MBytes 311 Mbits/sec 0 187 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 371 MBytes 311 Mbits/sec 0 sender
[ 5] 0.00-10.01 sec 371 MBytes 311 Mbits/sec receiver
iperf Done.
Connecting to host mac-tk.local, port 5201
Reverse mode, remote host mac-tk.local is sending
[ 5] local 169.254.53.178 port 44200 connected to 169.254.175.23 port 5201
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 40.4 MBytes 338 Mbits/sec
[ 5] 1.00-2.00 sec 40.3 MBytes 338 Mbits/sec
[ 5] 2.00-3.00 sec 40.3 MBytes 338 Mbits/sec
[ 5] 3.00-4.00 sec 40.4 MBytes 339 Mbits/sec
[ 5] 4.00-5.00 sec 40.3 MBytes 338 Mbits/sec
[ 5] 5.00-6.00 sec 40.3 MBytes 338 Mbits/sec
[ 5] 6.00-7.00 sec 40.1 MBytes 336 Mbits/sec
[ 5] 7.00-8.00 sec 40.3 MBytes 338 Mbits/sec
[ 5] 8.00-9.00 sec 40.3 MBytes 338 Mbits/sec
[ 5] 9.00-10.00 sec 40.0 MBytes 336 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate
[ 5] 0.00-10.01 sec 404 MBytes 339 Mbits/sec sender
[ 5] 0.00-10.00 sec 403 MBytes 338 Mbits/sec receiver
iperf Done.
Repeating the measurement after locking down CPU cores to 600 MHz ends up with 328 Mbits/sec incoming (maxing out one CPU core) and 305 Mbits/sec outgoing (CPU utilization less than 15%). I did not manage to move USB interrupts away from cpu0, so if you plan on running the Zero 2 with GbE you might want to look into cgroups and/or taskset to move your application processes to cpu1-cpu3 so they do not interfere with IRQ processing on the first ARM core.
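Keeping application processes off cpu0 with taskset could look like the following sketch. The helper cpus_without_cpu0 is made up for illustration; it only builds the CPU list that taskset -c expects.

```shell
# Build a CPU list for 'taskset -c' that leaves out cpu0.
# $1: number of CPU cores (4 on the Zero 2)
cpus_without_cpu0() {
    if [ "$1" -le 1 ]; then
        echo 0                    # single core: nothing to move away from
    elif [ "$1" -eq 2 ]; then
        echo 1
    else
        echo "1-$(( $1 - 1 ))"    # e.g. 1-3 on a quad core
    fi
}
```

A new process would then be started with taskset -c "$(cpus_without_cpu0 "$(nproc)")" followed by the command, while an already running one can be moved with taskset -cp 1-3 and its PID.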
Speaking about USB... we've already talked about using the OTG port in USB gadget mode as 'network adapter' directly connecting the RPi to a computer's USB port. While this is nice for initial setup this can also be normal mode of operation. Testing this for performance on my MacBook ends up with ok-ish throughput numbers:
- Incoming: 220 Mbits/sec utilising cpu0 at 15%-20% (at 1000 MHz)
- Outgoing: 155 Mbits/sec with a CPU utilization less than 5% (at 1000 MHz)
Sorry, no consumption numbers for this mode available (yet) since I can not measure the power provided by my Mac's USB ports. 2nd test against a Linux machine (RPi 4) showed different throughput numbers:
- Incoming: 185 Mbits/sec utilising cpu0 at 15%-20% (at 1000 MHz)
- Outgoing: 260 Mbits/sec with a CPU utilization at ~5% (at 1000 MHz)
(when locking Zero's cores to 600 MHz the throughput in TX direction dropped from 260 Mbits/sec to 208 while RX remained the same)
So obviously driver support at the other end of the USB cable matters but at least 150 Mbit/sec should always be possible with Zero 2 regardless of the OS connected to and its own CPU clockspeeds.
Since the RPi 4 powered the Zero in this setup I also recorded total consumption of both in idle and while performing the iperf3 tasks (15 minutes in each direction). The difference between idle and saturated network link was slightly less than 1000mW and I assume that the majority of this consumption difference happened at the Zero's side. The RPi 4 has an awfully high basic consumption but the A72 cores made in a newer process are way more efficient than the A53 on the Zero. At least the exercise demonstrates that USB gadget mode can result in ok-ish throughput numbers at a significantly lower consumption compared to USB attached Gigabit Ethernet.
"Enabling" a connected camera can be done using raspi-config
or directly by adding the following to /boot/config.txt:
start_x=1
gpu_mem=128
All this does after the mandatory reboot is instruct the main operating system (an RTOS called ThreadX running on the VideoCore CPU) to reserve more memory for itself and start the ThreadX routines dealing with the camera. Of course this memory is now missing from the guest OS:
root@raspberrypi:~# free -h
total used free shared buff/cache available
Mem: 364Mi 54Mi 114Mi 4.0Mi 195Mi 256Mi
Swap: 849Mi 0B 849Mi
Quick check with a half-sized videostream to be sent wirelessly via netcat to another host where the stream is stored and transcoded at the same time for live view:
raspivid -ih -b 4000000 -t 3600000 -fps 24 -w 960 -h 540 -o - | nc -k -l 2222
The ARM cores and the guest OS are almost unaffected (cores remain at 600 MHz and average load jumps from below 0.02 to 0.15) while the VideoCore has been instructed by raspivid to do the heavy work. Consumption increases by 730mW and SoC temperature by 7.5°C with this task.
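For planning storage on the receiving host, the raspivid parameters above translate into a rough stream size. This is only an estimate, assuming the encoder really holds the requested bitrate; stream_megabytes is a hypothetical helper.

```shell
# Estimate the size of a constant-bitrate stream in megabytes (10^6 bytes).
# $1: bitrate in bits per second (raspivid -b)
# $2: duration in milliseconds (raspivid -t)
stream_megabytes() {
    echo $(( $1 / 8 * ($2 / 1000) / 1000000 ))
}

# The values used above: 4 Mbit/s for one hour.
stream_megabytes 4000000 3600000
```

That is about 1800 MB per hour, and at 4 Mbit/s the stream stays far below the Wi-Fi or USB gadget throughput numbers measured earlier.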
Since overall consumption in this mode is below 1.5W it's also perfectly fine to skip Wi-Fi and operate the RPi on a computer's USB port after locking the CPU cores to 600 MHz since this ensures that even if the ARM cores are fully busy total consumption will not exceed the 2.5W a USB port has to provide. In this mode the Zero uses USB gadget mode and transfers the data directly over the USB wire.
Speaking of USB wires: the main problem with Micro USB cables is not amperage but voltage drops with higher loads due to cable and contact resistance being way too high. Majority of Micro USB cables is crap and not meant to power anything that needs more than a few mW. You get either 5V at the device end of the cable or 1A but not both at the same time. Only do this if you're sure your cable is at least AWG22 rated since otherwise the RPi slows down or even freezes/crashes (more on this here and there).
New with the Zero 2 is a Data Matrix code directly printed on the PCB back between the 2 Micro USB sockets containing some information (according to some sources also the SiP's serial number):
In the picture above on the right there's the information silkscreened + data matrix codes from four different Zero 2 W (three early boards for reviewers/distributors, the upper one from a later production batch). The contents of the code as follows (decimal / hex):
0000911033950848 / 000000D41DDAE280 (bought Oct 2021)
0000011033654391 / 0000000291A83477 (review pi3g.com)
0000011033654550 / 0000000291A83516 (review hackster.io)
0000011033654544 / 0000000291A83510 (review cnx-software.com)
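The decimal/hex pairs above can be cross-checked with printf. A minimal sketch; dec_to_hex16 is an illustrative name:

```shell
# Convert a zero-padded decimal string to 16-digit upper-case hex.
dec_to_hex16() {
    # Strip leading zeros first, otherwise printf would parse the number as octal.
    n=$(printf '%s' "$1" | sed 's/^0*//')
    printf '%016X\n' "${n:-0}"
}

dec_to_hex16 0000911033950848
```

For the board bought in Oct 2021 this yields 000000D41DDAE280, the hex value listed next to it.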
According to pi3g.com this is a feature for industrial customers, and it also hints that the board's serial number is now hopefully unique. In the past this number (from which the network MAC addresses are/were generated) was a random number burnt into the OTP memory at the factory during QA testing, which led to collisions in the wild.
If the code contains the serial number then it must contain additional info too, since my Zero's serial number according to the OTP (One-Time Programmable) memory is different:
pi@raspberrypi:~ $ vcgencmd otp_dump
...
27:00001f1f
28:037bd9b5
29:fc84264a
...
pi@raspberrypi:~ $ tail -n 4 /proc/cpuinfo
Hardware : BCM2835
Revision : 902120
Serial : 00000000037bd9b5
Model : Raspberry Pi Zero 2 Rev 1.0
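In the output above OTP row 28 carries the same value as the Serial line in /proc/cpuinfo. A small sketch to pull it out of the dump follows; `otp_serial` is a hypothetical helper name and the sample rows are simply the ones shown above:

```shell
#!/bin/sh
otp_serial() {
    # stdin: output of `vcgencmd otp_dump`; prints row 28 padded to 16 hex digits
    awk -F: '$1 == "28" { printf "00000000%s\n", $2 }'
}

# Sample rows copied from the dump above
printf '27:00001f1f\n28:037bd9b5\n29:fc84264a\n' | otp_serial
# on the board: vcgencmd otp_dump | otp_serial
```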
The single green LED on the board is not a power LED but controlled by software. So unless there's a bootable OS on the SD card the LED will remain off even if the device is correctly powered.
Once the ARM cores are brought up and control is partially handed over to Linux, the LED behavior with the default RPi kernel is default-on
(check with sudo modprobe configs ; zgrep CONFIG_LEDS_TRIGGER_DEFAULT_ON /proc/config.gz
). Of course this can be changed via sysfs and for a list of possible modes you can query cat /sys/devices/platform/leds/leds/led0/trigger
(current value in brackets). Other modes can be set like this:
echo heartbeat >/sys/devices/platform/leds/leds/led0/trigger
echo mmc0 >/sys/devices/platform/leds/leds/led0/trigger
The first indicates 'OS/board still alive' by blinking every second, while the latter signals SD card access. See the chapter below for why this might be of interest to you.
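Since the trigger file lists all modes on one line with the active one in brackets, a tiny filter helps when scripting this. A minimal sketch; the helper name and the sample mode list are made up for illustration:

```shell
#!/bin/sh
current_trigger() {
    # stdin: contents of the trigger file; prints only the entry in [brackets]
    tr ' ' '\n' | sed -n 's/^\[\(.*\)\]$/\1/p'
}

# Hypothetical sample line; the real list on the board is much longer
printf 'none mmc0 heartbeat [default-on] timer\n' | current_trigger
# on the board: current_trigger </sys/devices/platform/leds/leds/led0/trigger
```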
If you love your SD card then Raspberry Pi OS defaults are not for you: swap on SD card, default ext4 commit interval and logging to card.
Quick check with a default install (using the functionality I added to armbianmonitor years ago to spot just this: continuous and damaging small writes to flash media):
root@raspberrypi:~# armbianmonitor -d mmcblk0p2
Sun Nov 14 14:43:14 GMT 2021 2/40 pages written after 5 sec
Sun Nov 14 14:43:24 GMT 2021 2/40 pages written after 5 sec
Sun Nov 14 14:43:25 GMT 2021 8/36 pages written after 1 sec
Sun Nov 14 14:43:36 GMT 2021 2/40 pages written after 7 sec
Sun Nov 14 14:43:41 GMT 2021 1/4 pages written after 1 sec
Sun Nov 14 14:43:45 GMT 2021 2/60 pages written after 4 sec
Sun Nov 14 14:43:55 GMT 2021 2/44 pages written after 5 sec
Sun Nov 14 14:44:06 GMT 2021 2/40 pages written after 6 sec
Every few seconds a few bytes are written to SD card. This write pattern results in high Write Amplification and the SD card will die way earlier than necessary.
First step is to change ext4 commit interval from default (5 seconds) to 10 minutes:
sed -i -e 's/defaults,noatime/defaults,noatime,commit=600,errors=remount-ro/' /etc/fstab
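Since a broken /etc/fstab can leave the board unbootable, it may be worth previewing the substitution before running it with -i. A sketch, assuming a root filesystem line that looks like the hypothetical sample below:

```shell
#!/bin/sh
preview_fstab() {
    # Same substitution as above, but reading stdin and writing to stdout only
    sed -e 's/defaults,noatime/defaults,noatime,commit=600,errors=remount-ro/'
}

# Hypothetical sample line; check your own file for the real one
printf 'PARTUUID=00000000-02  /  ext4  defaults,noatime  0  1\n' | preview_fstab
# on the board: preview_fstab </etc/fstab
```

Lines without the defaults,noatime option string pass through unchanged.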
Then https://github.com/ecdye/zram-config to the rescue. Simply follow the few install steps and stick with the config defaults for now. Afterwards:
sudo apt purge zram-tools # only if you installed it before as suggested above
sudo systemctl disable dphys-swapfile # deactivates swap on SD card
sudo reboot
Now there's a compressed zram device /dev/zram0
for swap, and the system logs go through an overlayfs to /dev/zram1
and no longer to the SD card:
NAME       ALGORITHM DISKSIZE  DATA  COMPR TOTAL STREAMS MOUNTPOINT
/dev/zram1 lzo-rle       150M 17.2M 353.3K  756K       4 /opt/zram/zram1
/dev/zram0 lzo-rle       750M    4K    87B   12K       4 [SWAP]
Now 'enjoy the silence' on SD card:
root@raspberrypi:/home/pi# date
Sun 14 Nov 15:09:28 GMT 2021
root@raspberrypi:/home/pi# armbianmonitor -d mmcblk0p2
Sun Nov 14 15:15:48 GMT 2021 2/196 pages written after 378 sec
Sun Nov 14 15:16:18 GMT 2021 33/152 pages written after 30 sec
Sun Nov 14 15:17:01 GMT 2021 1/60 pages written after 43 sec
Sun Nov 14 15:20:47 GMT 2021 1/12 pages written after 226 sec
Sun Nov 14 15:21:03 GMT 2021 2/16 pages written after 16 sec
Sun Nov 14 15:21:04 GMT 2021 2/8 pages written after 1 sec
Sun Nov 14 15:21:10 GMT 2021 4/36 pages written after 6 sec
Sun Nov 14 15:21:15 GMT 2021 3/20 pages written after 5 sec
^C
root@raspberrypi:/home/pi# date
Sun 14 Nov 15:21:35 GMT 2021
Before: a few bytes were written to the card 8 times within 60 seconds. Now it took 12 minutes for 8 write attempts, each using larger data chunks. Write Amplification decreased significantly.
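To compare such runs without counting by hand, the 'after N sec' column can be summed up. A minimal sketch with a made-up helper name `write_stats`, fed with the second measurement from above:

```shell
#!/bin/sh
write_stats() {
    # Counts the write events and sums the "after N sec" intervals
    awk '/pages written after/ { n++; total += $(NF-1) }
         END { printf "%d writes in %d sec\n", n, total }'
}

write_stats <<'EOF'
Sun Nov 14 15:15:48 GMT 2021 2/196 pages written after 378 sec
Sun Nov 14 15:16:18 GMT 2021 33/152 pages written after 30 sec
Sun Nov 14 15:17:01 GMT 2021 1/60 pages written after 43 sec
Sun Nov 14 15:20:47 GMT 2021 1/12 pages written after 226 sec
Sun Nov 14 15:21:03 GMT 2021 2/16 pages written after 16 sec
Sun Nov 14 15:21:04 GMT 2021 2/8 pages written after 1 sec
Sun Nov 14 15:21:10 GMT 2021 4/36 pages written after 6 sec
Sun Nov 14 15:21:15 GMT 2021 3/20 pages written after 5 sec
EOF
# prints: 8 writes in 705 sec
```

705 seconds is a bit under 12 minutes, which matches the wall clock difference between the two date calls.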
In case you're logging tons of data consider editing /etc/ztab
to switch the compression algo for the log partition to zstd
(see documentation therein for details). And of course you'll lose data in case the Zero doesn't shut down/reboot properly. The higher commit interval of 10 minutes almost guarantees that power loss == general data loss, and without a proper shutdown the log contents won't be synced back from zram to the SD card either.
So if you're into unstable operation (crappy powering and the like) better live with a shorter SD card lifespan and skip these optimisations.