Skip to content

XUANTIE-RV/zero_stage_boot

Repository files navigation

Introduction

The zero stage boot is the XuanTie processor init code before opensbi. Before zero_stage_boot, SoC vendors must prepare ddr_init and CPU reset procedures. All harts would get into zero_stage_boot together, and the first one would duty to relocate GOT & offset variable, and others wait. Every hart would init its CSRs by their CPUID versions separately, allowing different harts to work together, e.g., 4*c908 + 2*c910. You could compile standard opensbi and Linux kernel binaries from open-source repositories, all compatible with XuanTie processors. Here is the simple boot flow:

[Jtag gdbinit] -> [zero_stage_boot] -> [opensbi] -> [Linux]

opensbi: https://github.com/riscv-software-src/opensbi
Linux:   https://kernel.org/

Compiling zero_stage_boot is very straightforward, requiring only a standard RISC-V GCC compiler:

CROSS_COMPILE=riscv64-unknown-linux-gnu- make

However, we strongly recommend using the released binaries for the FPGA bringup. Click the Releases button on the right to obtain pre-compiled binaries for zsb, OpenSBI, and Image (Linux). These binary files have undergone comprehensive testing prior to release and contain detailed and precise version information.

  • 64lp64 means running lp64 ABI on 64-bit Hardware.

  • 32ilp32 means running ilp32 ABI on 32-bit Hardware.

  • 64ilp32 means running ilp32 ABI on 64-bit Hardware.

  • zsb means zero_stage_boot.

  • Linux-5.10 + opensbi-0.9 is for early customers.

  • Linux-6.6 + opensbi-1.3 is for current.

  • zsb-64lp64-xt is simply recompiled with a custom compiler; functionally, it is identical to zsb-64lp64.

For a rv64 processor, you can download zsb-64lp64.tar.gz + opensbi-1.3-64lp64.tar.gz + linux-6.6-64lp64.tar.gz and prepare your own DTS + gdbinit.

Then, you can use Jtag to run FPGA Platform.

linux-6.6 opensbi-1.3 DTS Example

The XuanTie C9xx DTB provided to OpenSBI generic firmware will usually have "thead,c900-clint", "thead,c900-plic", compatible strings.

/dts-v1/;
/ {
	model = "Test Sample";
	compatible = "test,sample";
	#address-cells = <2>;
	#size-cells = <2>;

	memory@60000000 {
		device_type = "memory";
                Caution: Determine your own address here
		reg = <0x0 0x60000000 0x0 0x40000000>;
	};

	cpus {
		#address-cells = <1>;
		#size-cells = <0>;
		timebase-frequency = <25000000>;
		cpu@0 {
			device_type = "cpu";
			reg = <0>;
			status = "okay";
			compatible = "riscv";
			riscv,isa = "rv64imafdc_zicbom_svpbmt_sstc_sscofpmf";
			riscv,cbom-block-size = <64>;
			mmu-type = "riscv,sv57";
			cpu0_intc: interrupt-controller {
				#address-cells = <0>;
				#interrupt-cells = <1>;
				compatible = "riscv,cpu-intc";
				interrupt-controller;
			};
		};
		cpu@1 {
			device_type = "cpu";
			reg = <1>;
			status = "okay";
			compatible = "riscv";
			riscv,isa = "rv64imafdc_zicbom_svpbmt_sstc_sscofpmf";
			riscv,cbom-block-size = <64>;
			mmu-type = "riscv,sv57";
			cpu1_intc: interrupt-controller {
				#address-cells = <0>;
				#interrupt-cells = <1>;
				compatible = "riscv,cpu-intc";
				interrupt-controller;
			};
		};
		cpu@2 {
			device_type = "cpu";
			reg = <2>;
			status = "okay";
			compatible = "riscv";
			riscv,isa = "rv64imafdc_zicbom_svpbmt_sstc_sscofpmf";
			riscv,cbom-block-size = <64>;
			mmu-type = "riscv,sv57";
			cpu2_intc: interrupt-controller {
				#address-cells = <0>;
				#interrupt-cells = <1>;
				compatible = "riscv,cpu-intc";
				interrupt-controller;
			};
		};
		cpu@3 {
			device_type = "cpu";
			reg = <3>;
			status = "okay";
			compatible = "riscv";
			riscv,isa = "rv64imafdc_zicbom_svpbmt_sstc_sscofpmf";
			riscv,cbom-block-size = <64>;
			mmu-type = "riscv,sv57";
			cpu3_intc: interrupt-controller {
				#address-cells = <0>;
				#interrupt-cells = <1>;
				compatible = "riscv,cpu-intc";
				interrupt-controller;
			};
		};
	};


	soc {
		#address-cells = <2>;
		#size-cells = <2>;
		compatible = "simple-bus";
		dma-noncoherent;
		ranges;

		clint0: clint@c000000 {
			compatible = "thead,c900-clint";
			interrupts-extended = <
				&cpu0_intc  3 &cpu0_intc  7
				&cpu1_intc  3 &cpu1_intc  7
				&cpu2_intc  3 &cpu2_intc  7
				&cpu3_intc  3 &cpu3_intc  7
				>;
			reg = <0x0 0x0c000000 0x0 0x04000000>;
                Caution: Determine your own address here
			clint,has-no-64bit-mmio;
		};

		intc: interrupt-controller@8000000 {
			#address-cells = <0>;
			#interrupt-cells = <2>;
			compatible = "thead,c900-plic";
			reg = <0x0 0x08000000 0x0 0x04000000>;
                Caution: Determine your own address here
			riscv,ndev = <64>;
			interrupt-controller;
			interrupts-extended = <
				&cpu0_intc  0xffffffff &cpu0_intc  9
				&cpu1_intc  0xffffffff &cpu1_intc  9
				&cpu2_intc  0xffffffff &cpu2_intc  9
				&cpu3_intc  0xffffffff &cpu3_intc  9
				>;
		};
	};
};

linux-5.10 opensbi-0.9 DTS Example

The XuanTie C9xx DTB provided to OpenSBI generic firmware will usually have "riscv,clint0", "riscv,plic0", compatible strings.

/dts-v1/;
/ {
	model = "Test Sample";
	compatible = "test,sample";
	#address-cells = <2>;
	#size-cells = <2>;

	memory@60000000 {
		device_type = "memory";
                Caution: Determine your own address here
		reg = <0x0 0x60000000 0x0 0x40000000>;
	};

	cpus {
		#address-cells = <1>;
		#size-cells = <0>;
		timebase-frequency = <25000000>;
		cpu@0 {
			device_type = "cpu";
			reg = <0>;
			status = "okay";
			compatible = "riscv";
			riscv,isa = "rv64ima";
			mmu-type = "riscv,sv39";
			cpu0_intc: interrupt-controller {
				#address-cells = <0>;
				#interrupt-cells = <1>;
				compatible = "riscv,cpu-intc";
				interrupt-controller;
			};
		};
		cpu@1 {
			device_type = "cpu";
			reg = <1>;
			status = "okay";
			compatible = "riscv";
			riscv,isa = "rv64ima";
			mmu-type = "riscv,sv39";
			cpu1_intc: interrupt-controller {
				#address-cells = <0>;
				#interrupt-cells = <1>;
				compatible = "riscv,cpu-intc";
				interrupt-controller;
			};
		};
		cpu@2 {
			device_type = "cpu";
			reg = <2>;
			status = "okay";
			compatible = "riscv";
			riscv,isa = "rv64ima";
			mmu-type = "riscv,sv39";
			cpu2_intc: interrupt-controller {
				#address-cells = <0>;
				#interrupt-cells = <1>;
				compatible = "riscv,cpu-intc";
				interrupt-controller;
			};
		};
		cpu@3 {
			device_type = "cpu";
			reg = <3>;
			status = "okay";
			compatible = "riscv";
			riscv,isa = "rv64ima";
			mmu-type = "riscv,sv39";
			cpu3_intc: interrupt-controller {
				#address-cells = <0>;
				#interrupt-cells = <1>;
				compatible = "riscv,cpu-intc";
				interrupt-controller;
			};
		};
	};


	soc {
		#address-cells = <2>;
		#size-cells = <2>;
		compatible = "simple-bus";
		ranges;

		clint0: clint@c000000 {
			compatible = "riscv,clint0";
			interrupts-extended = <
				&cpu0_intc  3 &cpu0_intc  7
				&cpu1_intc  3 &cpu1_intc  7
				&cpu2_intc  3 &cpu2_intc  7
				&cpu3_intc  3 &cpu3_intc  7
				>;
			reg = <0x0 0x0c000000 0x0 0x04000000>;
                Caution: Determine your own address here
			clint,has-no-64bit-mmio;
		};

		intc: interrupt-controller@8000000 {
			#address-cells = <0>;
			#interrupt-cells = <1>;
			compatible = "riscv,plic0";
			reg = <0x0 0x08000000 0x0 0x04000000>;
                Caution: Determine your own address here
			riscv,ndev = <64>;
			interrupt-controller;
			interrupts-extended = <
				&cpu0_intc  0xffffffff &cpu0_intc  9
				&cpu1_intc  0xffffffff &cpu1_intc  9
				&cpu2_intc  0xffffffff &cpu2_intc  9
				&cpu3_intc  0xffffffff &cpu3_intc  9
				>;
		};
	};
};

CPU gdbinit script

# Set gdb environment
set confirm off
set height  0
monitor set resume-bkpt-exception on

# memory layout
set $opensbi_addr = 0x60000000
set $vmlinux_addr = $opensbi_addr + 0x00400000
set $rootfs_addr  = $opensbi_addr + 0x04000000
set $dtb_addr     = $rootfs_addr  - 0x00100000
set $zsb_addr     = $rootfs_addr  - 0x00008000
set $dyninfo_addr = $rootfs_addr  - 0x40
set $flag_addr    = $rootfs_addr  - 0x100

# Load kernel
restore zero_stage_boot.bin binary          $zsb_addr
restore <preceding dts example>.dtb binary  $dtb_addr
restore fw_dynamic.bin binary               $opensbi_addr
restore Image binary                        $vmlinux_addr

# Set opensbi dynamic info param
set *(unsigned long *)($dyninfo_addr)      = 0x4942534f
set *(unsigned long *)($dyninfo_addr + 8)  = 2
set *(unsigned long *)($dyninfo_addr + 16) = $vmlinux_addr
set *(unsigned long *)($dyninfo_addr + 24) = 1
set *(unsigned long *)($dyninfo_addr + 32) = 0
set *(unsigned long *)($dyninfo_addr + 40) = 0

# Set boot flag for CPU functional setting
# This flag.BIT[0] makes zsb enable RV64XT32 by setting mxstatus.[63]=1
# set *(unsigned int *)$flag_addr = 0x1
# This flag.BIT[1] makes zsb enable COPINSTEE by setting mxstatus.[24]=1 && mxstatus.[22]=0
# set *(unsigned int *)$flag_addr = 0x2
set *(unsigned int *)$flag_addr = 0x0

# Set fast memory base reg for c908x
# set $mtnfastmba = 0xXXXX

# PLIC delegate (Only opensbi-0.9 & Linux-5.10 need it)
set *0x081ffffc=1

# Set all harts reset address (reset controller demo according to your SoC definition)
set *0x18030010 = $zsb_addr
set *0x18030018 = $zsb_addr
set *0x18030020 = $zsb_addr
set *0x18030028 = $zsb_addr
set *0x18030030 = $zsb_addr
set $pc         = $zsb_addr

# Release all harts from reset
set *0x18030000 = 0x7f

# If you don't have a reset controller in SoC, and harts reset into bootrom's loop code.
# Then, Use below method:
# thread 1
# set $pc = $zsb_addr
# thread 2
# set $pc = $zsb_addr
# thread 3
# set $pc = $zsb_addr
# thread 4
# set $pc = $zsb_addr
# thread 5
# set $pc = $zsb_addr
# -ex "c" would let all harts jump to $zsb_addr.

Run

Start Jtag Server.

DebugServerConsole -prereset

Then use gdb connect the Jtag Server.

riscv64-elf-gdb -ex "tar remote <Jtag Server ip:port>" -x <your soc gdbinit> -x <preceding cpu gdbinit> -ex "c"

Use ctrl+c to get into the gdb shell.

file vmlinux
source gdbmarcos.txt
dmesg

gdbmacros.txt:

vmlinux: The Linux kernel ELF file

Appendix A - PMU in DTS

The configuration of PMU can be referred to OpenSBI SBI PMU extension

The following is an example of PMU configuration for the Xuantie C-series CPU written according to the datasheet.

pmu {
	compatible = "riscv,pmu";
	riscv,event-to-mhpmevent =
		/* PMU_HW_BRANCH_INSTRUCTIONS -> inst_branch */
		<0x00005 0x00000000 0x00000036>,
		/* PMU_HW_BRANCH_MISSES -> inst_branch_mispredict */
		<0x00006 0x00000000 0x00000038>,
		/* PMU_HW_STALLED_CYCLES_FRONTEND -> ifu_stalled_cycle */
		<0x00008 0x00000000 0x00000027>,
		/* PMU_HW_STALLED_CYCLES_BACKEND -> idu_stalled_cycle */
		<0x00009 0x00000000 0x00000028>,
		/* L1D_READ_ACCESS -> l1_dcache_read_access */
		<0x10000 0x00000000 0x0000000c>,
		/* L1D_READ_MISS -> l1_dcache_read_miss */
		<0x10001 0x00000000 0x0000000d>,
		/* L1D_WRITE_ACCESS -> l1_dcache_write_access */
		<0x10002 0x00000000 0x0000000e>,
		/* L1D_WRITE_MISS -> l1_dcache_write_miss */
		<0x10003 0x00000000 0x0000000f>,
		/* L1I_READ_ACCESS -> l1_icache_access */
		<0x10008 0x00000000 0x00000001>,
		/* L1I_READ_MISS -> l1_icache_miss */
		<0x10009 0x00000000 0x00000002>,
		/* LL_READ_ACCESS -> ll_cache_read_access */
		<0x10010 0x00000000 0x00000010>,
		/* LL_READ_MISS -> ll_cache_read_miss */
		<0x10011 0x00000000 0x00000011>,
		/* LL_WRITE_ACCESS -> ll_cache_write_access */
		<0x10012 0x00000000 0x00000012>,
		/* LL_WRITE_MISS -> ll_cache_write_miss */
		<0x10013 0x00000000 0x00000013>,
		/* BPU_READ_ACCESS -> branch_direction_prediction */
		<0x10028 0x00000000 0x0000001c>,
		/* BPU_READ_MISS -> branch_direction_misprediction */
		<0x10029 0x00000000 0x0000001b>;
	riscv,event-to-mhpmcounters =
		/* The Xuantie processor only implements 18 mhpmcounters, so the bitmap is 0x7fff8 */
		<0x00005 0x00005 0x7fff8>,
		<0x00006 0x00006 0x7fff8>,
		<0x00008 0x00008 0x7fff8>,
		<0x00009 0x00009 0x7fff8>,
		<0x10000 0x10000 0x7fff8>,
		<0x10001 0x10001 0x7fff8>,
		<0x10002 0x10002 0x7fff8>,
		<0x10003 0x10003 0x7fff8>,
		<0x10008 0x10008 0x7fff8>,
		<0x10009 0x10009 0x7fff8>,
		<0x10010 0x10010 0x7fff8>,
		<0x10011 0x10011 0x7fff8>,
		<0x10012 0x10012 0x7fff8>,
		<0x10013 0x10013 0x7fff8>,
		<0x10028 0x10028 0x7fff8>,
		<0x10029 0x10029 0x7fff8>;
	riscv,raw-event-to-mhpmcounters =
		/* For raw event ID 0x0 - 0xff */
		<0x0 0x0 0xffffffff 0xffffff00 0x7fff8>;
};

For example, using perf stat & perf record:

# perf stat ls

 Performance counter stats for 'ls':

             74.05 msec task-clock                       #    0.747 CPUs utilized
                 0      context-switches                 #    0.000 /sec
                 0      cpu-migrations                   #    0.000 /sec
                58      page-faults                      #  783.256 /sec
           3689065      cycles                           #    0.050 GHz
           1336494      instructions                     #    0.36  insn per cycle
            162119      branches                         #    2.189 M/sec
             28716      branch-misses                    #   17.71% of all branches

       0.099143960 seconds time elapsed

       0.016153000 seconds user
       0.092880000 seconds sys
# echo 1000 > /proc/sys/kernel/perf_event_max_sample_rate
# perf record -g ls
perf.data
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.006 MB perf.data (9 samples) ]

Appendix B - How to compile perf

We can use buildroot to compile rootfs with perf tool.

# git clone https://github.com/buildroot/buildroot.git
# cd buildroot/
# make qemu_riscv64_virt_defconfig
# make menuconfig

Enable the following PACKAGE config in menuconfig.

BR2_PACKAGE_LINUX_TOOLS=y
BR2_PACKAGE_LINUX_TOOLS_PERF=y
BR2_PACKAGE_ELFUTILS=y

Appendix C - Additional DTS

Additional DTS examples(serial, bootargs with initrd):

serial@1900d000 {
	compatible = "snps,dw-apb-uart";
	reg = <0x0 0x1900d000 0x0 0x400>;
	interrupt-parent = <&intc>;
	interrupts = <20 4>;
	clock-frequency = <36000000>;
	clock-names = "baudclk";
	reg-shift = <2>;
	reg-io-width = <4>;
};

chosen {
	bootargs = "console=ttyS0,115200 norandmaps loglevel=7";
	linux,initrd-start = <0x0 0x64000000>;
	linux,initrd-end = <0x0 0x66000000>;
	stdout-path = "/soc/serial@1900d000:115200";
};

The 'serial' needs to be configured based on the actual configuration of 'reg', 'interrupts', 'clock-frequency', while the 'chosen' needs to be configured based on the actual configuration of 'linux,initrd-start', 'linux,initrd-end'.