Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce OpTEE complete solution for RISC-V and looking for how to contribute it #6173

Closed
fanghuaqi opened this issue Jul 13, 2023 · 35 comments
Labels

Comments

@fanghuaqi
Copy link

fanghuaqi commented Jul 13, 2023

Hi guys,

We are working a complete OpTEE solution for RISC-V, and now are able to boot opensbi + optee + uboot + linux, and able to pass all the optee test suites and benchmark(except network related ones, due to lack of network interface).

Here is our design guide and user guide located here:

We mainly made the following changes regarding to optee-os, opensbi, optee driver for linux:

Could you take a look at this solution, and see what is left for us to do, and if we want to co-operate our work to upstream, what can we contribute to, and this also involve with opensbi changes, maybe opensbi contribution are also required.

We previously also searched across the web, but not yet find complete source code to try full optee solution on RISC-V, so we start to do it based on some previous work on OpTEE upstream work and keystone TEE project.

This work is mainly done by @matthewgui , and currently we still not solving this issue #6075, but we are able to run in SMP mode when in TEE world REE interrupts are disabled, and in REE world TEE interrupts can be enabled, this is detailed in the guide.

Hope you can give some feedbacks on this solution, @gagachang @maroueneboubakri @liushiwei007

Thanks

@gagachang
Copy link
Contributor

Thanks!

 opensbi contribution are also required

I think this is the most critical part.
Obviously the OP-TEE SBI extension should be standardized via Platform Runtime Services(PRS) TG or Security HC.
I remembered @liushiwei007 submitted a PR to openSBI, but it did not catch committee's eyes.

@fanghuaqi
Copy link
Author

I found the related issues about opensbi tee support list below:

@matthewgui could check our changes and see whether our opensbi changes matched with issue and pr above, if not, maybe we can discuss changes in the related issues.

@gagachang
Copy link
Contributor

For OP-TEE OS
AFAIK @maroueneboubakri is working on virtual memory management and PLIC driver.
I am looking at recently supported PAN feature.
Maybe you can start from SMP and floating point features.

@fanghuaqi
Copy link
Author

Hi @gagachang & @maroueneboubakri , how do you run this optee-os standalone project, is there any guidance about how to use it, so @matthewgui can do some contribution, and we can also check how we can integrate with current optee-os riscv support.

@gagachang
Copy link
Contributor

Do you mean how to submit a pull request?

@fanghuaqi
Copy link
Author

Do you mean how to submit a pull request?

No, I mean if we added our code, and we need to run some test suite to check our code, what is the current workflow to achive it.

Or we just checkin the code, and let other people to review the code, without running any test suites?

@gagachang
Copy link
Contributor

I always run OP-TEE xtest before submitting the code.

And ensure that new code will not affect ARM side.
OP-TEE built-in CI can help this. It will build and test other platforms based on new code.
The CI will run automatically once the new commits are pushed to optee-os.
Always submit code after CI is passed

@fanghuaqi
Copy link
Author

I always run OP-TEE xtest before submitting the code.

And ensure that new code will not affect ARM side. OP-TEE built-in CI can help this. It will build and test other platforms based on new code. The CI will run automatically once the new commits are pushed to optee-os. Always submit code after CI is passed

Hi @gagachang , do you mean you just run xtest for arm not for riscv, to make sure it compile ok for both arm and riscv, and run ok for arm?

@matthewgui
Copy link

matthewgui commented Jul 13, 2023

I found the related issues about opensbi tee support list below:

@matthewgui could check our changes and see whether our opensbi changes matched with issue and pr above, if not, maybe we can discuss changes in the related issues.

@fanghuaqi
I check riscv-non-isa/riscv-sbi-doc#106 (comment) , which mainly focus on switch context, have no interrupt processing,secondary cpu startup ,etc. maybe this is a first draft version.

@gagachang
Copy link
Contributor

Hi @gagachang , do you mean you just run xtest for arm not for riscv, to make sure it compile ok for both arm and riscv, and run ok for arm?

No~ I only manually run xtest for RISC-V.
ARM's build and xtest will be run automatically by built-in CI.

If you ever submited commits to your local optee-os repository, you might have seen the following checks:
Once all checks are passed the PR can be created to upstream optee-os.

image

@fanghuaqi
Copy link
Author

No~ I only manually run xtest for RISC-V.

So are there any steps for us to run xtest for RISC-V using this upstream optee os, can I run it without opensbi changes?

@gagachang
Copy link
Contributor

So are there any steps for us to run xtest for RISC-V using this upstream optee os, can I run it without opensbi changes?

xtest needs linux side CA and opensbi.
AFAIK there is no way to test standalone optee-os.

https://optee.readthedocs.io/en/latest/faq/faq.html#q-how-are-you-testing-op-tee

@maroueneboubakri
Copy link
Contributor

maroueneboubakri commented Jul 13, 2023

Hi @gagachang & @maroueneboubakri , how do you run this optee-os standalone project, is there any guidance about how to use it, so @matthewgui can do some contribution, and we can also check how we can integrate with current optee-os riscv support.

The easiest way would be AMP logic, you may assign one or two cores to trusted domain and run OP-TEE OS there and the remaining cores to un-trusted domain where Linux runs and set a messaging mechanism between 2 domains, we use RPMSG like mechanism.

@liushiwei007
Copy link
Contributor

I found the related issues about opensbi tee support list below:

@matthewgui could check our changes and see whether our opensbi changes matched with issue and pr above, if not, maybe we can discuss changes in the related issues.

@fanghuaqi I check riscv-non-isa/riscv-sbi-doc#106 , which mainly focus on switch context, have no interrupt processing,secondary cpu startup ,etc. maybe this is a first draft version.

This is an early version, we have now done the work of multi-core and interrupt routing, but do not know how to do upstram at the moment。

@fanghuaqi
Copy link
Author

This is an early version, we have now done the work of multi-core and interrupt routing, but do not know how to do upstram at the moment。

Hi @liushiwei007, could you update the draft version, so we can make some discussion on it, we have done the implementation on our hardware, maybe we can take the common part and finalize a workable TEE extension spec in opensbi. @matthewgui can join in the discussion.

Thanks

@liushiwei007
Copy link
Contributor

This is an early version, we have now done the work of multi-core and interrupt routing, but do not know how to do upstram at the moment。

Hi @liushiwei007, could you update the draft version, so we can make some discussion on it, we have done the implementation on our hardware, maybe we can take the common part and finalize a workable TEE extension spec in opensbi. @matthewgui can join in the discussion.

Thanks

hi, @fanghuaqi , the current version covers hardware features,we haven't started to open source hardware features yet, but I would love to discuss on it。

@gagachang
Copy link
Contributor

gagachang commented Jul 17, 2023

The easiest way would be AMP logic, you may assign one or two cores to trusted domain and run OP-TEE OS there and the remaining cores to un-trusted domain where Linux runs and set a messaging mechanism between 2 domains, we use RPMSG like mechanism.

May I know the scenario of enabling CFG_RISCV_M_MODE ?
I assume that most of deployments run optee-os in S-mode, above M-mode firmware such as openSBI.
Therefore CFG_RISCV_M_MODE is unnecessary.
I think it could reduce maintenance effort.
Is CFG_RISCV_M_MODE for AMP?

@gagachang
Copy link
Contributor

hi, @fanghuaqi , the current version covers hardware features,we haven't started to open source hardware features yet, but I would love to discuss on it。

Hello guys,
Maybe focus on software feature first?
There are so many WIP extensions such as WorldGuard, IOPMP, CoVE and Smmtt.
Your new hardware features might not be standardized at such short notice.

The software features such as SBI extension ID and function ID would be good starting point.

@fanghuaqi
Copy link
Author

The software features such as SBI extension ID and function ID would be good starting point.

I think this will be a good suggestion, at least we can have a common tee inteface in opensbi.

By the way, we recently implement required hardware features in our qemu emulator, you can also take try with it, see Nuclei-Software/nuclei-linux-sdk#13

@liushiwei007
Copy link
Contributor

hi, @fanghuaqi , the current version covers hardware features,we haven't started to open source hardware features yet, but I would love to discuss on it。

Hello guys, Maybe focus on software feature first? There are so many WIP extensions such as WorldGuard, IOPMP, CoVE and Smmtt. Your new hardware features might not be standardized at such short notice.

The software features such as SBI extension ID and function ID would be good starting point.

The patch I submitted contained this, but it didn't get any attention, and I even wondered if I could submit opensbi and linux in the OPTEE project first.

@gagachang
Copy link
Contributor

The patch I submitted contained this, but it didn't get any attention, and I even wondered if I could submit opensbi and linux in the OPTEE project first.

For linux, I think RISC-V needs another C file in optee driver.
For example, riscv_sbi.c, and copied necessary functions from smc_abi.c.
We also need to discuss the device node property:

		optee {
			compatible = "linaro,optee-tz";
			method = "smc";
		};

TZ and smc are ARM specified. Should be defined in another words.

@jforissier
Copy link
Contributor

We also need to discuss the device node property:

		optee {
			compatible = "linaro,optee-tz";
			method = "smc";
		};

TZ and smc are ARM specified. Should be defined in another words.

Parhaps we should introduce a new compatible: linaro,optee (without the -tz) in replacement of the previous one (which we would keep for compatibility of course), and add method: sbi.

@gagachang
Copy link
Contributor

Parhaps we should introduce a new compatible: linaro,optee (without the -tz) in replacement of the previous one (which we would keep for compatibility of course), and add method: sbi.

Sounds good to me!

@fanghuaqi
Copy link
Author

Parhaps we should introduce a new compatible: linaro,optee (without the -tz) in replacement of the previous one (which we would keep for compatibility of course), and add method: sbi.

It will be great to have that

@jenswi-linaro
Copy link
Contributor

Another option is to use the FF-A approach where drivers can register for an FF-A bus. That doesn't need any DT at all, except possibly for the SBI driver or framework.

@maroueneboubakri
Copy link
Contributor

Another option is to use the FF-A approach where drivers can register for an FF-A bus. That doesn't need any DT at all, except possibly for the SBI driver or framework.

+1

@github-actions
Copy link

This issue has been marked as a stale issue because it has been open (more than) 30 days with no activity. Remove the stale label or add a comment, otherwise this issue will automatically be closed in 5 days. Note, that you can always re-open a closed issue at any time.

@gagachang
Copy link
Contributor

Hello,

I am wondering if anyone here can regularly review the PRs related to RISC-V.
Every RISC-V PR needs a Reviewed-by or Acked-by tag before notifying OP-TEE maintainers.

@jenswi-linaro
Copy link
Contributor

My focus is Arm and generic OP-TEE code. I can help review an occasional RISC-V patch especially if it relates to generic OP-TEE code, but I would prefer if the RISC-V patches in general can be reviewed by people having an interest in that architecture.

Upstreaming code needs both an author and reviewers. Ironically is writing and debugging the code often the easy part. If you have an interest in RISC-V and can put some time into reviewing RISC-V patches, please help the others who are writing code for RISC-V.

@jforissier
Copy link
Contributor

I am in the same situation as Jens and totally agree with his words. It would be nice to have an entry for RISC-V in MAINTAINERS.

@gagachang
Copy link
Contributor

gagachang commented Oct 16, 2023

Hello @fanghuaqi @matthewgui

I have a problem about SMP (2 CPU) and I am wondering if you also ever encounter this problem.
I am testing OP-TEE with SMP enabled on RV64 QEMU virtual platform.
When I run optee_example_hello_world, sometimes I found that "OP-TEE invokes RPC to return to linux, but linux yields back to OPTEE too late/slow".

Did you ever encounter this problem ?
It seems when the problem happens, linux handles single RPC for about 4 seconds.
I try to figure out what happened but nothing progress.

It does not happen every time when I execute optee_example_hello_world, usually the execution is smooth.
The linux and OP-TEE do not crash, I can run optee_example_hello_world again and again even the problem occurs.

I enable trace points of RISC-V timer IRQ and OP-TEE invoke function:
I can see that when the timestamp is 3456.099215, OP-TEE invokes RPC to linux
But linux does not invoke back to OP-TEE until 3460.093577 .

 optee_example_h-337     [000] .....  3456.092741: optee_invoke_fn_begin: param=000000008ab15a0a (32000003, 0, 0, 0, 0, 0, 0, 0)
          <idle>-0       [001] dnh1.  3456.098207: irq_handler_exit: irq=10 ret=handled
 optee_example_h-337     [000] d.h..  3456.098972: irq_handler_exit: irq=10 ret=handled
 optee_example_h-337     [000] .....  3456.099215: optee_invoke_fn_end: param=000000008ab15a0a ret (ffffffffffff0005, 0, 0, 0)
          <idle>-0       [001] d.h1.  3456.204000: irq_handler_exit: irq=10 ret=handled
  tee-supplicant-187     [001] d.h1.  3456.209157: irq_handler_exit: irq=10 ret=handled
  tee-supplicant-187     [001] dnH1.  3456.209946: irq_handler_exit: irq=10 ret=handled
          <idle>-0       [001] d.h1.  3456.214314: irq_handler_exit: irq=10 ret=handled
          <idle>-0       [001] d.h1.  3456.219670: irq_handler_exit: irq=10 ret=handled
          <idle>-0       [001] d.h1.  3456.223294: irq_handler_exit: irq=10 ret=handled
          <idle>-0       [001] d.h1.  3456.227012: irq_handler_exit: irq=10 ret=handled
          <idle>-0       [001] d.h1.  3456.230152: irq_handler_exit: irq=10 ret=handled
          <idle>-0       [001] d.h1.  3456.720795: irq_handler_exit: irq=10 ret=handled
          <idle>-0       [001] d.h1.  3457.238080: irq_handler_exit: irq=10 ret=handled
          <idle>-0       [001] d.h1.  3457.772675: irq_handler_exit: irq=10 ret=handled
          <idle>-0       [001] d.h1.  3458.282821: irq_handler_exit: irq=10 ret=handled
          <idle>-0       [001] d.h1.  3458.803007: irq_handler_exit: irq=10 ret=handled
          <idle>-0       [001] d.h1.  3459.314033: irq_handler_exit: irq=10 ret=handled
          <idle>-0       [001] d.h1.  3459.824768: irq_handler_exit: irq=10 ret=handled
          <idle>-0       [001] d.h1.  3459.825802: irq_handler_exit: irq=10 ret=handled
          <idle>-0       [000] dnh1.  3460.092547: irq_handler_exit: irq=10 ret=handled
 optee_example_h-337     [000] .....  3460.093577: optee_invoke_fn_begin: param=000000008ab15a0a (32000003, 0, 0, 0, 0, 0, 0, 0)
 optee_example_h-337     [000] d.h2.  3460.094152: irq_handler_exit: irq=10 ret=handled
          <idle>-0       [001] dnh1.  3460.099831: irq_handler_exit: irq=10 ret=handled
 optee_example_h-337     [000] d.h..  3460.116754: irq_handler_exit: irq=10 ret=handled
 optee_example_h-337     [000] .....  3460.116777: optee_invoke_fn_end: param=000000008ab15a0a ret (ffffffffffff0005, 0, 0, 0)
          <idle>-0       [001] d.h1.  3460.333439: irq_handler_exit: irq=10 ret=handled
          <idle>-0       [001] d.h1.  3460.339270: irq_handler_exit: irq=10 ret=handled
          <idle>-0       [001] d.h1.  3460.343044: irq_handler_exit: irq=10 ret=handled
          <idle>-0       [001] d.h1.  3460.347259: irq_handler_exit: irq=10 ret=handled
          <idle>-0       [001] d.h1.  3460.350327: irq_handler_exit: irq=10 ret=handled
          <idle>-0       [001] d.h1.  3460.845374: irq_handler_exit: irq=10 ret=handled
          <idle>-0       [001] d.h1.  3461.355163: irq_handler_exit: irq=10 ret=handled
          <idle>-0       [001] d.h1.  3461.866594: irq_handler_exit: irq=10 ret=handled
          <idle>-0       [001] d.h1.  3462.394786: irq_handler_exit: irq=10 ret=handled
          <idle>-0       [001] d.h1.  3462.927148: irq_handler_exit: irq=10 ret=handled
          <idle>-0       [001] d.h1.  3463.439360: irq_handler_exit: irq=10 ret=handled
          <idle>-0       [001] d.h1.  3463.948966: irq_handler_exit: irq=10 ret=handled
          <idle>-0       [000] dnh1.  3464.097466: irq_handler_exit: irq=10 ret=handled
 optee_example_h-337     [000] .....  3464.098251: optee_invoke_fn_begin: param=000000008ab15a0a (32000003, 0, 0, 0, 0, 0, 0, 0)
          <idle>-0       [001] dnh1.  3464.099969: irq_handler_exit: irq=10 ret=handled
 optee_example_h-337     [000] d.h..  3464.102304: irq_handler_exit: irq=10 ret=handled
 optee_example_h-337     [000] .....  3464.102852: optee_invoke_fn_end: param=000000008ab15a0a ret (0, 0, 0, 0)

@matthewgui
Copy link

hello @gagachang
At that debuging time, it seems we encounter linux scheduling problems, but it was related to our TEE interrupt processing。

Maybe you need to check the return value of invoke function,to get more detailed flow。I search the related issue on google,found resched issue on rpc branch,but this patch is not effect for me。
image

@gagachang
Copy link
Contributor

Hi @matthewgui
Thanks for reply!
I apply your patch but it is not effect for me too.
I think I am encountering linux scheduling problem too.
Thanks for hint, I will investigate more detail.

@gagachang
Copy link
Contributor

Hello @matthewgui

I found the root cause is that my OpenSBI save/restore CSR sip during Linux/OP-TEE context switching.
sip.STIP can be asserted during OP-TEE execution and OP-TEE yields the control to Linux to handle that timer interrupt.
However, the sip.STIP is restored to "previous value" by openSBI, so Linux does not handle that timer interrupt and lead to the scheduling problem.
I remove the save/restore logic for CSR sip in openSBI, now the problem is solved.
Thanks!

@gagachang
Copy link
Contributor

Hello @maroueneboubakri

I see there are two configurations to control OP-TEE runs on which privilege level:
CFG_RISCV_M_MODE and CFG_RISCV_S_MODE
My question is: Do we really need to run OP-TEE on M-mode?
OP-TEE uses virtual memory system to manage TAs. Therefore, OP-TEE must run on S-mode, right?

In which scenario or architecture, or advantage, that OP-TEE needs to be run as M-mode.
I ask this because I am planning to remove M-mode code and focusing on S-mode, so that we can reduce maintenance effort.

fanghuaqi added a commit to riscv-mcu/qemu that referenced this issue Oct 11, 2024
This is a patch made for optee hardware support,
see OP-TEE/optee_os#6173

Signed-off-by: Huaqi Fang <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

7 participants