Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding Support for Zba, Zbb, Zbc and Zbs extensions to CVA6 #878

Merged
merged 21 commits into from
Jul 6, 2022

Conversation

M-Ijaz-10x
Copy link
Contributor

Introduction

This PR is the continuation of issue-451 and adds support for zba, zbb, zbc and zbs extensions under the wrapper of a compiler directive BITMANIP. These changes have been tested with self-written single instruction tests and using the compliance tests with the core-v-verif infrastructure.

Implementation

Added the support for all the ratified bitmanip extensions as defined under this bitmanip-spec

  • Zba extension (Address generation instructions)
  • Zbb extension (Basic bit-manipulation instructions)
  • Zbc extension (Carry-less multiplication instructions)
  • Zbs extension (Single-bit instructions)

Integration of bitmanip hardware with cva6

The major changes have been done in the decoder which includes adding the RISC-V Bitmanip opcodes, selecting the Functional Unit from Multiplier and ALU of the Ariane core and implementing design logic for Bitmanip instructions in ALU and Multiplier FUs.

Verification

Assembly Tests

The functionality of each implemented Bitmanip instruction has been verified by running individual assembly tests related to that particular instruction on the updated design.

Core-v-verif environment

The OpenHW group’s core-v-verif verification environment has been modified to support the latest riscv-arch-test infrastructure (PR: Added the support to run latest riscv-arch-tests in core-v-verif CVA6 infrastructure) and used to run the compliance tests (base tests + bitmanip tests) on the updated design. All the required bitmanip tests have successfully passed.

@M-Ijaz-10x M-Ijaz-10x requested a review from zarubaf as a code owner May 16, 2022 12:40
Copy link
Contributor

@zarubaf zarubaf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is really great, thanks for the pull request! Logic looks good to me in general.

The only comments I have are around we handle the inclusion, we usually use parameters for that and try to avoid ifdefs as much as possible. So I've added some comments on how I think we can make this work using only parameters :-) Please do let me know whether there is something unclear from the comments?

Makefile Outdated Show resolved Hide resolved
core/Flist.cv64a6_imafdc_sv39 Outdated Show resolved Hide resolved
core/alu.sv Outdated Show resolved Hide resolved
core/alu.sv Outdated Show resolved Hide resolved
core/alu.sv Outdated Show resolved Hide resolved
core/multiplier.sv Outdated Show resolved Hide resolved
core/multiplier.sv Outdated Show resolved Hide resolved
core/multiplier.sv Outdated Show resolved Hide resolved
core/multiplier.sv Outdated Show resolved Hide resolved
core/multiplier.sv Outdated Show resolved Hide resolved
@M-Ijaz-10x
Copy link
Contributor Author

Thank you @zarubaf for the review.
I have made all the suggested changes. Please let me know if I am missing anything. Thanks :)

Copy link
Contributor

@zarubaf zarubaf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Terrific, thanks soo much. Only minor code formatting things and then we are good to go!

core/Flist.cv64a6_imafdc_sv39 Outdated Show resolved Hide resolved
core/alu.sv Outdated Show resolved Hide resolved
core/decoder.sv Outdated Show resolved Hide resolved
core/multiplier.sv Show resolved Hide resolved
@JeanRochCoulon
Copy link
Contributor

That's a very great contribution !! Thank for that

As I am not in office in the coming days, I will available to check the PR from June1st to check max frequency, gate count, and functional regression in 32 and 64 bits. I let you know.

@M-Ijaz-10x
Copy link
Contributor Author

That's a very great contribution !! Thank for that

As I am not in office in the coming days, I will available to check the PR from June1st to check max frequency, gate count, and functional regression in 32 and 64 bits. I let you know.

Sure. No Problem!

Copy link
Contributor

@zarubaf zarubaf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything on my end is green! I will trigger the CI and @JeanRochCoulon you can merge if the PPA is okay for you! Thanks again @M-Ijaz-10x, that is a great contribution!

@M-Ijaz-10x M-Ijaz-10x closed this May 24, 2022
@M-Ijaz-10x M-Ijaz-10x reopened this May 24, 2022
@M-Ijaz-10x
Copy link
Contributor Author

Everything on my end is green! I will trigger the CI and @JeanRochCoulon you can merge if the PPA is okay for you! Thanks again @M-Ijaz-10x, that is a great contribution!

Happy to contribute. :) Looking forward to contribute more in the future developments.

core/alu.sv Outdated Show resolved Hide resolved
core/alu.sv Outdated Show resolved Hide resolved
core/alu.sv Outdated Show resolved Hide resolved
core/alu.sv Outdated Show resolved Hide resolved
core/alu.sv Outdated Show resolved Hide resolved
core/alu.sv Outdated Show resolved Hide resolved
core/alu.sv Outdated Show resolved Hide resolved
@M-Ijaz-10x
Copy link
Contributor Author

While making the suggested changes on the PR, a bug was overlooked in the design related to exceptions (illegal instruction in the decoder) which was probably causing the tests to fail (CI checks timed-out). This is fixed in the latest commit on this PR. These changes are tested by running RISC-V Compliance tests and Bitmanip tests which came out to be clean now.
@zarubaf Can you please review the new changes and trigger the CI again? Apologies for the inconvenience. Thank you.

@JeanRochCoulon
Copy link
Contributor

Hello, thanks again for this contribution. I would request some modifications to make VCS and synthesis work.

By running the Continuous Integration, I get some errors:

  • During synthesis of 32bits version, Error: ../..//core/csr_regfile.sv:574: Array index out of bounds instret[63:32], valid bounds are [31:0]. (ELAB-298)
  • During synthesis 32bits, Error: ../..//core/alu.sv:179: Array index out of bounds fu_data_i[63:0], valid bounds are [31:0]. (ELAB-298)
  • When simulating with VCS, gen_bitmanip name is used twice in multiplier.sv

In another hand, as we would be pleased to enable/disable the bitmanip support by CVA6, could you add a config variable, as done in core/include/cv32a6_XXX_config_pkg.sv to configure it? The generation of the RTL dedicated to bitmanip should be conditioned by it (decode, alu,...).

Regards

@M-Ijaz-10x
Copy link
Contributor Author

  • During synthesis of 32bits version, Error: ../..//core/csr_regfile.sv:574: Array index out of bounds instret[63:32], valid bounds are [31:0]. (ELAB-298)

Hello @JeanRochCoulon, the above error was due to the changes made in PR-847. Above commit proposed a workable solution for this error, but @spersvold will need to review the fix.

The remaining two errors have also been resolved in the latest commits.

In another hand, as we would be pleased to enable/disable the bitmanip support by CVA6, could you add a config variable, as done in core/include/cv32a6_XXX_config_pkg.sv to configure it? The generation of the RTL dedicated to bitmanip should be conditioned by it (decode, alu,...).

From the start, we are confining all the bitmanip RTL changes under the bitmanip parameter defined in ariane_pkg.sv as suggested by @zarubaf

core/alu.sv Outdated
Comment on lines 285 to 291
ROL: result_o = (fu_data_i.operand_a << fu_data_i.operand_b[5:0]) | (fu_data_i.operand_a >> (riscv::XLEN-fu_data_i.operand_b[5:0]));
ROLW: result_o = {{riscv::XLEN-32{rolw[31]}}, rolw};
ROR, RORI: result_o = (fu_data_i.operand_a >> fu_data_i.operand_b[5:0]) | (fu_data_i.operand_a << (riscv::XLEN-fu_data_i.operand_b[5:0]));
RORW, RORIW: result_o = {{riscv::XLEN-32{rorw[31]}}, rorw};
Copy link

@ZenithalHourlyRate ZenithalHourlyRate Jun 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#878 (comment)

Ahh, it seems that you did not resolve it. It turned out that I knew another technique but forget to add it in last review

  1. It is observed that rol(x) = ror(reverse(w)), you can use ror(shift_op_a) to get rol(a), just as the current design uses srl to get sll
  2. It is observed that rorw(x[31:0]) = ror({x[31:0],x[31:0]})[31:0]. If you still want to leave RORW and ROR as two separate datapaths, at least RORW and ROLW could be merged.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @ZenithalHourlyRate for your review, really appreciate it. I will make this last change in a separate follow-up PR. For now, I am keeping it in my notes. :)

Copy link
Contributor

@spersvold spersvold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change looks good to me. However, I noticed that "instret" will not work as intended when in 32-bit mode if you wrap it (because the incrementing logic in lines 333-349 is only dealing with the temporary variable "instret" which is xlen_t only, and then casting to instret_d after).

This should be fixed in a different commit though, so I approve this one.

@JeanRochCoulon
Copy link
Contributor

JeanRochCoulon commented Jun 7, 2022

Hello,
I get another eror compiling with VCS:
Error-[IRIPS] Illegal range in part select core-v-verif/core-v-cores/cva6/core/alu.sv, 191
The range of the part select is illegal: fu_data_i.operand_a[(riscv::XLEN - 33):0]
As you know, XLEN can be equal to 32... In that case, XLEN-33 is negative !
Have a nice day

@JeanRochCoulon
Copy link
Contributor

I synthetized in 32 bits version, please find the error reported by dc_shell Synsopsys tool:

Error: ../..//core/alu.sv:287: Array index out of bounds fu_data_i[63:56], valid bounds are [31:0]. (ELAB-298)
*** Presto compilation terminated with 1 errors. ***
Information: Building the design 'alu'. (HDL-193)
Warning: Unable to resolve reference 'alu' in 'ex_stage_1'. (LINK-5)

Regards

@fatimasaleem fatimasaleem force-pushed the riscv-bitmanip-ext branch 2 times, most recently from 9e1c6b7 to 20a181a Compare June 28, 2022 09:45
@zchamski
Copy link
Contributor

zchamski commented Jul 1, 2022

Hi @fatimasaleem, in order to make the validation of CVA6 synthesis smoother, we are implementing synthesis error/warning message extraction in the Thales public CI dashboard (https://riscv-ci.pages.thales-invia.fr/dashboard/dashboard_cva6.html).

Since this is still work in progress, here are the summaries of synthesis error/warning logs produced by our prototype filter on your latest push (20a181a). Please let @JeanRochCoulon, @yanicasa and myself (@zchamski) know if they are sufficient for you to work on your code.

Synthesis 32b (cv32a60x):
cva6-bitmanip-synth-errors-32b.g20a181a.txt

Synthesis 64b (cv64a6_imafdc_sv39):
cva6-bitmanip-synth-errors-64b.g20a181a.txt

As soon as the log extractor is stable (and incorporates the required usability adjustments if any) we will announce it in due way.

@fatimasaleem
Copy link
Contributor

fatimasaleem commented Jul 3, 2022

Hi @zchamski
I have made the required changes, can you please trigger the CI checks again? Thanks. cc @JeanRochCoulon

FYI: I have also run the latest arch-tests regression using the core-v-verif infrastructure with the PR-1254, it would be great if we could add these new arch-tests in the CI checks also.

@zchamski
Copy link
Contributor

zchamski commented Jul 4, 2022

Hi @fatimasaleem, the synthesis in the CI looks good! BTW, the Thales public CI is triggered automatically on every Github push to either cva6 or core-v-verif, and you can check the CI results as follows:

  • Go to the Thales CI dashboard landing page at https://riscv-ci.pages.thales-invia.fr/dashboard/
  • Select the CVA6 dashboard:
    dashboard_homepage
  • Select the CI pipeline of interest and click on the PASS/FAIL status button:
    dashboard_cva6_pipelines
  • Select a job of interest (typically 'ASIS Synthesis cv32a60x' or 'ASIC Synthesis cv64a6_imafdc_sv39') and click on its PASS/FAIL button:
    dashboard_pipeline_jobs
  • This will display the log summaries of the given job. For synthesis, the summaries cover total area (in gate equivalents), area-by-category figures (gate equivs and percentage), area-by-package figures (again gate equivs and percentage), and finally, any error/warning messages.
    dashboard_synthesis_job_details_1of2
    dashboard_synthesis_job_details_2of2

Please let the usual suspects (@zchamski, @yanicasa, @JeanRochCoulon) know about your experience of using the Thales CI dashboard.

And to answer an almost certain question: no, there are no direct links to the inside of the dashboard, you have to click your way through...

@fatimasaleem
Copy link
Contributor

Thank you @zchamski for the details. It really is very helpful.

@JeanRochCoulon Since the synthesis and regression run look green, can we merge this? or is there anything else that needs to be done/fixed?

Thanks & Regards,

@JeanRochCoulon
Copy link
Contributor

Hello @fatimasaleem @M-Ijaz-10x @spersvold @zchamski
Good news, no regression has been detected ! I would like to thank you for this very good contribution. After merging this current PR, I will define bitmanip as a HW config parameter in config_pkg.sv. In cv64_imac_sv39, it will be enabled, and in cv32a60x config disabled. I will merge your PR as soon as the github action is pass. Cheers

@JeanRochCoulon
Copy link
Contributor

In parallel, @ASintzoff will have a look to the arc-test suite PR. It would be nice to get it

@JeanRochCoulon JeanRochCoulon merged commit c23eed5 into openhwgroup:master Jul 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants