Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update ISA documentation #4

Open
wants to merge 25 commits into
base: main
Choose a base branch
from
Open

Update ISA documentation #4

wants to merge 25 commits into from

Conversation

dthaler
Copy link
Owner

@dthaler dthaler commented Sep 9, 2022

  • Add section numbers
  • Add glossary formatting for fields
  • Use consistent field names
  • Add appendix with opcode table
  • Add ISA version information
  • Add text about underflow, overflow, and division by zero
  • Add other text based on info in other documents

Updates since the LPC talk:

  • Move legacy packet instructions to Linux historical notes
  • Fix specification of divide-by-zero and modulo-zero
  • Note that imm is signed
  • Fix typo in Appendix
  • Add note about invalid instruction 0x8d used by clang -O0
  • Add src column to Appendix
  • Fix construction of imm64
  • Add additional 64-bit immediate instructions

Signed-off-by: Dave Thaler [email protected]

Copy link
Collaborator

@qmonnet qmonnet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a look over the various commits and looks good overall. Some comments inline below. Sorry, I didn't have the time/courage to go through all the dst value in the appendix table.

I'm thinking at this point it would be worth having a script to check whether a program is “valid ebpf-1.0”, and run it on a bunch of real-world programs to check if we missed some instructions (I did my best, but the example of the extended lddw shows how easy it is to miss some). Not sure if I can come up with something this week, I'll see.

``BPF_IND | BPF_W | BPF_LD`` (0x40) means::

R0 = ntohl(*(uint32_t *) (R6->data + src + imm))
*Linux implementation*: Details can be found in the `Linux historical notes <https://github.com/dthaler/ebpf-docs/blob/update/isa/kernel.org/linux-historical-notes.rst#legacy-bpf-packet-access-instructions>`_.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a script to update this URL between this repo and the kernel docs? It seems easy to forget to update this link before posting to the ML :)

Formatting nit: I find it easier to have long links like this referenced in the paragraph but defined below it. Would also make it easier to spot them for updating them by the way.

isa/kernel.org/instruction-set.rst Outdated Show resolved Hide resolved
isa/kernel.org/instruction-set.rst Outdated Show resolved Hide resolved
isa/kernel.org/instruction-set.rst Outdated Show resolved Hide resolved
isa/kernel.org/instruction-set.rst Outdated Show resolved Hide resolved
@qmonnet
Copy link
Collaborator

qmonnet commented Sep 22, 2022

Latest changes look good to me. No further comment from my side at this time :).

0x15 0x0 any if dst == imm goto +offset `Jump instructions`_
0x16 0x0 any if (uint32_t)dst == imm goto +offset `Jump instructions`_
0x17 0x0 any dst -= imm `Arithmetic instructions`_
0x18 0x0 0x00 dst = imm64 `64-bit immediate instructions`_
Copy link
Collaborator

@qmonnet qmonnet Sep 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Started a script to extract this table into a JSON list of instructions so I can run it against ELF files - I get complaints for this one. imm can obviously take any value on this line and the other 0x18 below :).

Should we have the same columns for dst and offset, by the way?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Started a script to extract this table into a JSON list of instructions so I can run it against ELF files - I get complaints for this one. imm can obviously take any value on this line and the other 0x18 below :).

Fixed, thanks.

Should we have the same columns for dst and offset, by the way?

My view: only if there are opcodes whose meaning varies by the value of dst or offset, which I don't believe is the case currently.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking more in terms of which instruction should have these fields always set to 0, and which don't have this constraint. My motivation was that it would help check further with the script. But this is maybe not sufficient to justify adding all columns to the table, so OK.

@dthaler dthaler force-pushed the update branch 3 times, most recently from 4759259 to 4f14f30 Compare September 22, 2022 20:37
eBPF program execution would result in division by zero,
the destination register is instead set to zero.
If execution would result in modulo by zero,
the destination register is instead set to the source value.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this the same as saying the destination is set to zero? The only case where this would trigger modulo by zero is if the source is zero.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, thanks for catching

Signed-off-by: Dave Thaler <[email protected]>
* Add section numbers
* Add glossary formatting for fields
* Use consistent field names
* Add appendix with opcode table
* Add ISA version information
* Add text about underflow, overflow, and division by zero
* Add other text based on info in other documents

Signed-off-by: Dave Thaler <[email protected]>
Also specify that div-by-zero results in zero
and mod-zero results in the source value

Signed-off-by: Dave Thaler <[email protected]>
Note that imm is signed
Fix typo in Appendix

Signed-off-by: Dave Thaler <[email protected]>
As reported by Shung-Hsi Yu

Signed-off-by: Dave Thaler <[email protected]>
In preparation for adding instructions that use it

Signed-off-by: Dave Thaler <[email protected]>
Signed-off-by: Dave Thaler <[email protected]>
Signed-off-by: Dave Thaler <[email protected]>
@dthaler dthaler force-pushed the update branch 3 times, most recently from c0145f4 to a9d4e2d Compare September 23, 2022 00:58
Signed-off-by: Dave Thaler <[email protected]>
Signed-off-by: Dave Thaler <[email protected]>
Signed-off-by: Dave Thaler <[email protected]>
BPF_AND 0x50 dst &= src
BPF_LSH 0x60 dst <<= src
BPF_RSH 0x70 dst >>= src
BPF_NEG 0x80 dst = ~src
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants