Skip to content

Add evm disassembly support.#3595

Closed
gogo2464 wants to merge 2 commits intorizinorg:devfrom
gogo2464:evm
Closed

Add evm disassembly support.#3595
gogo2464 wants to merge 2 commits intorizinorg:devfrom
gogo2464:evm

Conversation

@gogo2464
Copy link

Your checklist for this pull request

  • I've read the guidelines for contributing to this repository
  • I made sure to follow the project's coding style
  • I've documented or updated the documentation of every function and struct this PR changes. If not so I've explained why.
  • I've added tests that prove my fix is effective or that my feature works (if possible)
  • I've updated the rizin book with the relevant information (if needed)

Detailed description

Test plan

At this moment, there is mostly only the disassembly plugin that is under work.
Based on dead code. I will remove the dead code.
I have not added enough testing yet. I need to add more before merging.
I will add the plugin commit diff in the rizin book about how to code plugin with meson.

Closing issues
...

@gogo2464
Copy link
Author

Please do not review dead code.

I have issues at this line because 2 lines are printed:
6061 =>

PUSH1 0x61
PUSH2

instead of only PUSH1 0x61

What could I do please?

Copy link
Member

@ret2libc ret2libc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you pass 6061, evm_op[0x60].type is GB_8BIT, thus gbOpLength(evm_op[0x60].type) returns 1, meaning that the disassembled instruction has length 1. However, given that PUSH1 has an argument specified in the next byte, it should have a length 2. That's why the 61 after the 60 is interpreted as its own instruction.

@gogo2464 gogo2464 force-pushed the evm branch 5 times, most recently from bbfd5d3 to c9d5c48 Compare June 23, 2023 12:44
@gogo2464
Copy link
Author

The testing allows me to find a bug on CALLDATACOPY lol

@gogo2464
Copy link
Author

Using tsting I realized:

-PUSH3 0x010203
+PUSH3 0x123

What should I do in your opinion @ret2libc or what else ?

@gogo2464
Copy link
Author

I have made the testing from:

echo 000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f303132333435363738393a3b3c3d3e3f404142434445464748494a4b4c4d4e4f505152535455565758595a5b5c5d5e5f606162636465666768696a6b6c6d6e6f707172737475767778797a7b7c7d7e7f808182838485868788898a8b8c8d8e8f909192939495969798999a9b9c9d9e9fa0a1a2a3a4a5a6a7a8a9aaabacadaeafb0b1b2b3b4b5b6b7b8b9babbbcbdbebfc0c1c2c3c4c5c6c7c8c9cacbcccdcecfd0d1d2d3d4d5d6d7d8d9dadbdcdddedfe0e1e2e3e4e5e6e7e8e9eaebecedeeeff0f1f2f3f4f5f6f7f8f9fafbfcfdfeff | evmasm -d

@XVilka
Copy link
Member

XVilka commented Jun 23, 2023

@gogo2464 have you seen this? #3368

@gogo2464
Copy link
Author

@gogo2464 have you seen this? #3368

I have never ever seen this... this is terrible!!! I am going to check which version is better and where for both.

I think my disassembler is more tested and then more accurate than the old one. See my testing file. I also need to write my program for a school project. Merging it is not mandatory but could be a good plus.

Need your opinion.

@gogo2464
Copy link
Author

The analysis part of the other plan sounds good as well.

@gogo2464
Copy link
Author

I suggest to keep my disassemble plugin but keep his analysis plugin.

@gogo2464
Copy link
Author

@XVilka

@gogo2464
Copy link
Author

My code is very very well tested. There is no missing instruction .

@gogo2464 gogo2464 force-pushed the evm branch 2 times, most recently from 3cc0c1f to 6ff0ee1 Compare June 23, 2023 22:35
Comment on lines 162 to 161
instruction_size = 5;
strncpy(out, buf+1, instruction_size-1);
rz_hex_bin2str(out, instruction_size-1, opcode);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if len is less than instruction_size ? this will create an out of bound read. you need to add checks to verify you have enough buffer before running rz_hex_bin2str and if there is not, you should return an invalid instruction.

@gogo2464
Copy link
Author

sounds good to me

@gogo2464 gogo2464 marked this pull request as ready for review July 19, 2023 12:34
@gogo2464
Copy link
Author

All is very very finely tested. May be ready for a merge.

All the instructions can be assembled/disassembled except PUSH32 that assembles with:

7f0102030405060708090102030405060708090102030405060708090102030421

instead of

7f0102030405060708090102030405060708090102030405060708090102030405

You may help me or you may merge directly.

@gogo2464 gogo2464 force-pushed the evm branch 2 times, most recently from aa3a495 to d710868 Compare July 20, 2023 10:43
@gogo2464
Copy link
Author

Could you merge before the 21 at 4pm london utc please? It could be a great plus for a school project. Any comment is welcome before this date.

Comment on lines +19 to +25
if (!strcmp("stop", buf)) {
opbuf[0] = 0x00;
} else if (!strcmp("add", buf)) {
opbuf[0] = 0x01;
} else if (!strcmp("sub", buf)) {
opbuf[0] = 0x03;
} else if (!strcmp("div", buf)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i strongly suggest to make a structure with these bytes, it is easier to read and you can loop easily.

Comment on lines +144 to +151
char out[100];
int number = atoi(buf+4);
switch (number) {
case 1:
len = 2;
opbuf[0] = 0x60;
break;
case 2:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can optimize this out by using a structure like suggested before which contains name, length, and the first byte.

Comment on lines +279 to +291
for (int j = 8, i = 0; i < len-1; j += 2, i ++) {
two_chars_str[0] = buf[j];
two_chars_str[1] = buf[j+1];
two_chars_str[2] = '\0';
opbuf[i+1] = strtol(two_chars_str, NULL, 16);
}
} else if (number > 9) {
for (int j = 9, i = 0; i < len-1; j += 2, i ++) {
two_chars_str[0] = buf[j];
two_chars_str[1] = buf[j+1];
two_chars_str[2] = '\0';
opbuf[i+1] = strtol(two_chars_str, NULL, 16);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are methods in rizin to convert hex to bytes. also you don't really need to use strtol for doing that.

#include <string.h>
#include <stdlib.h>

static int evmAsm(RzAsm *a, RzAsmOp *op, const char *buf) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You always assume that buf contains the correct information, but what happens if its length is less than expected? based on your code, many out of bound reads. you should always check for boundaries.

const unsigned char opcode[256];
switch (buf[0]) {
case 0x00:
rz_str_cpy(buf_asm, "STOP") break;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need to use rz_str_cpy when you have to push data into RzStrBuf.
just do rz_strbuf_set for static strings and rz_strbuf_appendf for adding extra info to the string.

Comment on lines +219 to +220
instruction_size = 21;
rz_hex_bin2str(buf+1, instruction_size-1, opcode);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to always check if the buffer is big enough. you cannot just read it as is and assume it will always be.

@@ -0,0 +1,359 @@
// SPDX-FileCopyrightText: 2023 gogo <gogo246475@gmail.com>
// SPDX-FileCopyrightText: 2012-2020 pancake <pancake@nopcode.org>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove pancake. you wrote this, he didnt.

@@ -0,0 +1,374 @@
// SPDX-FileCopyrightText: 2023 gogo <gogo246475@gmail.com>
// SPDX-FileCopyrightText: 2013-2018 pancake <pancake@nopcode.org>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove pancake, you wrote this, not him.

Copy link
Member

@wargio wargio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check always if the buffer is long enough. this code has many security vulnerabilities.

@gogo2464
Copy link
Author

I will review this week.

@gogo2464
Copy link
Author

sure let me fix some issues.

@XVilka
Copy link
Member

XVilka commented Jan 11, 2024

I am closing this because of no activity. Please open a new one if you decide to continue.

@gogo2464
Copy link
Author

@XVilka I understand. I was confused with the table. Can I see this month if I will continue and rebase it please?

@wargio
Copy link
Member

wargio commented Jan 12, 2024

yes sure

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants