Implement the translator code in C/C++ #153
evm2wast.c
Outdated
| } | ||
| return ret; | ||
| } | ||
| int evm2wasm(char *evm_code, size_t len, char *wast_code, size_t wast_size) |
This should be called evm2wast. Also I think wast_size should be a pointer updated with the final length.
wast_code must be a pointer of a pointer if you want to use realloc below. Alternatively return the modified pointer (or null) instead of 0/-1.
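The two suggestions above could be combined roughly as follows. This is only a sketch: the parameter list, the return convention and the placeholder text emitted per opcode are assumptions, not the PR's actual code.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical signature following the review: the caller passes the address
 * of its buffer pointer so the buffer can be (re)allocated here, and the final
 * length is written through wast_size. Returns 0 on success, -1 on failure. */
int evm2wast(const char *evm_code, size_t len,
             char **wast_code, size_t *wast_size)
{
    size_t used = 0;
    char *out = malloc(1);
    if (out == NULL)
        return -1;
    out[0] = '\0';

    for (size_t i = 0; i < len; i++) {
        char chunk[16];
        /* Placeholder for the real gadget lookup: emit the opcode as text. */
        int n = snprintf(chunk, sizeof chunk, "(op 0x%02x)\n",
                         (unsigned)(unsigned char)evm_code[i]);
        char *tmp = realloc(out, used + (size_t)n + 1);
        if (tmp == NULL) {
            free(out);      /* realloc left the old block allocated */
            return -1;
        }
        out = tmp;
        memcpy(out + used, chunk, (size_t)n + 1);
        used += (size_t)n;
    }

    *wast_code = out;       /* the caller now owns the buffer */
    *wast_size = used;      /* length without the terminating NUL */
    return 0;
}
```

The caller would pass `&buf` and `&len` and free `buf` when done; on failure neither output parameter is written.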
evm2wast.c
Outdated
| continue; | ||
| } | ||
| wast_chunk = gadgets[(int)evm_code[i]]; | ||
| wast_code = realloc(wast_code, strlen(wast_code)+strlen(wast_chunk) + 3); |
There are no checks for allocation failures.
|
Need to change |
evm2wast.h
Outdated
| #ifndef __EVM2WAST_H | ||
| #define __EVM2WAST_H | ||
| #include <stdlib.h> | ||
| #include "gadgets.h" |
Does gadgets.h need to be included in the external interface?
nope
Can you remove it in that case?
evm2wast.h
Outdated
| #define __EVM2WAST_H | ||
| #include <stdlib.h> | ||
| #include "gadgets.h" | ||
| enum opcodes { |
Is this needed in the external interface?
only if you want to access opcodes somewhere after evm2wast output
Can you remove it in that case?
evm2wast.h
Outdated
| STATICAL, REVERT, | ||
| SELFDESTRUCT = 0XFF | ||
| }; | ||
| extern int evm2wasm(char *evm_code, size_t len, char *wast_code, size_t wast_size); |
This should be extern "C" {}.
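A header meant to be included from both C and C++ (here evm2wasm.cpp) is commonly wrapped like this. A minimal sketch: the guard name and the parameter list are assumptions based on the suggestions earlier in this review, not the final interface.

```c
#ifndef EVM2WAST_H
#define EVM2WAST_H

#include <stddef.h>  /* size_t */

#ifdef __cplusplus
extern "C" {         /* give the declarations C linkage when seen by a C++ compiler */
#endif

/* Hypothetical signature; the parameter list is an assumption. */
int evm2wast(const char *evm_code, size_t len,
             char **wast_code, size_t *wast_size);

#ifdef __cplusplus
}
#endif

#endif /* EVM2WAST_H */
```

The `#ifdef __cplusplus` guards keep the header valid C, since `extern "C"` itself is only understood by a C++ compiler.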
evm2wast.c
Outdated
| #include "evm2wast.h" | ||
| #include "gadgets.h" | ||
| #include "util.h" | ||
| #define digits(x) (floor(log10(abs(x))) + 1) |
I like inline functions better because type matching is worse with such complex macros.
For example I get this:
warning: taking the absolute value of unsigned type 'size_t' (aka 'unsigned long') has no effect [-Wabsolute-value]
note: remove the call to 'abs' since unsigned values cannot be negative
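One possible inline replacement for the macro, sketched with integer arithmetic only so that neither `abs` nor `log10` is involved. Note one deliberate difference from the macro: 0 is reported as one digit here.

```c
#include <stddef.h>

/* Number of decimal digits in x; 0 counts as one digit. */
static inline int digits(size_t x)
{
    int n = 1;
    while (x >= 10) {
        x /= 10;
        n++;
    }
    return n;
}
```

Because the parameter has a real type, passing a `size_t` no longer triggers the `-Wabsolute-value` warning quoted above.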
|
My commit collapsed all the comments, but most of them are still relevant - please check them out. |
evm2wast.c
Outdated
| } | ||
| char *with_segments = assemble_segments(jump_segments, 4); | ||
| wast_code = realloc(wast_code, strlen(wast_code) + strlen(with_segments)); | ||
| if(wast_code == 0) goto err; |
Note: realloc doesn't free the original memory space on failure.
only if the first param and the return value are different
b = realloc(a, size) will leave a untouched and set b to 0.
realloc has no understanding of what the lvalue is.
See the man pages:
For realloc(), the input pointer is still valid if reallocation failed. For reallocf(), the input pointer will have been freed if reallocation failed.
The realloc() function returns a pointer to the newly allocated memory,
which is suitably aligned for any built-in type and may be different from
ptr, or NULL if the request fails
....
If realloc()
fails, the original block is left untouched; it is not freed or moved.
implementation dependent but as the README.md states we go with glibc
I do not follow.
For one, it doesn't matter what the README says; you cannot claim "go with glibc" since it will definitely not use glibc when compiling as a contract.
Second, I think it is clear what the manual is saying. The input pointer passed to realloc is not freed upon failure.
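The usual way to stay safe regardless of the assignment target is to keep the result of `realloc` in a temporary until it is known to be non-NULL. A minimal sketch; `append_str` is a hypothetical helper name, not something from the PR.

```c
#include <stdlib.h>
#include <string.h>

/* Append chunk to *buf, growing it with realloc. *buf may be NULL, which is
 * treated as an empty string. On failure returns -1 and leaves *buf valid and
 * unchanged, so the caller still owns it and must free it. */
static int append_str(char **buf, const char *chunk)
{
    size_t old_len = *buf ? strlen(*buf) : 0;
    size_t add_len = strlen(chunk);

    char *tmp = realloc(*buf, old_len + add_len + 1);
    if (tmp == NULL)
        return -1;              /* the old block is untouched, not freed */

    memcpy(tmp + old_len, chunk, add_len + 1);   /* copies the NUL too */
    *buf = tmp;
    return 0;
}
```

Writing `p = realloc(p, n)` directly would overwrite the only copy of the pointer with NULL on failure, which is exactly the leak discussed in this thread.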
|
Please do not remove the Javascript code and rebase this against master. |
|
Please write a shell script (and CI entry) which
They should be identical. Then we can just rely on @cdetrio's tests on evm2wasm.js. Later we can run each of @cdetrio's tests twice, transpiling with both evm2wasm.js and evm2wasm.cpp. |
|
Include |
|
Oh we need to include running |
gen_gadgets.sh
Outdated
| @@ -0,0 +1,99 @@ | |||
| cat <<EOF > gadgets.h | |||
I wonder if this should be a c file instead? I'm not a big fan of including giant data structures in headers because they are pulled in every time the header is included.
Probably generating gadgets.c with the content and gadgets.h with the reference would be a good idea.
The header can be something like:
#ifndef __EVM2WASM_GADGETS_H
#define __EVM2WASM_GADGETS_H
/* Defined in gadgets.c */
extern char *gadgets[256];
#endif
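For illustration, the split could look like the sketch below, shown as one listing with the two files marked by comments. The guard name, the `const` qualifier and the gadget strings are placeholders, not the generated content.

```c
/* ---- gadgets.h (sketch): declaration only, cheap to include anywhere ---- */
#ifndef EVM2WASM_GADGETS_H
#define EVM2WASM_GADGETS_H
extern const char *gadgets[256];
#endif

/* ---- gadgets.c (sketch): the single definition, generated by the script ---- */
const char *gadgets[256] = {
    [0x01] = "(call $ADD)",   /* placeholder text, not the real gadget */
    [0x02] = "(call $MUL)",   /* entries not listed stay NULL */
};
```

The `extern` keyword matters: without it, every file including the header would carry its own tentative definition of the table.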
evm2wast.c
Outdated
| return wasm; | ||
| err: | ||
| free(brtable); | ||
| free(wasm); |
This could result in freeing NULL if the brtable or wasm allocation fails.
evm2wast.c
Outdated
| err: | ||
| free(brtable); | ||
| free(wasm); | ||
| free(wasm); |
This is a double free here.
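A single cleanup label can handle both concerns if every pointer starts as NULL and is freed exactly once. A minimal sketch; `make_module` and its contents are hypothetical stand-ins for the function under review.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper with two allocations and one cleanup label. */
static char *make_module(const char *body)
{
    char *brtable = NULL;
    char *wasm = NULL;

    brtable = malloc(16);
    if (brtable == NULL)
        goto err;
    strcpy(brtable, "(br_table)");      /* placeholder content, 11 bytes */

    /* room for brtable, one space, body and the terminating NUL */
    wasm = malloc(strlen(brtable) + strlen(body) + 2);
    if (wasm == NULL)
        goto err;
    strcpy(wasm, brtable);
    strcat(wasm, " ");
    strcat(wasm, body);

    free(brtable);
    return wasm;            /* ownership passes to the caller */

err:
    free(brtable);          /* free(NULL) is defined as a no-op in standard C */
    free(wasm);             /* each pointer is freed exactly once */
    return NULL;
}
```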
|
Do you want to leave a bit more space in the code, e.g. between |
evm2wast.c
Outdated
| #include <stdarg.h> | ||
| #include "evm2wast.h" | ||
| #include "gadgets.h" | ||
| int digits(unsigned long x) { return (floor(log10(abs(x))) + 1); } |
Should be static if we need this helper. Also why place it above the global variables and not next to the functions?
evm2wast.h
Outdated
|
|
||
| enum op_nums { | ||
| STOP = 0x00, | ||
| ADD, MUL, SUB, DIV, SDIV, MOD, SMOD, |
Do you want to keep all these as one-per-line? Makes new additions much easier to spot.
evm2wast.h
Outdated
| STOP = 0x00, | ||
| ADD, MUL, SUB, DIV, SDIV, MOD, SMOD, | ||
| ADDMOD, MULMOD, EXP, SIGNEXTEND, | ||
| LT = 0X10, GT, SLT, SGT, EQ, ISZERO, |
Can you keep using lowercase x in the hex values?
typo
|
@Silur can you squash all this branch down to a single commit and rebase it? And make sure afterwards not to touch the javascript files and just exclude the old makefile. |
|
This is a minimal change to run gen_gadgets.sh from cmake:
diff --git a/CMakeLists.txt b/CMakeLists.txt
index 42d408f..6f3cc96 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -11,9 +11,16 @@ project(evm2wasm)
include(ProjectBinaryen)
+add_custom_target(gen_gadgets
+ ${PROJECT_SOURCE_DIR}/gen_gadgets.sh
+ WORKING_DIRECTORY ${PROJECT_SOURCE_DIR}
+)
+
add_executable(evm2wasm
evm2wasm.cpp
evm2wast.c
+ gadgets.c
)
+add_dependencies(evm2wasm gen_gadgets)
target_link_libraries(evm2wasm PRIVATE binaryen::binaryen)
\ No newline at end of file
diff --git a/gen_gadgets.sh b/gen_gadgets.sh
index 46b2c72..520411a 100755
--- a/gen_gadgets.sh
+++ b/gen_gadgets.sh
@@ -1,3 +1,5 @@
+#!/bin/bash
+
cat <<EOF > gadgets.c
#ifndef __EVM2WASM_GADGETS_H
#define __EVM2WASM_GADGETS_H |
| return wasm; | ||
| err: | ||
| perror("Falied to allocate memory for assemble_segments"); | ||
| return 0; |
Memory isn't freed in case of an error?
evm2wasm.cpp
Outdated
| wasm::Fatal() << "error in parsing input"; | ||
| } | ||
|
|
||
| // FIXME: perhaps call validate() here? |
Why remove this?
evm2wasm.cpp
Outdated
| fseek(fd, 0, SEEK_END); | ||
| size_t offset = ftell(fd); | ||
| rewind(fd); | ||
| char *code = (char*)malloc(8192); |
If offset is the file size, then why not use it here?
oh yea I forgot to use that :D
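Sizing the buffer from the measured file length might look like this. A sketch only; `read_all` is a hypothetical helper, and it assumes a seekable stream whose size fits in a `long`.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read the whole stream into a NUL-terminated heap buffer sized from the
 * file length. Returns NULL on any failure; on success *len is the number
 * of bytes read and the caller frees the buffer. */
static char *read_all(FILE *fd, size_t *len)
{
    if (fseek(fd, 0, SEEK_END) != 0)
        return NULL;
    long size = ftell(fd);
    if (size < 0)
        return NULL;
    rewind(fd);

    char *code = malloc((size_t)size + 1);
    if (code == NULL)
        return NULL;

    if (fread(code, 1, (size_t)size, fd) != (size_t)size) {
        free(code);
        return NULL;
    }
    code[size] = '\0';
    *len = (size_t)size;
    return code;
}
```

This removes the fixed 8192-byte limit, so larger inputs are not truncated or overrun.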
evm2wast.c
Outdated
| char *check = ""; | ||
| char *template = ""; | ||
| char *check = malloc(1); | ||
| char *template = malloc(0); |
I am not really fond of malloc(1) and malloc(0) but realloc(NULL, size) is perfectly valid so these can be set to NULL
can't, because below we use strlen on them, which would cause a segfault
| add_executable(evm2wasm | ||
| evm2wasm.cpp | ||
| evm2wast.c | ||
| gadgets.c |
Indentation is off
|
Please read #153 (comment) and please mark every |
libs/evm2wasm/evm2wast.c
Outdated
| char *bytes = padleft(evm_code+1, evm_code[i]-0x5f, 32); | ||
| i+=(size_t)evm_code[i]-0x5f; | ||
| if(!bytes) goto err; | ||
| int bytes_rounded = (int)ceil((double)(evm_code[i-1]/8)); |
If you can, please avoid using float/double and math ops.
(evm_code[i - 1] + 7) / 8 should have the same effect.
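As a small illustration of the integer form: in the quoted line the division `evm_code[i-1]/8` is already done in integer arithmetic before the cast, so `ceil` has nothing left to round up. The helper name below is hypothetical.

```c
#include <stddef.h>

/* Number of 8-byte words needed to hold n bytes: ceil(n / 8) without floats. */
static inline size_t words_for_bytes(size_t n)
{
    return (n + 7) / 8;
}
```

Adding 7 before dividing pushes any non-zero remainder over the next multiple of 8, which is what rounds the result up.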
libs/evm2wasm/evm2wast.c
Outdated
| } | ||
| return ret; | ||
| } | ||
| static int64_t bytes2long(char *bytes) |
bytes can be const here
| if(!ret) | ||
| { | ||
| perror("allocation error "); | ||
| return 0; |
This isn't really freeing the memory?
libs/evm2wasm/evm2wast.c
Outdated
| return 0; | ||
| } | ||
|
|
||
| static char *add_stack_check(char *segment) |
There is a realloc below and this pointer is lost for the caller.
libs/evm2wasm/evm2wast.c
Outdated
| stack_delta = 0; | ||
| return segment; | ||
| err: | ||
| free(template); |
String literals (as they may be placed into read-only sections) cannot be freed. You assign string literals to template above.
libs/evm2wasm/evm2wast.c
Outdated
| } | ||
| return --i; | ||
| } | ||
| char *build_module(char *wast) |
wast can be const
| return ret; | ||
| }*/ | ||
|
|
||
| static int index_of(char **table, char *elem, int len) |
elem can be const
libs/evm2wasm/evm2wast.c
Outdated
| static char *add_stack_check(char **segment) | ||
| { | ||
| char *check = malloc(1); | ||
| char *template = malloc(1); |
These two are never freed.
|
Please do a rebase and perhaps squash it down? |
libs/evm2wasm/evm2wast.c
Outdated
| sprintf(*wast, "(call $useGas (i64.const %ld)) %s", gas_count, *segment); | ||
| *segment = ""; | ||
| sprintf(*wast, "%s (call $useGas (i64.const %ld)) %s", *wast, gas_count, *segment); | ||
| free(*segment); |
After this free the caller will have an invalid pointer.
libs/evm2wasm/evm2wast.c
Outdated
| } | ||
| sprintf(*wast, "%s (call $useGas (i64.const %ld)) %s", *wast, gas_count, *segment); | ||
| free(*segment); | ||
| *segment = ""; |
This will put a literal into a pointer which is freed later on and throws away the pointer which needs to be freed (memory leak).
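One way to keep ownership consistent is to make sure `*segment` always points at heap memory (or NULL), never at a literal. A minimal sketch under that assumption; `reset_segment` is a hypothetical helper.

```c
#include <stdlib.h>

/* Replace *segment with a freshly allocated empty string. The old buffer is
 * freed only after the new one exists, so on failure (-1) the caller's
 * pointer is still valid. After success *segment can be passed to realloc
 * or free like any other heap pointer. */
static int reset_segment(char **segment)
{
    char *empty = malloc(1);
    if (empty == NULL)
        return -1;
    empty[0] = '\0';

    free(*segment);
    *segment = empty;
    return 0;
}
```

Because the update goes through `char **`, the caller's copy of the pointer is changed as well and never dangles.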
|
I'd suggest try using |
| case PUSH29: | ||
| case PUSH30: | ||
| case PUSH31: | ||
| case PUSH32: |
Could use a case range, case PUSH1 ... PUSH32, with the GNU extensions (gnu99) :)
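With the GNU case-range extension (accepted by GCC and clang, not part of ISO C99), the 32 labels collapse into one. A sketch: the enum below lists only the two endpoints, and `push_len` is a hypothetical helper.

```c
/* PUSH1..PUSH32 occupy the EVM opcodes 0x60..0x7f. */
enum { PUSH1 = 0x60, PUSH32 = 0x7f };

/* Number of immediate bytes following a PUSHn opcode, or 0 for other opcodes. */
static int push_len(unsigned char op)
{
    switch (op) {
    case PUSH1 ... PUSH32:      /* GNU case range; keep the spaces around ... */
        return op - 0x5f;
    default:
        return 0;
    }
}
```

A portable alternative is a plain `if (op >= PUSH1 && op <= PUSH32)` check before the switch.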
| @@ -0,0 +1,55 @@ | |||
| #!/bin/bash | |||
| cstr () { | |||
| sed 's/;;.*//g' $1 2>/dev/null | sed '$!s/$/\\n\\/' | sed 's/"/\\"/g' | |||
Please fix this so it can handle empty trailing newlines.
|
If you still have memory issues, build with Address Sanitizer (GCC, clang): |
Reimplement the translator in C
minor memory fixes
shitty size fixes
shitty size fixes
remove debug printf
|
I don't think you're actually doing a rebase, because running a rebase on this results in a lot of conflicts. There also seem to be quite a few back-and-forth changes. |
|
Pushed a rebase into the Also, by re-enabling the compiler warnings, it spits out a lot of potential cases for memory corruption (and their potential solutions): |
|
@chfast I already hooked up |
|
Superseded by #208. |
Part of #4.