Add support for instructions and source ranges. by aarlt · Pull Request #15368 · argotorg/solidity

aarlt · 2024-08-28T15:40:19Z

to reduce test-size ethdebug data will now be removed from standard-json based cmdline tests
simple smoke test that is just checking the basic structure of instructions
ethdebug related command line tests that are not using standard-json got removed - removal of ethdebug data is easier with standard-json - correct behaviour of the command line is still checked in soltest test suit

libsolidity/interface/CompilerStack.h

clonker · 2025-01-28T08:12:54Z

libsolidity/interface/CompilerStack.cpp

+			instruction["arguments"] = arguments;
+		}
+
+		solAssert(assembly->codeSections().size() == 1);


Why can there only be one?

only eof can have more

But this should be solUnimplemented not an assert. Or rather, it can be an assert but only if you can assume it will never fail. And for this in all outputs exposed to the user you should have solUnimplemented checks that EOF is not enabled (note that existing outputs already have such checks).

The difference is that the user will get a nice message that this is not supported on EOF rather than an ugly ICE that he will report as a bug. On top of that, with solUnimplemented you can (and should) have a test covering that corner case.

I just changed that to an solUnimplementedAssert. need to add a test.

ok.. simple test added now

libsolidity/interface/CompilerStack.cpp

clonker · 2025-02-07T11:47:51Z

I am still not convinced having 28k lines of test output is a good idea. There is no real way of assessing whether the test result is correct and in case of a test expectation mismatch I imagine one can only really see that "something" changed. Is there a way to make this more concise?

veniger · 2025-02-10T14:42:43Z

Are there issues tracking ethdebug support? Could we create an issue for each part of the spec, so we can track what is supported and what is in development (like: source locations, local variables, etc.) @ekpyron

clonker

First impressions :) I'll have another go at this tomorrow

libsolidity/interface/CompilerStack.cpp

libevmasm/LinkerObject.h

libsolidity/interface/CompilerStack.cpp

test/libsolidity/StandardCompiler.cpp

libsolidity/interface/CompilerStack.cpp

test/cmdlineTests/ethdebug_eof_container_osaka/input.sol

test/libsolidity/StandardCompiler.cpp

cameel

I skimmed through the PR and found some things that looks like bugs and discrepancies with the spec. Is this tested properly?

cameel · 2025-03-02T04:46:29Z

test/cmdlineTests.sh

+        if grep -q "ethdebug" "$stdout_path";
+        then
+          jq --indent 4 '(. | .. | objects | select(has("ethdebug"))) |= (.ethdebug = "<ETHDEBUG DEBUG DATA REMOVED>")' "$stdout_path" > tmp && mv tmp "$stdout_path"
+        fi


Don't assume you can create random junk files in the test dir. It may not even be writable. Use a proper temporary file or capture it in a variable.

And please don't randomly switch to 2-space indents in the middle of the file.

Also, why is grep needed here? If you're trying to ensure that the .ethdebug key exists, this will have false-positives. You should check that using jq.

ok, I will use a proper termporary file for this.

however, I did that grepstuff because our pretty-json is sadly creating a different formatting as jq - so I decided to only do this reformatting if ethdebug is found in the output - I know its not very ellegant. however, maybe we should always use jq for json reformatting - but this should be a different PR.

I meant why use grep if you can check that more reliably with jq. See #15368 (comment).

I did that grepstuff because our pretty-json is sadly creating a different formatting as jq

Ah, right, I did not realize it was messing up the formatting. You should really add comments for such things. But ok, I guess, as ugly as this is, it is unavoidable.

We should at some point extract all this logic for cleaning output files into a separate function or even script. That would at least keep the runner script more compact. But not in this PR in any case.

however, maybe we should always use jq for json reformatting - but this should be a different PR.

Well, I used to not like that idea, but considering the awful regexes we need to cut out the other things, maybe this would actually be lesser evil here.

The annoying thing is that this would reformat all our tests and create a massive PR.

libevmasm/LinkerObject.h

libsolidity/interface/CompilerStack.cpp

cameel · 2025-03-02T05:19:25Z

libsolidity/interface/CompilerStack.cpp

+					{
+						"definition",
+						{{
+							"sources",


Assuming this is from Source range schema, shouldn't this be singular?

Suggested change

"sources",

"source",

Also, what's up with the double braces? Doesn't this create a one-element array or something?

You also have a weird double indent under "contract".

with nlohmann json double braces are needed to ensure that an object is created - if not - depending on the structure - it may become an array

Hmm... is this why you now replaced it all with assignments and manually created objects? :)

libsolidity/interface/CompilerStack.cpp

libevmasm/Assembly.cpp

libevmasm/Ethdebug.cpp

libsolidity/interface/StandardCompiler.cpp

solc/CommandLineInterface.cpp

test/libsolidity/StandardCompiler.cpp

clonker · 2025-03-06T11:23:12Z

libevmasm/Assembly.cpp

 	uint8_t dataRefPush = static_cast<uint8_t>(pushInstruction(bytesPerDataRef));

+	size_t instructionIndex = 0;
 	for (AssemblyItem const& item: items)


A more modern version of manually incrementing the instruction index would be to use ranges::views::enumerate(items) :)

Amounts to something like for (auto const& [instructionIndex, item]: enumerate(items)) { ... }

At first I changed that to ranges::views::enumerate but then I thought it would be better readable to introduce that addInstructionOffset lambda.

I think it would be more readable if you made it side-effect free and instead explicitly passed ret and instructionIndex into the function. The code is easier to understand when you see explicitly what goes in and out.

clonker · 2025-03-06T11:27:41Z

libevmasm/LinkerObject.h

+	/// Vector that stores bytecode offsets & instruction index per instruction.
+	/// offset: (first) points to the beginning of each instruction (including the opcode itself).
+	/// instruction index: (second) instruction index.
+	/// the instruction index can be used to find the related instruction, if a single "complex instruction"
+	/// generated multiple instructions during assembly. This is the case with e.g. AssignImmutable.
+	/// It generates multiple instruction from a single instruction. To be able to keep track of
+	/// the original instruction indices the original instruction index is stored.


Most of that doc can go to the struct

libevmasm/LinkerObject.h

libevmasm/Assembly.cpp

libevmasm/Ethdebug.cpp

libyul/YulStack.cpp

solc/CommandLineInterface.cpp

libevmasm/Ethdebug.cpp

Changelog.md

test/cmdlineTests.sh

ekpyron

I'd actually have a few suggestions for simplifying this in the hot path during assembling, but we can leave that for the second iteration of this. Also a few minor comments in that path, but none crucial.

Other than that, this won't affect regular compilation, so as far as I'm concerned it's good to merge as a first iteration of this.

libevmasm/Assembly.cpp

libevmasm/Assembly.h

libevmasm/Ethdebug.cpp

libevmasm/Assembly.cpp

cameel

Since you're still working on this, I decided I'd add my suggestions as comments after all rather than pushing fixups.

The biggest one is really the one about the helper. I'd really prefer to keep the Assembly changes sane, since it's already such a mess.

But overall, nothing critical. In terms of functionality the PR seems fine. Though TBH I only did a light review and I didn't really review the testing part.

cameel · 2025-03-10T21:55:27Z

libevmasm/Assembly.cpp

+	auto assembleInstruction = [&](auto&& _addInstruction) {
+		size_t start = ret.bytecode.size();
+		_addInstruction();
+		size_t end = ret.bytecode.size();
+		codeSectionLocation.instructionLocations.emplace_back(
+			LinkerObject::InstructionLocation{
+				.start = start,
+				.end = end,
+				.assemblyItemIndex = assemblyItemIndex
+			}
+		);
+	};


Regarding the helper, I had something like this in mind:

Suggested change

auto assembleInstruction = [&](auto&& _addInstruction) {

size_t start = ret.bytecode.size();

_addInstruction();

size_t end = ret.bytecode.size();

codeSectionLocation.instructionLocations.emplace_back(

LinkerObject::InstructionLocation{

.start = start,

.end = end,

.assemblyItemIndex = assemblyItemIndex

}

);

};

auto assembleInstruction = [](LinkerObject const& _linkerObject, Instruction _opcode, bytes const& _immediates, size_t assemblyItemIndex) {

_linkerObject.bytecode += static_cast<uint8_t>(_opcode);

_linkerObject.bytecode += _immediates;

_linkerObject.codeSectionLocations[0].instructionLocations.push_back(

LinkerObject::InstructionLocation{

.start = ret.bytecode.size(),

.end = ret.bytecode.size() 1 + _immediates.size(),

.assemblyItemIndex = assemblyItemIndex

}

);

_linkerObject.codeSectionLocations[0].end += 1 + _immediates.size();

};

I.e. not depending on any external state, just arguments and clearly separating the bytecode from other stuff.

I'd not even make it a lambda - we'll need it in assembleEOF() later, so it should be a normal function, maybe even a class member.

I'd even consider turning it into a class doing what it does currently as side-effect :-). But I'd say we can also wait until we inevitably port it to assemblyEOF to nicen it up.

Well, it would be good to at least make it take bytecode, not a callback.

But yeah, it's not like it's a blocker here. It would be very nice to have, but I approved already because at this point these are not things critical to the functionality.

But yeah, for the record: this is how I'd have done this: develop...ethdebug_instructions_and_source_ranges_refactor
Which includes doing it for EOF (which is just two lines in that version).
Just to keep anything ethdebug-related that's non-critical as separate from the rest as possible (I'd move the helper classes out of the file).
The only uglyiness there is the AssignImmutable case, which unfortunately is irregular compared to all other cases, but it concentrates that uglyness there. But yeah, we can see if we do something like that or some other solution after the release, maybe there's something even better.

libevmasm/Ethdebug.cpp

test/cmdlineTests.sh

libevmasm/LinkerObject.h

cameel

Actually, since all of this is non-critical I'm going to approve already. But it would still be nice if you managed to address these previous comments.

ekpyron

Also reapproving regardless of the remaining comments.

aarlt added the ethdebug label Aug 28, 2024

aarlt force-pushed the ethdebug_instructions_and_source_ranges branch 2 times, most recently from 82e0668 to 51b69fc Compare August 29, 2024 00:50

argotorg deleted a comment from stackenbotten Aug 29, 2024

aarlt force-pushed the ethdebug_instructions_and_source_ranges branch 2 times, most recently from ce130c1 to feda85c Compare August 30, 2024 12:59

aarlt force-pushed the enable_debug_info_ethdebug branch from 93749ff to 0dd51e5 Compare August 30, 2024 14:54

aarlt force-pushed the ethdebug_instructions_and_source_ranges branch from feda85c to fd1bb3a Compare August 30, 2024 15:06

aarlt force-pushed the enable_debug_info_ethdebug branch from 0dd51e5 to 55503ce Compare September 9, 2024 11:28

aarlt force-pushed the enable_debug_info_ethdebug branch 6 times, most recently from 32b21f6 to fe6a4ca Compare September 24, 2024 15:51

aarlt force-pushed the enable_debug_info_ethdebug branch from fe6a4ca to 7917ca3 Compare September 30, 2024 17:50

cameel reviewed Sep 30, 2024

View reviewed changes

libsolidity/interface/CompilerStack.h Outdated Show resolved Hide resolved

aarlt mentioned this pull request Oct 1, 2024

Enable ethdebug debug info and output selection. #15289

Merged

aarlt force-pushed the enable_debug_info_ethdebug branch 6 times, most recently from c885a10 to c4faae4 Compare October 7, 2024 10:56

aarlt force-pushed the ethdebug_instructions_and_source_ranges branch from fd1bb3a to f3205fa Compare October 7, 2024 20:32

aarlt force-pushed the enable_debug_info_ethdebug branch from c4faae4 to c9417ed Compare October 8, 2024 10:32

aarlt force-pushed the ethdebug_instructions_and_source_ranges branch from f3205fa to 3cfe207 Compare October 8, 2024 10:51

aarlt force-pushed the enable_debug_info_ethdebug branch 3 times, most recently from 863ffb2 to 4d01075 Compare October 15, 2024 11:29

aarlt force-pushed the enable_debug_info_ethdebug branch from 0b702ac to 23d9607 Compare January 27, 2025 14:47

aarlt force-pushed the ethdebug_instructions_and_source_ranges branch from df8488a to 104f6a8 Compare January 27, 2025 14:47

aarlt requested a review from clonker January 27, 2025 15:31

aarlt force-pushed the enable_debug_info_ethdebug branch from 23d9607 to 7ea985d Compare January 27, 2025 15:35

Base automatically changed from enable_debug_info_ethdebug to develop January 27, 2025 17:36

aarlt force-pushed the ethdebug_instructions_and_source_ranges branch from 104f6a8 to 8ace2ba Compare January 27, 2025 17:38

clonker reviewed Jan 28, 2025

View reviewed changes

aarlt force-pushed the ethdebug_instructions_and_source_ranges branch 2 times, most recently from 1f48910 to f129ed1 Compare January 31, 2025 19:19

ekpyron added the 🟡 PR review label label Feb 24, 2025

aarlt force-pushed the ethdebug_instructions_and_source_ranges branch 2 times, most recently from 46ca422 to ae4aa88 Compare February 26, 2025 23:30

clonker reviewed Feb 27, 2025

View reviewed changes

clonker reviewed Feb 28, 2025

View reviewed changes

clonker previously approved these changes Feb 28, 2025

View reviewed changes

cameel reviewed Mar 2, 2025

View reviewed changes

clonker reviewed Mar 5, 2025

View reviewed changes

clonker reviewed Mar 6, 2025

View reviewed changes

cameel reviewed Mar 8, 2025

View reviewed changes

test/cmdlineTests.sh Outdated Show resolved Hide resolved

ekpyron previously approved these changes Mar 10, 2025

View reviewed changes

libevmasm/Assembly.cpp Outdated Show resolved Hide resolved

libevmasm/Assembly.cpp Outdated Show resolved Hide resolved

libevmasm/Assembly.h Outdated Show resolved Hide resolved

libevmasm/Ethdebug.cpp Outdated Show resolved Hide resolved

ekpyron previously approved these changes Mar 10, 2025

View reviewed changes

ekpyron reviewed Mar 10, 2025

View reviewed changes

libevmasm/Assembly.cpp Outdated Show resolved Hide resolved

cameel reviewed Mar 10, 2025

View reviewed changes

cameel previously approved these changes Mar 10, 2025

View reviewed changes

ekpyron previously approved these changes Mar 10, 2025

View reviewed changes

Ethdebug instructions and source ranges.

f3705de

ekpyron approved these changes Mar 10, 2025

View reviewed changes

Conversation

aarlt commented Aug 28, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cameel Jan 31, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

clonker commented Feb 7, 2025

Uh oh!

veniger commented Feb 10, 2025

Uh oh!

clonker left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

cameel left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cameel Mar 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

aarlt Mar 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

aarlt commented Aug 28, 2024 •

edited

Loading

cameel Jan 31, 2025 •

edited

Loading

cameel Mar 2, 2025 •

edited

Loading

aarlt Mar 3, 2025 •

edited

Loading

aarlt Mar 7, 2025 •

edited

Loading

ekpyron left a comment •

edited

Loading

ekpyron Mar 11, 2025 •

edited

Loading