
GH-98831: Remove super-instruction definitions, use macro instructions instead #100124

Closed
wants to merge 2 commits into from

Conversation

gvanrossum
Member

@gvanrossum gvanrossum commented Dec 8, 2022

Replace all super-instructions with macros, using a special JOIN op to extract the next oparg. JOIN has a cache effect of one word that is equivalent to bumping next_instr.

Rip out all code for parsing and generating super-instructions.
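For reference, a rough sketch of how a former super-instruction might be spelled as a macro under this scheme. The `(word/1 --)` signature for JOIN is taken from the diff below; the JOIN body and the macro spelling are illustrative guesses at the intent, not the exact code from this PR:

```c
// Illustrative sketch only; body and macro syntax are approximations.
op(JOIN, (word/1 --)) {
    // The one-word "cache" entry is really the next instruction word:
    // consuming it is equivalent to bumping next_instr, and its oparg
    // becomes the oparg for the following component.
    oparg = _Py_OPARG(word);
}

// A former super-instruction expressed as a macro:
macro(LOAD_FAST__LOAD_FAST) = LOAD_FAST + JOIN + LOAD_FAST;
```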

@brandtbucher
Member

Not gonna lie, the PR title scared me.

@gvanrossum gvanrossum changed the title from "GH-98831: Remove super()" to "GH-98831: Remove super-instruction definitions, use macro instructions instead" Dec 9, 2022
@gvanrossum
Member Author

Clearly I should use more provocative PR titles more often. :-)

Comment on lines 92 to 95
#define _COMPARE_OP_FLOAT 1003
#define _COMPARE_OP_INT 1004
#define _COMPARE_OP_STR 1005
#define _JUMP_IF 1006
Member


Should these just be zeroes, too? Or even -1, just to emphasize that they aren't real?

Member Author


Yeah, they should all be zeros. Since the value is never used I'm not sure what the point of -1 would be.

TARGET(LOAD_FAST__LOAD_FAST) {
    PyObject *_tmp_1;
    PyObject *_tmp_2;
    {
Member


Random idea... any chance we could annotate the output of macros to make it easier to see what the different parts correspond to? So, like, this line would be:

Suggested change
{
{ // LOAD_FAST

And we would have { // JOIN and another { // LOAD_FAST below? Not sure how hard this is.

@@ -115,6 +117,14 @@ dummy_func(
switch (opcode) {

// BEGIN BYTECODES //

op(JOIN, (word/1 --)) {
Member


Hmmm. It feels weird to treat the next instruction as a cache entry.

Why don't all macro components generate code to reload the incremented opcodes/opargs unconditionally? That way we wouldn't need JOIN at all. Or are we worried the compiler can't optimize it out when it's unused (or that garbage values could be too confusing)? Or does it mess up something else?

Perhaps there's a clean way of expressing in the DSL whether an op corresponds to a real instruction or not? Then we can just reload the opcode/oparg if that's true, and get rid of JOIN.

Member Author


> Hmmm. It feels weird to treat the next instruction as a cache entry.

It's only weird if you think of the instruction stream as alternating instructions and cache entries. In the discussion about registers it was already proposed to use a "cache" entry to encode the operation (ADD, MUL, etc.) for BINARY_OP, so it's really just a stream of variable-length instructions. But yeah, this op is definitely full of internal hackery.

> Why don't all macro components generate code to reload the incremented opcodes/opargs unconditionally? That way we wouldn't need JOIN at all. Or are we worried the compiler can't optimize it out when it's unused (or that garbage values could be too confusing)? Or does it mess up something else?

I worry more about where the next oparg would be loaded from. The code generator makes assumptions about where next_instr points (for macros it keeps pointing just past the original instruction until the end; for supers it gets bumped for each component).

> Perhaps there's a clean way of expressing in the DSL whether an op corresponds to a real instruction or not? Then we can just reload the opcode/oparg if that's true, and get rid of JOIN.

That's not a bad idea -- and we actually have that distinction already: inst vs. op. The main downside would be that there would no longer be a clue in the macro itself whether it defines a super or not.
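The alternative floated above -- reload the oparg before each component that is a real instruction (inst) rather than a plain op, instead of using an explicit JOIN -- can be sketched in Python. Component, is_inst, and emit_macro are hypothetical names for illustration, not the actual generator's API:

```python
# Hypothetical sketch of the "reload before real instructions" idea;
# not the actual cases_generator code.
from dataclasses import dataclass

@dataclass
class Component:
    name: str
    is_inst: bool  # True for `inst` (a real instruction), False for `op`

def emit_macro(components):
    """Emit pseudo-C lines for a macro body, reloading the oparg before
    every real-instruction component after the first, so no JOIN op is
    needed to advance past the embedded instruction word."""
    lines = []
    for i, c in enumerate(components):
        if c.is_inst and i > 0:
            lines.append("oparg = _Py_OPARG(*next_instr++);")
        lines.append(f"// body of {c.name}")
    return lines

# A two-component super like LOAD_FAST__LOAD_FAST would expand to the
# first body, one oparg reload, then the second body:
print(emit_macro([Component("LOAD_FAST", True), Component("LOAD_FAST", True)]))
```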

Comment on lines -229 to -234
if (tkn := self.expect(lx.IDENTIFIER)) and tkn.text == "super":
    if self.expect(lx.LPAREN):
        if tkn := self.expect(lx.IDENTIFIER):
            if self.expect(lx.RPAREN):
                if self.expect(lx.EQUALS):
                    if ops := self.ops():
Member


Kill it with fire! ;)

@python python deleted a comment from netlify bot Dec 9, 2022
Member Author

@gvanrossum gvanrossum left a comment


I'm feeling a lot of hesitation about this PR. Maybe we should just leave things as they were and focus on other work (e.g. finish converting more instructions to the explicit stack/cache effects form, and adding arrays).


@gvanrossum
Member Author

Off-line we decided not to do this. Super-instructions may eventually disappear (when we have a register VM).

@gvanrossum gvanrossum closed this Dec 12, 2022
@gvanrossum gvanrossum deleted the remove-supers branch June 13, 2023 20:38