Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A batch of improvements for the '__emit' operator #421

Merged
merged 21 commits into from
Jun 15, 2019

Conversation

Daniel-Cortez
Copy link
Contributor

@Daniel-Cortez Daniel-Cortez commented May 4, 2019

What this PR does / why we need it:

  • Added 'universal' pseudo-opcodes, designed to be primarily used in macros.
    load.u.pri/alt - load value (may be an expression) into PRI/ALT.
    stor.u.pri/alt - store value in PRI/ALT into the operand.
    addr.u.pri/alt - load operand address into PRI/ALT.
    push.u - push value (may be an expression) into the stack.
    push.adr.u - obtain operand address and push it into the stack.
    zero.u - write value 0 into the operand.
    inc.u - increase the value stored in the operand by 1.
    dec.u - decrease the value stored in the operand by 1.

    Example:

#define getval(%0) __emit(load.u.pri %0)
#define addrof(%0) __emit(addr.u.pri %0)

const global_const = 0x1234;
new global_var = 0x5678;

main()
{
	const local_const = 0x4321;
	new local_var = 0x8765;
	static local_static_var = 0x9ABC;
	static local_static_array[2];

	printf("%08x", getval(global_const)); // 00001234
	printf("%08x", getval(global_var)); // 00005678
	printf("%08x", getval(local_const)); // 00004321
	printf("%08x", getval(local_var)); // 00008765
	printf("%08x", getval(local_static_var + global_const)); // 0000ACF0
	printf("%08x", addrof(global_var)); // 00000000
	printf("%08x", addrof(local_var)); // 00004094
	printf("%08x", addrof(local_static_var)); // 00000004
	printf("%08x", addrof(local_static_array[1])); // 0000000C
}
  • The compiler now issues an error if the stack offset/data address isn't a multiple of cell size.
    NOTE: For this purpose I reused error 011 which previously wasn't used anywhere.
__emit load.s.pri 5; // error 011: stack offset/data address must be a multiple of cell size
  • Fixed crash (Crash if unused emit native #412) when referencing a previously unused native function.

  • Removed index check for operands of lctrl and sctrl.

  • Opcodes stor.s.pri/alt, inc.s, dec.s, and zero.s don't accept variables and arrays passed by reference anymore.

  • Allowed negative offsets for arguments of type 'local variable' (Operator emit negative values #413).

  • Added tests.
    NOTE: Since the compiler can only display up to 26 errors, I had to split the tests into 7 different *.pwn files (+ the extra one to check the generated bytecode).

  • Fixed a lot of minor bugs discovered with the newly added tests.

Which issue(s) this PR fixes:

Closes #346
Fixes #412
Fixes #413

What kind of pull this is:

  • A Bug Fix
  • A New Feature
  • Some repository meta (documentation, etc)
  • Other

Additional Documentation:

… arguments) to 'stor.s.pri/alt', 'inc.s', 'dec.s' and 'zero.s'
… sign

Example:
  L1:
  __emit const.pri -L1;
Before:
  error 001: expected token: "-any value-", but found "-L1"
After:
  error 001: expected token: "-any value-", but found "-(-label-)"
@Daniel-Cortez Daniel-Cortez requested a review from a team as a code owner May 4, 2019 18:37
@Daniel-Cortez
Copy link
Contributor Author

There are a few things I'm still not sure about:

  • Array handling is only done for pseudo-opcodes load.u.pri/alt and push.u - this is because for these opcodes I reused the standard expression parser (function expression() from sc3.c).
    For the other pseudo-instructions the operand is not an expression but an lvalue, so it's required to know its address, and I couldn't find any existing functionality for this task in sc3.c.
    As a temporary solution I used function expression() for those opcodes as well: if the user specifies a variable or a reference (iVARIABLE/iREFERENCE) the function returns the corresponding symbol *, and it's possible to know its address (sym->addr), but if the user specifies an array cell (iARRAYCELL; e.g. arr[0]), there seems to be no way to obtain its address.
    I have no idea how to implement array handling for those opcodes in a good way, without reinventing the wheel, so any help or suggestions for this would be appreciated.

  • Currently instructions casetbl and case can be used in single-instruction __emit statements and in expressions, but it's easy to make the compiler generate invalid code:

__emit casetbl 2 lbl_default;
__emit case 0 lbl_case0;
new x = 0; // This line will become a part of the case table, which is obviously wrong
__emit case 1 lbl_case1;

I think it would be better to only allow the use of those instructions in block mode, like this:

__emit
{
	casetbl 2 :lbl_default
	case 0 :lbl_case0
	case 1 :lbl_case1
lbl_default:
	// ...
lbl_case0:
	// ...
lbl_case1:
	// ...
}

That way it would be impossible to put any other code inbetween case table entries, but this would require adding a new error and I'm not sure how to phrase such error (not a native English speaker) - any help/suggestions on how to properly phrase it would be nice.

If and when this is done, it would be possible to improve control over the case entries:

  • Check if the number of entries matches the number specified in the casetbl instruction.
  • Make sure the entries in the table are sorted by value (in the Implementer's Guide it's said they must be sorted so the Abstract Machine could use binary search on them).

@Daniel-Cortez
Copy link
Contributor Author

Daniel-Cortez commented May 28, 2019

Ok, I reimplemented the pseudo-opcodes, now all of them can handle array access.
The only exception is that array characters (e.g. array{n}) are not allowed in push.u.adr: I made it that way, because normally array characters can't be passed to functions by reference.

I still need help with formulating the error messages in order to improve control over case table entries made with __emit (see the previous post).
But if that feature isn't needed much, then I suppose this PR should be ready to be merged.

Apparently shifting a 32-bit signed value by 31 bits is UB...
@Zeex Zeex merged commit 8561b38 into pawn-lang:dev Jun 15, 2019
@Zeex
Copy link
Contributor

Zeex commented Jun 15, 2019

Looks good 👍 though I'm not well versed in the new __emit stuff

@Daniel-Cortez Daniel-Cortez deleted the emit-3 branch June 15, 2019 22:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants