Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanup - remove unused parameter with magic numbers #22871

Open
wants to merge 2 commits into
base: blead
Choose a base branch
from

Conversation

happy-barney
Copy link

NewOp's first parameter is unused since 2007, polluting codebase with magic numbers.

PR is adding new macro NewOp_v542 without this unused parameter.

Macro name is also POC of adding Perl version when macro will be added to maintain reasonable
backward compatibility by version string in symbol name.

@iabyn
Copy link
Contributor

iabyn commented Dec 23, 2024 via email

@happy-barney
Copy link
Author

happy-barney commented Dec 23, 2024

I can't see what practical gain you get by including the version number in the new macro. It just seems to make it worse.

Symbol name is describes behaviour. With new version behavior changed (3 parameters instead of 4).

It's same like with humans. Look for example at French kings, there were for example:

  • Louis XIV
  • Louis XVI

They have version in names as well, so you can distinguish one born in 17th century and another in 18th.

Same is for this pattern - you can introduce new symbol, new behaviour and still provide old one so there is no need to modify every old code simultaneously.

@iabyn
Copy link
Contributor

iabyn commented Dec 23, 2024 via email

@happy-barney
Copy link
Author

But the version number tells you nothing about the behaviour.

Well, difference is that I'm using release numbers and not version per symbol.
There it tells you one important thing - when it was introduced. Rest is matter of documentation.

I'm not using version to provide different behaviour with of same symbol. It will still be NewOp behaviour
with two implementations:

  • as implemented before v5.42
  • as implemented in v5.42

It may not tell much about actual behaviour, but it tells which version of behaviour and it ensures that it will be preserved.

Following this pattern ensures 100% forward compatibility (eg: no need to recompile XS with each release).

It will even (with some tweaks) allow to propagate newer grammar into older versions - this is root assumption behind idea of exact use VERSION - https://github.com/happy-barney/perl-wish-list/blob/master/exact-use-version/spec/spec.md

Branislav Zahradník added 2 commits December 24, 2024 13:13
NewOp's first argument has been unused since 2007 but remains required,
polluting the codebase with magic numbers.

A new macro is provided that eliminates these magic numbers while maintaining
backward/forward compatibility through Perl version identification in its name.
First parameter of NewOpSz is unused since 2007.
@happy-barney happy-barney force-pushed the hpb/unused-magic-numbers branch from 0032ed0 to 63ab1d4 Compare December 24, 2024 12:13
@iabyn
Copy link
Contributor

iabyn commented Dec 27, 2024 via email

@happy-barney
Copy link
Author

On Mon, Dec 23, 2024 at 05:03:49AM -0800, Branislav Zahradník wrote: Following this pattern ensures 100% forward compatibility (eg: no need to recompile XS with each release).
The big problem with XS binary compatibility is data structure definitions and alignments, I don't see how a new function naming scheme is going to avoid that,

structs and typedefs can be versioned as well.
For example:

  • struct mgvtbl_v510
  • struct mgvtbl_v520
  • struct mgvtbl_v530

including versioniong for each function using them.

such structures should be opaque and provide versioned accessors.

It will even (with some tweaks) allow to propagate newer grammar into older versions - this is root assumption behind idea of exact use VERSION - https://github.com/happy-barney/perl-wish-list/blob/master/exact-use-version/spec/spec.md
Like 99% of your proposals, I have no real idea of what you're proposing, but a quick look at that file sounds like something insanely complex for little gain,.

Why do you think it is insanely complex?

Gain is lowering TCO of codebase. For some it is few cents for others it may be millions.

Another gain is that it will be possible to build compatibility layers so you will be for example able to install "use v5.50" library on your "use v5.42" box. And vice versa, you will be able to use old syntax in new versions - ie, you can introduce incompatible changes without loosing capability to run older code.


-- Any [programming] language that doesn't occasionally surprise the novice will pay for it by continually surprising the expert. -- Larry Wall

@iabyn
Copy link
Contributor

iabyn commented Dec 30, 2024 via email

@happy-barney
Copy link
Author

re maintain struct:

no, there will be no structure maintenance, since structures are version locked and should not be changed.
Also, new structure and functions should be needed only when struct changes.

Structures can be even generated.

re access functions

this is little bit tricky. For example for mgvtbl workflow:

  • there should be registration functions per version returning opaque accessor
struct mgvtbl_internal * mgvtbl_registry_v510 (struct mgvtbl_v510 *);
struct mgvtbl_internal * mgvtbl_registry_v520 (struct mgvtbl_v520 *);
struct mgvtbl_internal * mgvtbl_registry_v530 (struct mgvtbl_v530 *);

such functions mey look like :

struct mgvtbl_internal * mgvtbl_registry_v510 (struct mgvtbl_v510 *data) {
   struct mgvtbl_v520 data_v520 = {
     .svt_get = data->svt_get,
     .svt_set = data->svt_set,
     .svt_added_in_520 = NULL
   };

  return mgvtbl_registry_v520 (&data_v520);
}
  • accessor functions, as long as they uses struct mgvtbl_internal * and returns same value, they don't need to change at all (mostly)

re advantage:

once symbol (C-API) is defined, it remains constant. So when you change behaviour (eg: introduces struct mgvtbl_v550) every XS will be compatible even if new structure is not aligned same way as old structure(s).

@bulk88
Copy link
Contributor

bulk88 commented Jan 1, 2025

re maintain struct:

no, there will be no structure maintenance, since structures are version locked and should not be changed. Also, new structure and functions should be needed only when struct changes.

Every not-FOSS commercial platform I've worked with, just makes the ''first'' field a length counter of some struct between app-core and customer-plugin-sh-lib. That struct can grow bigger for decades, any new-ext can register on old-core, old-ext can register on new-core.

Abstracting your code design out to tons of getter/setter methods everywhere, no pass by copy C/C++ args, no malloc blocks, no structs,, or assuming the compilers can escape analysis the glibc and perl binaries, as if its a high level Java or Web JS VM is very inaccurate.

But this thing might confuse people nowadays https://webperl.zero-g.net/ :-)

C/C++ don't really fold things backed by RO sh lib mem, after the first & operator gets that symbol's address, even with LTO is scared mprotect() will rewrite it and break the C VM spec. #define 0x1234 or static inline lto, is the only ways to const fold away, or de-refing memory, and making sure there are no fn calls for alot of lines of code, so the CC doesn't keep re-reading global (malloc/elf/dll) storage between each fn call.

Structures can be even generated.

re access functions

this is little bit tricky. For example for mgvtbl workflow:

  • there should be registration functions per version returning opaque accessor

can't optimize that, it will have multi eval problems

struct mgvtbl_internal * mgvtbl_registry_v510 (struct mgvtbl_v510 *);
struct mgvtbl_internal * mgvtbl_registry_v520 (struct mgvtbl_v520 *);
struct mgvtbl_internal * mgvtbl_registry_v530 (struct mgvtbl_v530 *);

why do we need bug compatibility on CPAN year-by-year? what happens after 10 years? thats alot of structs.

such functions mey look like :

struct mgvtbl_internal * mgvtbl_registry_v510 (struct mgvtbl_v510 *data) {
   struct mgvtbl_v520 data_v520 = {
     .svt_get = data->svt_get,
     .svt_set = data->svt_set,
     .svt_added_in_520 = NULL
   };
  return mgvtbl_registry_v520 (&data_v520);
}

impl/code/tokens look okay to me, ive seen this pattern in other C apps,

  • accessor functions, as long as they uses struct mgvtbl_internal * and returns same value, they don't need to change at all (mostly)

C is not Perl/Py/JS/Rust. C doesn't have JIT or mark & sweeping , user mode VM pages based on last CPU R/W time. Or tracing tilt bits.

C abstract virtual machine, doesn't allow function pointers, to ever optimize out. CC LTO can't go between 2 ELFs/DLLs/EXEs at CC time. And the CC doesnt exist at runtime to fold anything or rewrite machine code. https://metacpan.org/pod/FFI::TinyCC and RPerl https://metacpan.org/pod/C::TinyCompiler exist for runtime mach code,

  • the mgvtbl_registry_v520 looks too much, umm,
  • "Fortune 100s with premier product lines started in 1980s",
  • if 16 byte GUIDs are added to the register() func call prototype, this API will have decades of lifetime.... not sarcasm, but perl doesn't need IDLs or XML or JSON, so struct member U32 size or c func arg constant U32 0x05410100 is good enough and better than separate c func names

re advantage:

once symbol (C-API) is defined, it remains constant. So when you change behaviour (eg: introduces struct mgvtbl_v550) every XS will be compatible even if new structure is not aligned same way as old structure(s).

This was tried before

perl5/perlapi.h

Line 120 in 3b8a46d

#define PL_Argv (*Perl_IArgv_ptr(aTHX))

It was toxic. Thank goodness its in the cemetery. I used to -DPERL_CORE my Config.pm to get rid of that 5.8-era ABI-forever vtables for all my CPAN XS modules I installed. Remember a CC can't optimize out a function pointer, and with Perl CPP macros, you now have 25-100 function calls per line of C code after expansion, b/c multiple eval.

If anyone wants to see this type of execution, single step in C dbg, in disassembly view, any xsub from APITest.so/.dll.

Wrapping SvIV()/SvPV/PUSH() and friends into another XS library and "double buffering" a ABI forever XS middle module that is recompiled yearly, with "Shimed CPAN" that last 10 years makes more sense. Its probably on cpan already.

Another question, why is there perl XS code that CANT be recompiled? If FOSS/CPAN, obviously it can be recompiled. Private/biz XS code, I assume the dev/sys admin has the source code. I believe all Perl OSes have free-ish CCs now. If its over 30 mins to recomp core and the .dlls, its an EUMM optimization problem, not XS/type system problem.

Im thinking of closed source XS binaries that are "trapped" on a legacy server/desktop/legacy perl, and a person is trying to extend the life of that server/software stack, no source, no author, no support.

@happy-barney
Copy link
Author

happy-barney commented Jan 1, 2025 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants