towards less macros in the C/C++ codegen output #21806

heterodoxic · 2023-05-07T08:29:03Z

While working on clearly far more important C/C++ codegen issues, I kept wondering if we could not simply emit C's and C++'s "native" types in the cases of NIM_BOOL and NIM_CHAR nowadays, instead of using macros/redundant typedefs possibly obscuring semantics for the C/C++ code interpreter (be it a person or an editor/IDE).

I did not encounter any problems swapping NIM_CHAR for char with both C and C++, but I might be lacking foresight here.

For NIM_BOOL → bool the story is a bit different: I suggest a slight modification to what was accomplished here #13798: we define bool, true/false macros if C89 support is actually (still?!) wanted. In most other cases, when C code is to be output, why not rely on the ISO C way of having stdbool.h take care of what true and false actually represent since C99 (see https://github.com/gcc-mirror/gcc/blob/master/gcc/ginclude/stdbool.h for reference)?

juancarlospaco · 2023-05-07T20:12:58Z

C99 is interesting as it allows optimizations like -fmerge-all-constants, -fsingle-precision-constant, etc.

arnetheduck · 2023-05-07T20:38:13Z

The macros help port Nim to more unusual C compilers - since one of the core features of Nim is to be able to do this, it seems counterproductive to remove - the C code is after all not really meant to be read by humans.

heterodoxic · 2023-05-08T04:30:43Z

C99 is interesting as it allows optimizations like -fmerge-all-constants, -fsingle-precision-constant, etc.

Yes, and this PR applies no changes to C89/C99 compatibility, it merely tries to make clear: we're emitting a standard(-ish, see #21806 (comment)) C/C++ bool and char, and in the edge case of using pre-C99: bool remains an unsigned char ~~macro~~typedef.

heterodoxic · 2023-05-08T04:49:56Z

The macros help port Nim to more unusual C compilers - since one of the core features of Nim is to be able to do this, it seems counterproductive to remove - the C code is after all not really meant to be read by humans.

I absolutely agree with the first part of your statement, and this is why the macros/typedefs remain where still needed, but are more transparently resolved to standard types of the respective backend where achievable.

Regarding your latter assessment: as long as Nim remains based on translating its semantics to another language's semantics, emitting its own constructs to either C, C++, or JS code, I'd say there's still a value in trying to achieve a readability by humans (and C/C++ code editors/IDEs still stumble over too many macros!) where it seems possible: bugs may occur on Nim language level, they might also happen in the process of expressing Nim in the backend language. They might occur in the interface of those semantic realms as well. Having a debuggability, for which I claim retaining readability is one aspect of it, is essential for all those layers. And given Nim's relatively comfortable C/C++ interop, problems may arise all over the code base: why not help us spot those places more easily, if it seems feasible enough?

Araq · 2023-05-08T10:50:30Z

Does the C standard guarantee that sizeof(bool) == 1, cause Nim does guarantee that, sizeof(bool) == sizeof(byte).

heterodoxic · 2023-05-08T11:36:38Z

Does the C standard guarantee that sizeof(bool) == 1, cause Nim does guarantee that, sizeof(bool) == sizeof(byte).

Good point, in fact, it does not, even for C99+ AFAIK: if you look up p. 33 in this ISO C99 draft it's stated that

The type _Bool and the unsigned integer types that correspond to the standard signed integer types are the standard unsigned integer types.

So C99+'s bool->_Bool should resolve to an unsigned integer. You can actually find it being used that way (e.g. https://stackoverflow.com/questions/1921539/using-boolean-values-in-c).

By compiler-specific implementation though, _Bool will regularly take the space of a byte (just fired up Compiler Explorer and went to several oldish GCC/Clang/MSVC instances checking this...). So I keep the way it's already been for Nim: resorting to the compiler's probable _Bool implementation as a byte, and ensuring bool is represented as a byte/unsigned char, if necessary by an explicit typedef for pre-C99.

juancarlospaco · 2023-05-09T19:07:27Z

Which C compiler can not compile stdbool.h ?, even TCC take it. 🤔

heterodoxic · 2023-05-10T09:04:35Z

Which C compiler can not compile stdbool.h ?, even TCC take it. 🤔

It's about the existence of stdbool.h, which came to life with C99, AFAIK. Frankly, I wouldn't know how relevant C89 support is nowadays (e.g. highlighting this info snippet, but it's from 6 years ago), without conducting a thorough investigation. I am merely adhering to this stance #13798 (comment) as there seems to be no harm in still supporting it (as long as we enforce bool to be 1 byte), and it might actually still be needed for some compilers, as @arnetheduck pointed out.

But all of this is beyond the scope of my PR, I'm not making any pervasive changes in this regard, the point is simply this: resolving

for C/C++ codegen that NIM_CHAR is actually always a C/C++'s char (1 byte), as specified by the respective standards (encompassing C, C++, and Nim)
for C89+/C++ codegen that NIM_BOOL is really a 1 byte bool, as is required by Nim (towards less macros in the C/C++ codegen output #21806 (comment))

heterodoxic · 2023-05-11T09:07:26Z

To put it crudely: this PR is about emitting less bytes for the C/C++ codegen, while offering more precise semantics there, and still retaining C89+/C++ backwards compatibility. 😃

Araq · 2023-05-27T05:43:11Z

lib/nimbase.h

+#  if (defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901)
+#    include <stdbool.h>
+#  else
+#    define bool  unsigned char


This is confusingly not using the C bool header. I know that I told you not to use it (and you really should not), but NIM_TRUE and NIM_FALSE are not misleading to the casual reader. So let's keep them.

First of all: thanks for having another look at this!

If we still want to support C89, we can't assume the presence of the bool header, which is part of the ISO C99 spec, unfortunately (cf. https://en.wikibooks.org/wiki/C_Programming/stdbool.h, https://stackoverflow.com/a/1921557, https://stackoverflow.com/a/1608350 for a cursory glance).

The strategy taken by C89 programmers to work with some sort of fake bool type seems to be the one I take: typedefing bool, and defining true, and false as macros.

For a pretty pedantic C89 support, our current definitions are broken anyway, see https://github.com/nim-lang/Nim/blob/devel/lib/nimbase.h#L305-L321: we define NIM_TRUE/NIM_FALSE as true/false, but they don't exist in "pure" C89 (I guess merely as Compiler extensions/hacks).

Given that, do we actually still want to retain C89 support, and do you still prefer keeping NIM_TRUE/NIM_FALSE?

true is C's true which maps to _Bool or whatever it's called which is not necessarily of size 1. Hence we avoid _Bool and then our own true would be confusing.

So I vote for NIM_TRUE/NIM_FALSE regardless of our increasingly broken C89 support.

Araq · 2023-06-11T19:15:43Z

There is nothing in the diff left that I consider an improvement: NIM_FALSE and NIM_TRUE must remain and NIMCHAR could eventually be mapped to char8_t.

towards less macros in the C/C++ codegen output

65872ea

heterodoxic force-pushed the devel branch from 6cb7cb3 to 774c08b Compare May 7, 2023 09:36

update test

7e39327

heterodoxic force-pushed the devel branch from 3ed925e to 7e39327 Compare May 7, 2023 18:07

heterodoxic mentioned this pull request May 26, 2023

Clean nimbase #21927

Merged

Araq reviewed May 27, 2023

View reviewed changes

Araq closed this Jun 11, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

towards less macros in the C/C++ codegen output #21806

towards less macros in the C/C++ codegen output #21806

heterodoxic commented May 7, 2023 •

edited

Loading

juancarlospaco commented May 7, 2023

arnetheduck commented May 7, 2023

heterodoxic commented May 8, 2023 •

edited

Loading

heterodoxic commented May 8, 2023 •

edited

Loading

Araq commented May 8, 2023

heterodoxic commented May 8, 2023 •

edited

Loading

juancarlospaco commented May 9, 2023

heterodoxic commented May 10, 2023 •

edited

Loading

heterodoxic commented May 11, 2023

Araq May 27, 2023 •

edited

Loading

heterodoxic May 27, 2023 •

edited

Loading

Araq May 27, 2023

Araq May 27, 2023

Araq commented Jun 11, 2023

towards less macros in the C/C++ codegen output #21806

towards less macros in the C/C++ codegen output #21806

Conversation

heterodoxic commented May 7, 2023 • edited Loading

juancarlospaco commented May 7, 2023

arnetheduck commented May 7, 2023

heterodoxic commented May 8, 2023 • edited Loading

heterodoxic commented May 8, 2023 • edited Loading

Araq commented May 8, 2023

heterodoxic commented May 8, 2023 • edited Loading

juancarlospaco commented May 9, 2023

heterodoxic commented May 10, 2023 • edited Loading

heterodoxic commented May 11, 2023

Araq May 27, 2023 • edited Loading

Choose a reason for hiding this comment

heterodoxic May 27, 2023 • edited Loading

Choose a reason for hiding this comment

Araq May 27, 2023

Choose a reason for hiding this comment

Araq May 27, 2023

Choose a reason for hiding this comment

Araq commented Jun 11, 2023

heterodoxic commented May 7, 2023 •

edited

Loading

heterodoxic commented May 8, 2023 •

edited

Loading

heterodoxic commented May 8, 2023 •

edited

Loading

heterodoxic commented May 8, 2023 •

edited

Loading

heterodoxic commented May 10, 2023 •

edited

Loading

Araq May 27, 2023 •

edited

Loading

heterodoxic May 27, 2023 •

edited

Loading