Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added bounds checking on pinMode #302

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

BarryPSmith
Copy link

Currently if an invalid pin is passed to pinMode, it'll happily read beyond the ends of its arrays and corrupt a random piece of memory. This changes it so that it returns harmlessly in such a circumstance.

Stack corruption can be a nightmare to chase down.

@matthijskooijman
Copy link
Collaborator

Change looks good, but I wonder if this might be better to fix in digitalPinToPort (and maybe also digitalPinToBitMask)? Then you prevent the out-of-bounds read alltogether for all users of those macros, rather than just pinMode (e.g. also digitalWrite, but also sketches that use the macro)?

@BarryPSmith
Copy link
Author

Fair idea.

Since they them become more complex to include bounds checking, how about simultaneously changing the macros to inline functions?

@matthijskooijman
Copy link
Collaborator

Since they them become more complex to include bounds checking, how about simultaneously changing the macros to inline functions?

I am always happy to replace macros with proper functions where possible :-)

One caveat is that code that does #ifdef digitalPinToPort would break, but:

  • Google only shows only 5 mentions of this (here and here, no mentions of digitalPinToBitmask.
  • If needed, compatibility can be preserved using #define digitalPinToPort digitalPinToPort (but I think that won't really be needed here.

@BarryPSmith
Copy link
Author

Ok. Implemented it in the more upstream functions.

This required a change to pins_arduino.h too, since the arrays are extern and so sizeof doesn't work from arduino.h where the inline function is defined. With -flto, the linker optimises away the const digital_pin_count (at least I think it does: looking through the output of avr-readelf I couldn't find it).

If that's too much change, I'll can switch them back to multi-line macros.

// bounds checking on these functions because they return a pointer to an address that will be modified.
// An out of bounds write can result in stack corruption and fun to track-down errors.
//
static inline const uint8_t digitalPinToPort(const uint8_t P) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for using static and inline at the same time?

If a function is inlined, then there should be no symbol for it, so there should be no link time issue. Am I missing something?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inline is only a hint, the compiler can choose to ignore it and often does for large enough functions. If the static was not there, I believe the linker will (may?) complain about duplicate definitions of digitalPinToPort(). Not 100% sure, I'm not in front of a computer where I can check this. With link-time-optimization (--lto), I think any duplicate definitions will be optimized out.

Interestingly enough, it is the first const keyword in that line that is unnecessary, since uint8_t is returned by-value. The compiler is probably printing a warning, which will show up if warnings are turned on.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inline is only a hint, the compiler can choose to ignore it and often does for large enough functions [..]

I know inline is just a hint, but seems the story is complicated by the fact there were implementations before standardization:

http://www.greenend.org.uk/rjk/tech/inline.html

After reading this, and after a recollection from some other discussion I had some time ago, to have a single stand-alone copy, according to the standard, there needs to be 1 and only 1 declaration of the inline function in a .c file, while the definition can be in .h.

Given the usage here, i guess static inline is the safest and simplest choice for this code.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, C++ handles the way to generate the inline standalone copy differently than C (definition/declaration rules are different).

I think I saw an explanation in some youtube video, but I can't recall which one.

Copy link
Author

@BarryPSmith BarryPSmith Dec 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Originally I tried it with just inline, but I got linker errors complaining about unresolved symbols. I did some searching and it seemed static inline was the most similar to what one expects a macro to do (which is what this function is replacing) - i.e. only work if the definition is included in the current compilation unit.

(Side note: I'm very surprised about the existing comment in the code that macros are faster than inline functions. Perhaps -Os is making the compiler ignore the inline hint? Seems strange, in my projects the compiler inlines half of my larger functions straight int main.)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, inline is not so much a hint to inline, but it changes the linkage of the symbol, telling the linker to not complain if multiple definitions are found (just drop all but one and use that one). This is relevant since when an inline function is not not inlined every time, a definition is also emitted into the .o file. Effectively, inline should be used whenever defining a function in a header file. Defining it in the header allows inlining, while the inline keyword prevents linker errors. Also note that with link-time-optimization, also functions in .cpp files can get inlined.

When using static, I believe that means that each .o file will get is own private definition of the function, which automatically will not conflict and not produce multiple defintion errors (so adding inline is effectively pointless here). This also means that if the function is not inlined in multiple .o files, each of these will get their own private, distinct definition of the function and you will end up with multiple definitions of the same function.

So, just inline rather than static inline is better, since then any duplicate definitions gets merged.

Originally I tried it with just inline, but I got linker errors complaining about unresolved symbols.

That seems weird, maybe we should investigate and solve those?

Side note: I'm very surprised about the existing comment in the code that macros are faster than inline functions. Perhaps -Os is making the compiler ignore the inline hint? Seems strange, in my projects the compiler inlines half of my larger functions straight int main.

Well, macros are necessarily inlined, while inline functions might not be if the compiler thinks this is better. If you really want to ensure inlining, you can use the always_inline attribute (but often it is better to let the compiler decide).

After reading this, and after a recollection from some other discussion I had some time ago, to have a single stand-alone copy, according to the standard, there needs to be 1 and only 1 declaration of the inline function in a .c file, while the definition can be in .h.

This seems all wrong. A declaration has no function body and is usually in a .h file (since it can be included from multiple .c/.cpp files without conflict), while a definition has a function body and is typically in the .c/.cpp file to prevent multiple definition errors (unless the function is inline, then it can appear in a .h file and be included from multiple .c/.cpp files).

BTW, C++ handles the way to generate the inline standalone copy differently than C (definition/declaration rules are different).

I was not aware of any such differences (but that might be me). There is a difference in calling convention for all functions (C++ mangles function names to include the argument types, to allow argument overloading), maybe that is what you recall?

Interestingly enough, it is the first const keyword in that line that is unnecessary, since uint8_t is returned by-value.

Indeed, that is a pointless const.

Copy link
Author

@BarryPSmith BarryPSmith Dec 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, inline is not so much a hint to inline, but it changes the linkage of the symbol, telling the linker to not complain if multiple definitions are found (just drop all but one and use that one). This is relevant since when an inline function is not not inlined every time, a definition is also emitted into the .o file. Effectively, inline should be used whenever defining a function in a header file. Defining it in the header allows inlining, while the inline keyword prevents linker errors. Also note that with link-time-optimization, also functions in .cpp files can get inlined.

There is information that indicates that's not how gcc behaves under std=gnu11 (what the arduino build environment runs). See https://blahg.josefsipek.net/?p=529 . Look at the row 'inline' column 'Emit (C99)' in the table 5 paragraphs down. Do I misread the blog post, or is it wrong?

When using static, I believe that means that each .o file will get is own private definition of the function, which automatically will not conflict and not produce multiple defintion errors (so adding inline is effectively pointless here). This also means that if the function is not inlined in multiple .o files, each of these will get their own private, distinct definition of the function and you will end up with multiple definitions of the same function.

So, just inline rather than static inline is better, since then any duplicate definitions gets merged.

Originally I tried it with just inline, but I got linker errors complaining about unresolved symbols.

That seems weird, maybe we should investigate and solve those?

What are the parameters you would like to investigate? What is not solved? What behaviour do you believe exists that you would like to be different?

Here's how I understand it:

  • If it's only inline and the compiler decides not to inline the function, then it generates a linker error. I find it strange that the compiler would choose not to inline this function (bounds check + array lookup seems like even if optimising for size it's less code space than a function call). Perhaps there's something with LTO also happening here that the linker error occurs?
  • If it's inline static and the compiler can inline it everywhere, then no symbol is emitted. If it cannot inline it everywhere, then a symbol is emitted. I hope that LTO takes care to avoid multiple symbols, but I'm not sure - is this what you would like to investigate?
  • If it's inline in the header with a extern inline in a .c file, then there must be a symbol emitted. Does LTO have the option to not include that symbol? I don't know that either.

If my understanding is confused, I would appreciate clarification. There does seem to be some conflicting information around given the various inline behaviours.

A final option would be to get rid of inline entirely, and make it a normal function and hope that LTO inlines it wherever its used.

Also: If you remove the static, does it compile without problems for you? Seems a simple test; if it does then I'll try figure out why it won't compile for me.

After reading this, and after a recollection from some other discussion I had some time ago, to have a single stand-alone copy, according to the standard, there needs to be 1 and only 1 declaration of the inline function in a .c file, while the definition can be in .h.

This seems all wrong. A declaration has no function body and is usually in a .h file (since it can be included from multiple .c/.cpp files without conflict), while a definition has a function body and is typically in the .c/.cpp file to prevent multiple definition errors (unless the function is inline, then it can appear in a .h file and be included from multiple .c/.cpp files).

Here is a gnu.org article that suggests the definition should be in the header file, and the "declaration" in the c file (there were a few other sources similar, but I didn't write them down): https://www.gnu.org/software/gnulib/manual/html_node/extern-inline.html

I believe this might be what you are advocating, but if so are you certain that it still gives the linker the freedom to strip the symbol if it is inlined everywhere? Also, do you think the added complexity of having to have an extern inline declaration somewhere (I assume wiring.c would make the most sense) is worth it to save code space in the edge case where the function isn't inlined and used in multiple translation units?

Copy link
Author

@BarryPSmith BarryPSmith Dec 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are the exact linker errors if I remove static:

c:/avr-gcc-9.2.0-x64-mingw/bin/../lib/gcc/avr/9.2.0/../../../../avr/bin/ld.exe: C:\Users\BARRY_~1\AppData\Local\Temp\all.obj.5rYYTU.ltrans0.ltrans.o: in function `pinMode':
  <artificial>:(.text.pinMode+0xa): undefined reference to `digitalPinToBitMask'
  c:/avr-gcc-9.2.0-x64-mingw/bin/../lib/gcc/avr/9.2.0/../../../../avr/bin/ld.exe: <artificial>:(.text.pinMode+0x12): undefined reference to `digitalPinToPort'
  c:/avr-gcc-9.2.0-x64-mingw/bin/../lib/gcc/avr/9.2.0/../../../../avr/bin/ld.exe: C:\Users\BARRY_~1\AppData\Local\Temp\all.obj.5rYYTU.ltrans0.ltrans.o: in function `digitalRead':
  <artificial>:(.text.digitalRead+0x8): undefined reference to `digitalPinToTimer'
  c:/avr-gcc-9.2.0-x64-mingw/bin/../lib/gcc/avr/9.2.0/../../../../avr/bin/ld.exe: <artificial>:(.text.digitalRead+0x10): undefined reference to `digitalPinToBitMask'
  c:/avr-gcc-9.2.0-x64-mingw/bin/../lib/gcc/avr/9.2.0/../../../../avr/bin/ld.exe: <artificial>:(.text.digitalRead+0x18): undefined reference to `digitalPinToPort'
  c:/avr-gcc-9.2.0-x64-mingw/bin/../lib/gcc/avr/9.2.0/../../../../avr/bin/ld.exe: C:\Users\BARRY_~1\AppData\Local\Temp\all.obj.5rYYTU.ltrans0.ltrans.o: in function `digitalWrite':
  <artificial>:(.text.digitalWrite+0xc): undefined reference to `digitalPinToTimer'
  c:/avr-gcc-9.2.0-x64-mingw/bin/../lib/gcc/avr/9.2.0/../../../../avr/bin/ld.exe: <artificial>:(.text.digitalWrite+0x14): undefined reference to `digitalPinToBitMask'
  c:/avr-gcc-9.2.0-x64-mingw/bin/../lib/gcc/avr/9.2.0/../../../../avr/bin/ld.exe: <artificial>:(.text.digitalWrite+0x1c): undefined reference to `digitalPinToPort'
C:\Users\Barry\collect2.exe : error : ld returned 1 exit status

I'm not running the Arduino IDE, but use the Arduino libraries. As such, perhaps I see this error because I don't run GCC exactly the same (but I have looked at what the Arduino IDE does and tried to match it). The options I pass to GCC are:

CFLAGS= -mcall-prologues -mmcu=$(MCU) -Os -fdata-sections -ffunction-sections -fno-exceptions -flto -std=c++17 -g

It still emits the same errors if I change it to std=gnu11 (to match the Arduino IDE).

Copy link
Collaborator

@matthijskooijman matthijskooijman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from the inline comments, this change looks good to me. Adding a digital_pin_count variable seems a good way to handle the sizeof issue. This would normally prevent some optimizations from happening, but with LTO, I think the compiler will be able to optimize away the pin number check when used with constant pin numbers.

cores/arduino/Arduino.h Outdated Show resolved Hide resolved
@sauttefk
Copy link

sauttefk commented Dec 13, 2019

I oppose this change, as it increases the time needed to read or write a pin.
Many projects are relying on the existing timing and this change may break these projects!
I don't see any reason why doing a range check and returning 0 on a non-existing pin would solve any problems you have with a broken software.

variants/circuitplay32u4/pins_arduino.h Outdated Show resolved Hide resolved
cores/arduino/wiring_digital.c Show resolved Hide resolved
@BarryPSmith
Copy link
Author

BarryPSmith commented Dec 13, 2019

@sauttefk I think you will find that this doesn't slow down any properly written software. Often the pin is chosen statically, and so the compiler is able to optimise away this check as either always true or always false. See the end of this post for proof of that.

Here's how it can prevent a problem with broken software:

  • Usually, one calls pinMode early in the initialisation
  • If the parameter passed to pinMode is out of range, then digitalPinToPort will retrieve a random piece of memory
  • Then pinMode will pass that random byte to portModeToRegister and portOutputRegister, which will use it as an address and again retrieve random pieces of memory
  • The random addresses returned by portModeToRegister and portOutputRegister will be modified.
  • The function returns and the program continues operating

This issue is that often there is no obvious indication at that point that pinMode has done anything wrong. Instead, later on another part of the software uses the address that has been changed, and the program crashes. Trying to debug an issue like this is hard because you will find many useless clues: You can change a completely unrelated part of the program, and suddenly it breaks or starts working again because the random pieces of memory retrieved by pinMode are now different.

With this change, pinMode and other functions that manipulate pins are only able to change the memory they are supposed to. Given invalid arguments, they return safely. If the program is broken by passing an invalid digital pin, that will show up as just those functions not behaving as expected, instead of the entire program becoming insanely erratic.


This program:

void setup() {
  // put your setup code here, to run once:
  pinMode(5, OUTPUT);
}

void loop() {
  // put your main code here, to run repeatedly:
  static bool flag = true;
  flag = !flag;
  digitalWrite(5, flag);
}

Produces this assembly for the first part of digitalWrite (17 instructions)

void digitalWrite(uint8_t pin, uint8_t val)
{
	uint8_t timer = digitalPinToTimer(pin);
 24e:	29 ea       	ldi	r18, 0xA9	; 169
 250:	e2 2e       	mov	r14, r18
 252:	20 e0       	ldi	r18, 0x00	; 0
 254:	f2 2e       	mov	r15, r18
	
	setup();
    
	for (;;) {
		loop();
		if (serialEventRun) serialEventRun();
 256:	30 e0       	ldi	r19, 0x00	; 0
 258:	c3 2e       	mov	r12, r19
 25a:	30 e0       	ldi	r19, 0x00	; 0
 25c:	d3 2e       	mov	r13, r19
 25e:	20 91 00 01 	lds	r18, 0x0100	; 0x800100 <__data_start>
 262:	2b 25       	eor	r18, r11
 264:	20 93 00 01 	sts	0x0100, r18	; 0x800100 <__data_start>
 268:	f7 01       	movw	r30, r14
 26a:	94 91       	lpm	r25, Z
	uint8_t bit = digitalPinToBitMask(pin);
 26c:	fe 01       	movw	r30, r28
 26e:	84 91       	lpm	r24, Z
	uint8_t port = digitalPinToPort(pin);
 270:	f8 01       	movw	r30, r16
 272:	34 91       	lpm	r19, Z
	volatile uint8_t *out;

	if (port == NOT_A_PIN) return;

And with the change, produces this (17 instructions):

static inline uint8_t digitalPinToTimer(const uint8_t P) {
  if (P >= digital_pin_count) {
    return NOT_A_PIN;
  } else {
    return pgm_read_byte( digital_pin_to_timer_PGM + P );
 24e:	9f e9       	ldi	r25, 0x9F	; 159
 250:	e9 2e       	mov	r14, r25
 252:	90 e0       	ldi	r25, 0x00	; 0
 254:	f9 2e       	mov	r15, r25
	
	setup();
    
	for (;;) {
		loop();
		if (serialEventRun) serialEventRun();
 256:	20 e0       	ldi	r18, 0x00	; 0
 258:	c2 2e       	mov	r12, r18
 25a:	20 e0       	ldi	r18, 0x00	; 0
 25c:	d2 2e       	mov	r13, r18
 25e:	20 91 00 01 	lds	r18, 0x0100	; 0x800100 <__data_start>
 262:	2b 25       	eor	r18, r11
 264:	20 93 00 01 	sts	0x0100, r18	; 0x800100 <__data_start>
 268:	f7 01       	movw	r30, r14
 26a:	84 91       	lpm	r24, Z
    return pgm_read_byte( digital_pin_to_bit_mask_PGM + P );
 26c:	f8 01       	movw	r30, r16
 26e:	94 91       	lpm	r25, Z
    return pgm_read_byte( digital_pin_to_port_PGM + P );
 270:	fe 01       	movw	r30, r28
 272:	34 91       	lpm	r19, Z
	uint8_t timer = digitalPinToTimer(pin);
	uint8_t bit = digitalPinToBitMask(pin);
	uint8_t port = digitalPinToPort(pin);
	volatile uint8_t *out;

	if (port == NOT_A_PIN) return;

(In optimised assembly often some lines of code are left out, this is the case in the code posted above where it claims its the new function instead of digitalWrite).

Same instruction count, look like the same sort of instructions; I think it's fair to assume they will take the same amount of time to execute. Note how there are no compares or breaks: the check has been optimised out.

Is the exact timing of digitalWrite documented anywhere, that it cannot change? Writing code to take advantage of undocumented features is generally asking for it to break, writing it to take advantage of the exact undocumented timing of a function even more so. If someone needs accuracy within a few cycles they should be writing the registers directly (or even using inline ASM).

}
static inline const uint8_t digitalPinToBitMask(const uint8_t P) {
if (P > digital_pin_count) {
return NOT_A_PIN;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly here, it is not clear whether NOT_A_PIN is an invalid bit mask, or it's a valid mask that is returned for an invalid pin. Can you add a comment?

Copy link
Author

@BarryPSmith BarryPSmith Dec 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you reviewing the latest commit? That should be a >=.

It's following the same convention already in place in pinMode, where the returned value of digitalPinToPort is compared against NOT_A_PIN to see if the input was a pin.

Do you think that it is not obvious from the code what the situation is, that a comment is needed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was: the semantics of the NOT_A_PIN being used for multiple concepts that was confusing me.

The digitalPinToPort() function uses the digital_pin_to_port_PGM[] array. If an invalid pin is given, I expected an invalid value to be returned, which I expected to be named NOT_A_PORT_NUMBER. It seems like current valid port numbers (PA - PL) are > 0 so NOT_A_PORT_NUMBER can be 0. But it seems like only an accident that NOT_A_PORT_NUMBER is the same as NOT_A_PIN.

(The term "PORT" seems to be overloaded in pins_arduino.h. Sometimes it means "port number", sometimes it means a pointer. There is a NOT_A_PORT constant already defined, but it is used as a NULL or nullptr pointer as far as I can tell.)

The digitalPinToBitMask() function uses the digital_pin_to_bit_mask_PGM[] array. When an invalid pin is given, I expected a NOT_A_MASK value to be returned, a value that is not a valid bit mask, which I guess can be 0. It seems like it's an accident that a NOT_A_PIN is equal to NOT_A_MASK. Returning a NOT_A_PIN was confusing to me.

Similarly, the digitalPinToTimer() function uses the digital_pin_to_timer_PGM[] array. If an invalid pin is given, I expected an NOT_A_TIMER value to be returned, not a NOT_A_PIN. Again, it seems like an accident that NOT_A_PIN happens to be the same as NOT_A_TIMER. Now, there's already a NOT_ON_TIMER defined, and it's defined to be 0. So if you define NOT_A_TIMER to be 0, we can no longer distinguish between the 'NOT_A' and the 'NOT_ON'. Which raises the question, should `NOT_A_TIMER' be defined to be 255 instead?

These 3 functions are returning 3 conceptually separate things, and I got confused why all of them are returning a NOT_A_PIN. Anyway, if you choose not to create 3 new symbols (NOT_A_PORT_NUMBER, NOT_A_MASK, and NOT_A_TIMER), then I think some comments in the code would help explain why NOT_A_PIN can be used in all 3 cases.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please familiarise yourself with how these functions are currently used and implemented.They have no documentation of their API, so their existing use defines the API. wiring_digital.c and the various pins_arduino.h are good places to look. You will see that the new implementations are consistent with the existing API, as I explained above.

You are requesting that the existing API change. That is out of scope of this pull request. Please create a separate pull request or issue. I do not feel that a comment explaining that a function conforms to the rest of the (undocumented, uncommented) API is necessary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's even more important to add comments to describe your assumptions when you are making changes on top of undocumented and uncommented API.

I have reviewed the usage of digitalPinToPort(), digitalPinToBitMask(), and digitalPinToTimer() in the following files (wiring_pulse.c, wiring_digital.c, Tone.cpp, SoftwareSerial.cpp, SPI.cpp). I support the use of NOT_A_PIN in digitalPinToPort(). But I disagree with its usage in digitalPinToBitMask() and digitalPinToTimer().

As far as I can tell, NOT_A_PIN means "an invalid port number for an invalid pin". If the symbol is overloaded to represent the concept of "an invalid mask for an invalid pin", and "an invalid timer for an invalid pin", I suspect that this will cause unnecessary confusion for (potentially hundreds of) programmers in the future who will try to deduce the meaning of NOT_A_PIN, without any explanatory comments.

If you choose to ignore my feedback and continue to use NOT_A_PIN in 3 different contexts, I request that you add comments to that part of the code that explains that it is being overloaded to mean 3 different things. I think this is a reasonable request.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For digitalPinToPort, the function conceptually returns:

  • For a valid pin with a port associated with it, the port
  • For all other cases, an error code about the pin. Exemplified by NOT_A_PIN, and the checks in wiring_digital.c

For digitalPinToTimer, the function conceptually returns:

  • For a valid pin with an associated timer, the timer associated with the pin.
  • For all other cases, an error code about the pin. Exemplified by NOT_ON_TIMER (if it were an error code about the return value, it would be as you suggested NOT_A_TIMER)

For digitalPinToBitmask, the function conceptually returns:

  • For a valid pin with an associated bitmask, the bitmask of the pin
  • For invalid pins, the return value is undefined (there are no coded cases of what it returns).
  • Given lack of an example, I think it is fair to deduce that digitalPinToBitmask should follow the same rules as the other digitalPinTo* functions.

So, for digitalPinToTimer, when passed a value that is not associated with a pin, the correct error code to return is a description of the input value. You cannot say that pin 215 is not on a timer, because pin 215 does not exist. So the correct error code is NOT_A_PIN. Given that NOT_ON_TIMER and NOT_A_PIN have the same value, existing code will work without having to check for the new possible return concept.

The same logic applies to digitalPinToBitmask. Returning zero might be acceptable too, but that would be inconsistent with the other similarly named functions.

Unfortunately it appears that the writer of port_to_input_PGM in the gemma and mega variants abuse NOT_A_PIN and use it to return when queries for the register of a port. I cannot understand what concept they are going for here, and given that portInputRegister never has its output checked, this appears to be a bug. (In most cases, when that function returns NOT_A_PIN the uC will attempt to read or write address 0x0000, which should cause a reset. At least that's better behaviour than pinMode or digitalWrite without this patch).

So:

  • if you want me to comment the return values here, then I'm solidifying what I induce about the API of these functions. Which requires fixing pins_arduino.h for the mega and gemma.
  • if I introduce new values, I'm changing the API and all dependent functions must be checked that they will still work. Plus that would just be inconsistent.

Both of those options are out of scope for this PR and more than I'm prepared to do. As I say, please deal with the issue in a separate PR.

Copy link
Contributor

@bxparks bxparks Dec 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the intended limited scope of this PR. But your changes to digitalPinToBitMask() and digitalPinToTimer() went slightly beyond that intended scope. Those changes were not necessary for this PR, so I think my inquiry regarding those changes is legitimate.

The problem that you found in variants/gemma/pins_arduino.h and variants/mega/pins_arduino.h with their incorrect use of NOT_A_PIN is exactly the sort of problem that I want to avoid in this PR. It is obvious that those 2 files should be using NOT_A_PORT instead, to be consistent with all the others. And I agree it's not in the scope of this PR to fix those 2 files.

The return type of digitalPinToPort() is a port number. Therefore, NOT_A_PIN is port number type. It is a sentinel value that represents "an invalid port number representing an invalid pin". All current uses of NOT_A_PIN (except for variants/gemma/pins_arduino.h and variants/mega/pins_arduino.h) are consistent with it being a port number.

The return type ofdigitalPinToBitMask() is a bit mask, not a port number. I currently see no example of the digital_pin_to_bit_mask_PGM[] array using the NOT_A_PIN constant as a sentinel value to represent the concept of: "an invalid bit mask representing an invalid pin". This PR adds a new semantic meaning to NOT_A_PIN which is not appropriate, because the type of NOT_A_PIN is a port number not a bit mask. (They both happen to be a uint8_t, but that does not mean they can reuse constants. It would be like having a int getLength() function returning an INVALID_SECONDS which happens to be an int.)

Finally, the return type of digitalPinToTimer() function seems to be an integer enumeration of the various (Timer x Channel) combo. (A modern type-safe version would probably use a C++ scoped enum type (i.e. enum class). A timer value of NOT_ON_TIMER (a value of 0) is actually a valid value for this function, and it does not mean "an invalid timer value representing invalid pin". One can verify that NOT_ON_TIMER is a valid value because it is used in the analogWrite() function in cores/arduino/wiring_analog.c, where the last part of the switch statement is:

      case NOT_ON_TIMER:
      default:
        if (val < 128) {
          digitalWrite(pin, LOW);
        } else {
          digitalWrite(pin, HIGH);
        }

If digitalPinToTimer() returns a 0, the existing code continues to write into the pin, instead of interpreting the 0 to mean "invalid pin". Since NOT_A_PIN is also a 0, which overlaps with NOT_ON_TIMER, it cannot be correct for digitalPinToTimer() to return NOT_A_PIN for an invalid pin.

At this point, the simplest fix I can see that will implement the intended scope of this PR is to remove the checks for (P >= digital_pin_count) in digitalPinToBitMask() and digitalPinToTimer(), while keeping the check in digitalPinToPort(), which I agree is correct. In other words:

static inline uint8_t digitalPinToBitMask(const uint8_t P) {
   return pgm_read_byte( digital_pin_to_bit_mask_PGM + P );
}
static inline uint8_t digitalPinToTimer(const uint8_t P) {
  return pgm_read_byte( digital_pin_to_timer_PGM + P );
}

I cannot see how returning the NOT_A_PIN constant is correct for these 2 functions. I think the correct solution is to define 2 new constants, a INVALID_MASK_FOR_INVALID_PIN and a INVALID_TIMER_FOR_INVALID_PIN. But I agree it is beyond the scope of this PR to add 2 new constants, and add the appropriate checks for these sentinel values in the calling code.

//
static inline const uint8_t digitalPinToPort(const uint8_t P) {
if (P > digital_pin_count) {
return NOT_A_PIN;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear to me whether NOT_A_PIN is an invalid port number, or whether it is a valid port number that is returned for an invalid pin. Can you add a comment to make that more clear?

Copy link
Author

@BarryPSmith BarryPSmith Dec 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you reviewing the latest commit? That should be a >=.

It's following the same convention already in place in pinMode, where the returned value of digitalPinToPort is compared against NOT_A_PIN to see if the input was a pin.

Do you think that it is not obvious from the code what the situation is, that a comment is needed?

}
static inline const uint8_t digitalPinToTimer(const uint8_t P) {
if (P > digital_pin_count) {
return NOT_A_PIN;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, is NOT_A_PIN an invalid Timer value, or the default value returned when the pin is not valid? Add comment?

Copy link
Author

@BarryPSmith BarryPSmith Dec 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you reviewing the latest commit? That should be a >=.

It's following the same convention already in place in pinMode, where the returned value of digitalPinToPort is compared against NOT_A_PIN to see if the input was a pin. It doesn't return NOT_ON_TIMER because it's not a pin that's not on a timer, but it's not a pin at all. (convenient enough, by coincidence NOT_ON_TIMER and NOT_A_PIN have the same value so sloppy programming where only the former is checked will still pick it up.)

Do you think that it is not obvious from the code what the situation is, that a comment is needed?

@bxparks
Copy link
Contributor

bxparks commented Dec 16, 2019

One more high-level feedback: I have tested this PR on a real-life project that uses digitalPinRead() for debouncing and reading 2 momentary buttons on 2 pins, and I see a 14-byte increase in program size in flash:

Original (IDE 1.8.9):

Sketch uses 28546 bytes (99%) of program storage space. Maximum is 28672 bytes.
Global variables use 1402 bytes (54%) of dynamic memory,
leaving 1158 bytes for local variables. Maximum is 2560 bytes.

This PR:

Sketch uses 28560 bytes (99%) of program storage space. Maximum is 28672 bytes.
Global variables use 1402 bytes (54%) of dynamic memory,
leaving 1158 bytes for local variables. Maximum is 2560 bytes.

A 14-byte program size increase is minimal in return for run-time bounds checking, so I remain supportive of the goal of this PR. My only objection is the overloaded use of the NOT_A_PIN constant as noted above.

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@eddyp
Copy link

eddyp commented Apr 26, 2021

My only objection is the overloaded use of the NOT_A_PIN constant as noted above.

I agree, while also understand @BarryPSmith's position on this.
The point here is that the return value needs 2 different types of information:

  1. an error case WRT the input and
  2. the actual meaningful type for the specific API

You're saying 1 should be represented as the "type" of the function, while @BarryPSmith is saying he's using a "distinct" type which has relation to the input type (i.e. the pin).

I think a clearer resolution is if the error case constant changes name and its name does not look like a pin, but as an error:

#define ERROR_NOT_A_PIN 0

or

#define ERROR_INVALID_INPUT

KurtE added a commit to KurtE/ArduinoCore-renesas that referenced this pull request Aug 10, 2023
As I mentioned in the forum thread:
https://forum.arduino.cc/t/apis-like-digitalwrite-who-use-g-pinc-cfg-should-do-bounds-checking/1156322

I believe that many of the simple functions should have some form of parameter testing.  For example: pinMode(100, OUTPUT);
Should fail instead of trying to use random garbage off the end of the array to pass down to the next level.

As @per1234 mentioned on the forum thread.  This has bounced around for years:
arduino/ArduinoCore-avr#302

So decided to at least try to do it for a few of the APIs that have this issue.  Most of the other references to this array appear to either check or are driven by pin information in definded in the variant.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants