Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add fastcode attribute to delay_ns #498

Merged
merged 2 commits into from
Nov 17, 2020

Conversation

mcbridejc
Copy link
Contributor

See issue #495.

I found the delay_ns to be off by scale factor of 2 on stm32f303. Putting the code into fast code section fixes this, at least for this part.

@salkinium
Copy link
Member

This will only work on devices with a single-cycle access instruction cache, like the F3 (ICCM), F7 (I-Cache) and H7 I-Cache).
For all other devices the delay_ns_per_loop is likely still wrong.
I'm going to keep this open until I have the time to do some measurements.

@mcbridejc
Copy link
Contributor Author

mcbridejc commented Nov 17, 2020

Since I had a STM32G431 board handy, I decided to do a quick test on that. Here, the delay_ns function works roughly as expected, both with and without the modm_fastcode attribute enabled.

But I think I found another issue with the delay_ns function. In my test code, I happened to also use a delay_us as well:

while(true) {
    GpioA0::setOutput(true);
    modm::delay_ns(5000);
    GpioA0::setOutput(false);
    modm::delay_us(10);
}

With a delay_us value of 10, I got a delay of 180us. With any other value I tried, I got the expected delay. Debugger showed the argument was in fact passed as 180, when I tried to pass 10. Inspecting disassembly, I found that when I used 10, the compiler passed the value of register r6 as first argument (mov r0, r6). Any other value resulted in loading a literal, e.g. movs r0, #9. After some hair pulling, I realized this is because of the assembly block in the delay_ns function above. Because the argument value matched the value loaded into the register there (10) it re-used it, but actually it had been multiplied by 18 in the inline asm.

From https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#InputOperands:

Warning: Do not modify the contents of input-only operands (except for inputs tied to outputs). The compiler assumes that on exit from the asm statement these operands contain the same values as they had before executing the statement. It is not possible to use clobbers to inform the compiler that the values in these inputs are changing. One common work-around is to tie the changing input variable to an output variable that never gets used. Note, however, that if the code that follows the asm statement makes no use of any of the output operands, the GCC optimizers may discard the asm statement as unneeded (see Volatile).

I'm about to push a change that fixes it, by adding an output to the asm block.

Copy link
Member

@salkinium salkinium left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent catch!

GCC requires any input register that is modified be tied to an output,
or optimizations may fail.
@salkinium
Copy link
Member

salkinium commented Nov 17, 2020

Will merge next after CI passes. (Rebased on develop and reworded the commit messages slightly)

@salkinium salkinium merged commit 19a00e2 into modm-io:develop Nov 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

2 participants