[2.0.x] Use 'float' instead of 'double' maths#11178

Merged
thinkyhead merged 4 commits into MarlinFirmware:bugfix-2.0.x from ejtagle:misc-fixes
Jul 6, 2018

Conversation

@ejtagle
Contributor

@ejtagle ejtagle commented Jul 1, 2018

  1. Replace 'double' with 'float'. On AVR both types map to the same type (float). On 32-bit CPUs, they don't. The extra precision of double costs more time and is unnecessary. (On 32-bit boards with an FPU, double is not accelerated but float is.)
  2. Remove the Quake III fast inverse square root. Profiling shows it is faster to simply call sqrtf (because it's implemented in assembler). The same profiling showed that if a multiplication costs 1x, a division costs 2x and a square root 4x, so it makes sense to replace the more expensive operations.
  3. Replace several divisions with multiplications by the reciprocal, when the divisor can be reused (in vector normalization).
  4. Optimize the Delta forward kinematics formulae (please, @thinkyhead, check if the optimizations are OK, just to be sure... I don't have a delta to test them on).
  5. Replace some sqrt calls with divisions.
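
As a rough illustration of items 2 and 3, the sketch below normalizes a vector with a single `sqrtf` call and a single division, then reuses the reciprocal for three multiplications. The `Vec3` type and the `normalized` function are hypothetical names made up for this example; they are not taken from the Marlin sources or from this PR's diff.

```cpp
#include <math.h>

// Hypothetical 3-component vector, for illustration only (not a Marlin type).
struct Vec3 { float x, y, z; };

// Normalize using one square root and one division, followed by three
// multiplications, instead of dividing each component by the length.
inline Vec3 normalized(const Vec3 &v) {
  const float len = sqrtf(v.x * v.x + v.y * v.y + v.z * v.z);
  if (len == 0.0f) return v;          // zero vector: nothing to scale
  const float inv_len = 1.0f / len;   // reciprocal computed once, then reused
  return Vec3{ v.x * inv_len, v.y * inv_len, v.z * inv_len };
}
```

Using the relative costs quoted above (multiplication 1x, division 2x), the three divisions would cost about 6 units while one division plus three multiplications costs about 5, and the saving grows with the number of components that share the divisor.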

@p3p
Member

p3p commented Jul 2, 2018

A quick perusal shows there are still a fair few double literals in the areas you modified. There are a lot everywhere else in Marlin too, but I thought I'd mention it since you updated those parts. I had the choice between disabling double literals in the compiler or going through Marlin and changing them all... I disabled double literals.

@ejtagle
Contributor Author

ejtagle commented Jul 2, 2018

@p3p: I know there are still doubles, specifically in the serial.print() routines. I didn't remove them because that would break inheritance from the Stream class.

But those functions aren't used, so I hope the linker is able to remove them.

If there is consensus, I can also remove all those functions 👍

@p3p
Member

p3p commented Jul 2, 2018

Sorry, I said constant but I meant literal. There are many double literals everywhere in Marlin. Not sure why; well... it's probably because it didn't matter with AVR.

@ejtagle
Contributor Author

ejtagle commented Jul 2, 2018

Yes, I agree. The codebase is full of double literal values. Technically, that could force promotion of some calculations to double...
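
To make the promotion concrete, here is a small sketch assuming a standard C++11 compiler run without `-fsingle-precision-constant`. The helper names `half_mixed` and `half_single` are hypothetical and only serve the illustration.

```cpp
#include <type_traits>

// Hypothetical helpers, for illustration only.
// 0.5 is a double literal: x is widened, the multiply is done in double,
// and the result is narrowed back to float on return.
inline float half_mixed(const float x)  { return x * 0.5; }

// 0.5f is a float literal: the whole expression stays in single precision.
inline float half_single(const float x) { return x * 0.5f; }

// The type of each product shows where the promotion happens.
static_assert(std::is_same<decltype(1.0f * 0.5),  double>::value,
              "float times a double literal is a double");
static_assert(std::is_same<decltype(1.0f * 0.5f), float>::value,
              "float times a float literal is a float");
```

Both helpers return the same value for ordinary inputs; the difference is only in which arithmetic the target has to perform, which matters on a 32-bit MCU whose FPU handles float but not double.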

@thinkyhead
Member

> probably because it didn't matter with AVR.

Basically, and converting all the values to float by appending f is a lot of work.

Some MCUs might actually prefer double if that's the native type that they use in their FPUs. Maybe we need a decimal type which refers to the format preferred by the FPU, if there is one.

@thinkyhead
Member

Some of these changes look like they'd be worthwhile for bugfix-1.1.x too.

@ejtagle
Contributor Author

ejtagle commented Jul 2, 2018

@thinkyhead: There is no embedded MCU supporting hardware double-precision floating point (except the STM Cortex-M7, but we don't support those boards). The other exceptions would be a Raspberry Pi or a PC.

GCC has an option that makes it treat floating-point constants as single precision: -fsingle-precision-constant. That would probably be the best choice, but Arduino does not allow specifying that compilation option...

What I am mostly interested in is the optimizations to the Delta kinematics... Probably 10% faster or maybe more...

Member

@thinkyhead thinkyhead Jul 3, 2018

Since M_PI is now defined with f on the end, casting it might be redundant.

Contributor Author

Theoretically, yes

Member

RECIPROCAL is safer, but also a little slower. If we know the parameter won't ever be 0.0 then it's better to just use 1.0 / x.
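
The trade-off can be sketched as follows. The two functions are hypothetical stand-ins written for this example; Marlin's actual RECIPROCAL macro may be defined differently, and the near-zero threshold used here is an assumption.

```cpp
#include <math.h>

// Hypothetical guarded reciprocal (Marlin's real RECIPROCAL may differ).
// The extra comparison is the price of safety: a near-zero input yields 0
// instead of infinity. The 0.000001f threshold is an assumed value.
inline float guarded_reciprocal(const float x) {
  return fabsf(x) < 0.000001f ? 0.0f : 1.0f / x;
}

// Unguarded version: the caller must guarantee that x is never zero.
inline float plain_reciprocal(const float x) { return 1.0f / x; }
```

When the caller already knows the divisor is non-zero (for example, a length that was just checked), the unguarded form avoids a branch on every call.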

Contributor Author

You are right here. That is why a review is always welcome !! 👍

Member

Is LROUND actually better here, since the return type is float?

Contributor Author

On AVR it does not matter. On ARM, the parameter of round() is double, so it forces a conversion from float->double, and its result then forces a conversion from double->float.
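
The code under review is not shown in this thread, so the following is only a sketch of the type behaviour being described, using the standard C math functions. The helper `to_steps` is a hypothetical name.

```cpp
#include <math.h>
#include <type_traits>

// Hypothetical helper: round a float position to a whole number of steps.
// lroundf takes a float and returns a long, so no double is involved.
inline long to_steps(const float pos) { return lroundf(pos); }

// round() on a double works and returns in double precision, while the
// f-suffixed variants stay in single precision.
static_assert(std::is_same<decltype(round(1.5)),    double>::value,
              "round(double) returns double");
static_assert(std::is_same<decltype(roundf(1.5f)),  float>::value,
              "roundf returns float");
static_assert(std::is_same<decltype(lroundf(1.5f)), long>::value,
              "lroundf returns long");
```

Note that `lroundf` rounds halfway cases away from zero, so `to_steps(2.5f)` gives 3.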

Member

Ah, I see. These would definitely be faster.

Contributor Author

👍

@ejtagle
Contributor Author

ejtagle commented Jul 3, 2018

(btw: the compilation is failing because the success variable is not defined ... 👍 🌊 ... )

@thinkyhead
Member

Marlin loves me.
Yes I know.
Travis CI
told me so.

@thinkyhead thinkyhead changed the title from "[2.0.x] Misc fixes and improvements" to "[2.0.x] Use 'float' instead of 'double' maths" Jul 4, 2018
@thinkyhead
Member

The delta optimizations seem good to me!

I've heard that it's more optimal on AVR to use const float & arguments to functions rather than passing a float, because a pointer is smaller than a float on this architecture. Since this doesn't help with processors that have 32-bit addresses, maybe we should use const float & only on AVR and simply float on others. Or maybe the compiler can be made to optimize these parameters automatically.

@ejtagle
Contributor Author

ejtagle commented Jul 5, 2018

@thinkyhead: And they gave you good advice. On AVR pointers are 16 bits, so passing a float by reference uses 2 registers. Floats are 32 bits, so passing them by value uses 4 registers. So passing by reference is 2x faster...
But dereferencing the pointer (or reference) to read the float value adds as much time as passing the float itself, so it depends... If the function is inlined, there is no difference... If a float is passed down several levels, then passing it by reference is faster...
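
One way the per-architecture choice discussed above could be expressed is a conditional alias, sketched below. The name `float_arg_t` and the two functions are hypothetical and not taken from Marlin; `__AVR__` is the macro avr-gcc predefines for AVR targets.

```cpp
// Hypothetical alias (the name float_arg_t is an assumption, not Marlin's).
#ifdef __AVR__
  typedef const float &float_arg_t;  // AVR: a 16-bit address fits in 2 registers
#else
  typedef float float_arg_t;         // 32-bit targets: a float is no larger than a pointer
#endif

// Functions written against the alias compile either way.
inline float squared(float_arg_t v) { return v * v; }

// The argument is handed down one more level, which is the case where
// passing by reference on AVR is expected to pay off.
inline float sum_of_squares(float_arg_t a, float_arg_t b) {
  return squared(a) + squared(b);
}
```

Callers are unchanged in both configurations, for example `sum_of_squares(3.0f, 4.0f)`, since a temporary can bind to a const reference.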

@thinkyhead thinkyhead merged commit dde009e into MarlinFirmware:bugfix-2.0.x Jul 6, 2018
fiveangle pushed a commit to fiveangle/Marlin that referenced this pull request Jul 9, 2018
-normalize `env` and `board` to lowercase naming convention
-make board `name` follow descriptive convention
-implement `-fsingle-precision-constant` compile optimization per MarlinFirmware#11178 (comment)
-fix typo in 5DPRINT entry
fiveangle pushed a commit to fiveangle/Marlin that referenced this pull request Jul 14, 2018
-normalize `env` and `board` to lowercase naming convention
-make board `name` follow descriptive convention
-implement `-fsingle-precision-constant` compile optimization per MarlinFirmware#11178 (comment)
-fix typo in 5DPRINT entry
thinkyhead pushed a commit that referenced this pull request Jul 26, 2018
-normalize `env` and `board` to lowercase naming convention.
-make board `name` follow descriptive convention.
-implement `-fsingle-precision-constant` compile optimization per #11178 (comment)
-fix typo in 5DPRINT entry.
3 participants