-
-
Notifications
You must be signed in to change notification settings - Fork 19.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Smoother motion by fixing calculating trapezoids and ISR stepping. #27013
Conversation
Fix rounding directions in calculate_trapezoid_for_block(). Fix off-by-ones errors in ac/deceleration steps in block_phase_isr. Half-initialize ac/deceleration_time to smooth the speed change shock that happens between segments, which is critical as jerk/deviation adds to this. The result is a smoother motion profile that follows the imposed acceleration limits with a well defined 0.5-1.5x error factor (or 2x if axis is starting from ~0). Errors are due to converting a real-valued motion profile into discrete numbers of steps. Fixes are general and improve S_CURVE_ACCELERATION too (no endorsement implied). Tested by looking at the generated step/dir impulses with a logic analyzer. Enjoy the smoother motion or use more aggresive acceleration/jerk/deviation values for faster prints. Also improves: MarlinFirmware#12491
Kudos - This may well be the fix to some of the most hard problems to solve |
Issue has been reopened & unlocked. |
Thank you for doing this excellent analysis and correction. Everything looks great in those graphs. (And nothing pleases us more than pretty graphs.) We'll prioritize getting this reviewed and tested because this is obviously a vital area of the firmware! Is it confirmed that the number of steps is not being modified in any way? I ask because in the past we have had issues where steps were off by +-1, so we had to take great care to use I agree that S-curve acceleration could use some adjustments to produce nicer looking curves. It may also be worthwhile to add S-Curve acceleration (or our best-guess for Fast B-Spline) to the FT Motion system after its extant patches are merged. |
I've explicitly aimed to only modify the speed profile and I didn't touch the bresenham part. I've opened up the captures and at least whenever there was a direction change the positions matched between before/after captures. I did an (admittedly simple) test print with a just slightly earlier version of this code.
From what I'm now reading FT Motion seems to be a vibration compensation mechanism. The whitepaper I've found mentioned 'inverted commands'. I might look into it more in the future. |
Marlin/src/module/stepper.cpp
Outdated
@@ -2462,7 +2468,7 @@ hal_timer_t Stepper::block_phase_isr() { | |||
*/ | |||
#if ENABLED(LASER_POWER_TRAP) | |||
if (cutter.cutter_mode == CUTTER_MODE_CONTINUOUS) { | |||
if (step_events_completed + 1 == accelerate_until) { | |||
if (step_events_completed + 1 == accelerate_before) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, the new naming did highlight this: I think it should be without the +1 but I'm not sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll have to analyze the intent of this line and look at its history. Previously it wanted to match accelerate_until - 1
which would be the second-to-last step of acceleration. Assuming that we want no change in behavior, and assuming that .accelerate_before
is now one more than accelerate_until
, the actual correction would be step_events_completed + 2 == accelerate_before
Again, we need to look closer at the original intent of the line, because, for one thing, step_events_completed
is not incremented by 1, but by events_to_do
which may or may not always line up correctly to match the exact index accelerate_before - n
. Once we fully grok the intent of this line we can simply replace it with whatever best fits the updated code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's see if @descipher has anything to add….
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well it's been quite some time since my head was deep inside this code! "first block->accelerate_until step_events_completed" is the key element, since the first block starts at 0 we will have value 1 after the ISR does the count increment, thus the < accelerate_until naming usage which is not the same meaning as accelerate_before, it only looks that way when viewing it as less than accelerate_until at the moment of comparison. However this comparison moment is the actual accelerate_until PIT and not before.
In regards to the +1 == if I recall correctly the reason for + 1 stems from an odd step count and the PIT when the laser would need to be turned off or incr/decr . It worked out that the counts were shifted toward the tail of the trap so the counts were consistently short on the decel side of the trap. This can be viewed by turning on debug where it will give you the actual step counts. It becomes obvious with debug on and the simulator is your friend for that ;) The laser power will be inc'd and dec'd closer to ideal with the +1. Keep in mind its not uniform in a real world print or laser burn, but in tests from a dead start to full stop the ramps should be equal and that's when this shifted count behavior was first observed. If there is a change in the counts/shaping now it should be re-evaluated again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My worry is that with some moves it will never be that step_events_completed + 1 == accelerate_before
during the block_phase_isr
because AFAIK this code is not guaranteed to be executed after every step, but only when some new "block phase" is "needed." And, as mentioned, step_events_completed
is not incremented by 1, but by events_to_do
.
The best thing to do would be to test this in the simulator, not with laser enabled, but just with that comparison being done at that same point in the code to see how it matches up and whether we are in fact guaranteed to be able to scrutinize every step index. If we can, then the very last index of acceleration will be accelerate_before - 1
and we can apply that correction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The best thing to do would be to test this in the simulator
As a reminder, it is possible to log signals in the simulator and export them to files that can be viewed in PulseView. This allows you to analyze stepper pulses without requiring a physical logic analyzer setup. I find that it provides a much faster feedback loop than working with actual hardware.
It won't work for hardware PWM signals like you would have for a laser, but for any software-timed pulses it works well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR Sim compiling result:
Took a look at the simulation outputs and captured a logic vcd of (X|Y)_STEP_PIN it looks reasonable however there is still the count shifting towards the tail of the trap where we can observe the resulting faster decel step rate.
This is the X_STEP_PIN Y_STEP_PIN output point and the zoom is the X axis.
Here the start pulse interval is 3.187ms and the stop is 2.27ms. This is a dead stop to dead stop thus we should have equal sides on the trap. We have a significantly harder decel due to the shorter decel count calc.
@thinkyhead We can use ( + 1 >=) to ensure we set the laser power on a missed compare. The intent of this code is to set the power to the current_block planner calculated "Cruise" laser power value. Since this code only happens on the cruise else section it should have no serious negative impact with the addition of >. If we end up fixing the counts then the +1 can be removed. So why use == to save compute cycles, if we use > it would increase the compute load by unnecessarily reapplying the power level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since there are LASER_POWER_TRAP
specific issues I've created #27031 to address them. The issues addressed in a way that's independent of this change.
Now to address how this change affects laser: I looked over how LASER_POWER_TRAP
uses ac/decelerate_steps
and (with the #27031 fixes applied too) the resulting ramp up/down is smoother than before. Here's a comparison at the end of a deceleration when the laser is going to 0% power.
Before:
(just a note that my steppers are set to edge stepping)
I've also looked at the acceleration->cruise transition. I'd say the difference is not significant / not above the noise in the pwm/logic analyzer capture: Last step 75.4 -> 80% versus 76.7% -> 80%.
Took a look at the simulation outputs and captured a logic vcd of (X|Y)_STEP_PIN it looks reasonable however there is still the count shifting towards the tail of the trap where we can observe the resulting faster decel step rate. [..]
Here the start pulse interval is 3.187ms and the stop is 2.27ms. This is a dead stop to dead stop thus we should have equal sides on the trap. We have a significantly harder decel due to the shorter decel count calc.
@descipher You are right about the absolute ideal being equal intervals. And that even with this change the first step is slower than the last step, or stated another way: the first step is at exactly initial_rate
but the last step isn't yet at final_rate
. Two answers:
- the difference was way worse before this change as the decel count was even lower
- trying to hit the absolute ideal profile results in a code complexity explosion. This is because the
ac/deceleration_steps
computation has to work for all the following types of segments: full acceleration, full deceleration, acceleration->deceleration, acceleration->cruise->deceleration (5 cases). And say we do that, and get the first step at exactlyinitial_rate
and the last at exactlyfinal_rate
(=initial_rate
of next segment). Now the full acceleration->full acceleration and the full deceleration->full deceleration transitions become jerky because whenfinal_rate
andinitial_rate
of the next segment are equal that means there's no acceleration happening between segments, i.e. we go full accel, 0 accel, full accel. Special casing the 5 cases into 5 cases transitions is just too much. Also the float->discrete steps forces us to accept an error somewhere. To avoid this complexity explosion the way this code is structured is to accept an error of ~1/2 of the ideal acceleration. This half error manifests as not hitting thefinal_rate
exactly.
@thinkyhead I added a couple comments on a couple of commits you've added (Making a comment here since I'm not sure if they're easily visible) |
This analysis is correct. It is good to see it so clearly demonstrated. |
Note #26881 which also affects the
|
It's going to take me a while to get my head round these changes. I do have the same question as Scott about whether step counts might end up incorrect. |
Re step counts: I've checked before/after captures and did a diff of the steps to verify they are correct. For the simple gcode above, the capture shows X going from 0 to 17.000mm then back to -10.000mm which matches the simple gcode above given a start right after @tombrazier Thanks for mentioning #26881 - I've made a comment there. On a separate note, I thought I should check how the new code interacts with oversampling. So I manually set an 2^3=8x factor: constexpr uint8_t oversampling_factor = 3;` Before with 8x oversamplingHighlighted the spikes fixed or a bit improved (~). After with 8x oversamplingAfter I also tested with |
If we have good fairly stable entry points into the impacted code we could add some unit tests to be sure it remains unchanged. I believe I have an old branch around where I had done just that while attempting to resolve some jerk/accel issues. I need to see if I can find that and see whether those tests can be effectively updated to work with the current state of the planner. |
I'm slowly working my way through this. First observation: really nice work with the logic analyser. Smoother movement is always better. I don't know whether instantaneous small speed changes will affect print quality or not but believe they might and I suspect they will at least be audible. I think there are several discrete changes which I will review and comment on individually. @mh-dm can you confirm my understanding? |
I don't think any of the changes will result in missed or extra steps. The way to test this is to build with |
The code that ends up changing step counts is distributed all over the place and some of it is asynchronous. I think it would be difficult to unit test. |
@tombrazier I've noticed that audible judder is visible on 'sensitive' prints like spiralize/vase mode prints. In particular, the difference between non-oversampling and high oversampling/
All the changes are with the goal of getting the ¯\_/¯\_ as close to ideal but yeah, they can be reviewed line by line. |
The only way we can unit test motion is to add unit testing code that invokes As for unit testing something as complicated as a complete motion path to make sure the pulses aren't out of whack, that would require a bit more work. The Stepper ISR would need to write values of its variables into a large buffer, then the test would need to analyze the buffer for "bad stuff." |
It has that, but it's a near complete replacement for Marlin's motion system. FT Motion still calculates trapezoids the same way, but it runs on a fixed time clock and pre-buffers all the STEP/DIR events (including "do nothing" ones). As far as I can tell the changes here don't require any corresponding changes in FT Motion. |
I looked back at what I had done earlier. I was focused on Classic Jerk problems at the time, which @mh-dm may have already fixed separately back in #26529. I had to refactor some things to isolate the problem logic so that I could write tests around it. The focus was on the transition speeds between blocks while applying jerk, and not in the same code sections being modified here. That refactoring is probably still worth doing so we can test CLASSIC_JERK, but doesn't help with this PR. If we can extract and isolate important/complex/problematic logic, then it becomes possible for us to wrap them in tests that will be validated with every PR posted. For something like step counts, that wouldn't be an end-to-end "G Code to Stepper Pulses" test. You would focus on testing the different blocks of logic on their own. If they are really dependent on each other, then we probably need to be looking at refactoring the code to bring those mechanisms together, to make testing them together possible. |
Hi, I'm checking in on this PR.
|
There is a case: if I now also understand what you meant @mh-dm when you said:
This is ultimately addressed in #27035. For now, though, this adds another reason to use the original solution from #26881, thereby keeping a lower bound on
I mentioned two redundant lines of code here. Apart from that, I think this PR is ready to go. |
But this case is already broken, in a more severe way. When
I can't find what you're referring to in that link. Sorry, can you remake the comment? |
Not in the original version of #26881 which has now been restored. That version has
My comment was that And |
No, there isn't. It could be improved further but it's better as it is than not merging it and I can do the extra improvements at some point. |
Thanks. However, it's not redundant since that initialization was changed to be |
@mh-dm I have just realised most of my review comments were all still pending which is why only I could see them. Anyway, I have spotted some more things I want to discuss and am deleting comments that are now redundant. Hopefully I can actually publish my pending review soon. |
I have reviewed all the code changes here and discussed questions I have with @mh-dm. Some of the changes are minor (e.g. the |
@thinkyhead In case you're preparing to merge, shouldn't #27031 go in before this? (or do you not want that one at all?) |
Description
Fix rounding directions in
calculate_trapezoid_for_block()
. Fix off-by-ones errors in ac/deceleration steps inblock_phase_isr()
. Half-initializeac/deceleration_time
to smooth the speed change shock that happens between segments, which is critical as jerk/deviation further adds to the shock.Requirements
Affects all.
Benefits
The result is a smoother motion profile that follows the imposed acceleration limits with a well defined 0.5-1.5x error factor (or up to 2x if axis is starting from ~0). Errors are due to converting a real-valued motion profile into discrete numbers of steps. Fixes are general and improve
S_CURVE_ACCELERATION
too (no endorsement implied).Enjoy the smoother motion or use more aggressive acceleration/jerk/deviation values for faster prints.
Before
Here's a logic analyzer capture for the X axis before the change showing massive acceleration peaks (highlighted in red):
There's a bit of speed dependent measurement noise (most obvious on the right, just ignore it) since my current logic analyzer is limited to capturing at 2Mhz.
After
After, the peaks are gone/significantly reduced. Furthermore many times when switching between accel to plateau to decel the acceleration change happens over 2 steps instead of 1 (highlighted with green checkmarks). This means smoother motion / reduced jolt:
Configurations
Tested by looking at the generated step/dir impulses with a logic analyzer. Tested with low jerk so it doesn't muddy the picture.
Here's the gcode for these captures:
Related Issues
NOMORE(initial_rate, block->nominal_rate);
sanity limits