[BUG] Crash during Print, LA, IS, Bugfix.2.1x #25553

Closed
1 task done
jesterhead82 opened this issue Mar 23, 2023 · 74 comments

Comments

@jesterhead82
Contributor

jesterhead82 commented Mar 23, 2023

Did you test the latest bugfix-2.1.x code?

Yes, and the problem still exists.

Bug Description

Maybe related to
#25541
#25474

Printer crashes during the print at the 2.88 mm layer; tried on 2 different SD cards.
Prints fine if I disable LA. Settings: ~50/100mm print speeds, IS @ ~43, LA 0.02, E-steps ~700

Configuration.zip
GCode.zip

Bug Timeline

No response

Expected behavior

No response

Actual behavior

No response

Steps to Reproduce

Version of Marlin Firmware

Bugfix 2.1.x as of this week

Printer model

Custom Corexy

Electronics

BTT 1.4T

Add-ons

No response

Bed Leveling

ABL Linear grid

Your Slicer

Prusa Slicer

Host Software

None

Don't forget to include

  • A ZIP file containing your Configuration.h and Configuration_adv.h.

Additional information & file uploads

No response

@ThomasToka
Contributor

ThomasToka commented Mar 23, 2023

I can confirm that the latest commits are somehow faulty. I spent today poking around because my printer stops on my usual cube at around 40%.

My last working build has this as last commit: 10983d0

I will disable LA now to check whether it works without LA, with the latest commits still included in my build.

EDIT: Confirmed. Disabling LA fixes it. IS stayed enabled. My printer is a E3S1Pro with TMC2208_STANDALONE.

@ThomasToka
Contributor

Reverting this one fixes it ca77850

I have not fully understood it but maybe this missing else here is it:

e8b0f38

@thisiskeithb
Member

@tombrazier Can you take a look at this?

@tombrazier
Contributor

@jesterhead82 @ThomasToka Can you describe exactly what your printer does at the point where the print fails? My guess is movement freezes and then after a few seconds the firmware reboots. But if it is something other than that please describe the behaviour. Also: does the E stepper have any noticeable behaviour when the crash occurs?

@tombrazier
Contributor

I have not fully understood it but maybe this missing else here is it:

@ThomasToka What mainboard do you have? I think @jesterhead82 has a 32 bit MCU, in which case that code is not compiled.

@ThomasToka
Contributor

The print stops and the printer reboots, always at the same position: the cube at around 40%.

I have a Creality F4 board on an Ender 3 S1 Pro. Removing the commit fixes it.

@ThomasToka
Contributor

Nothing else audible.

@jesterhead82
Contributor Author

I have not fully understood it but maybe this missing else here is it:

@ThomasToka What mainboard do you have? I think @jesterhead82 has a 32 bit MCU, in which case that code is not compiled.

Board has a LPC1769, so yes.

As for the stepper, there is some "eechy" noise shortly before the reboot, so I guess some motor is doing something stupid. But it could also be the board or the display. I could try again tomorrow if that would be helpful.

@tombrazier
Contributor

It could just possibly be a division by zero caused by the changes to all the calls to calc_timer_interval() in ca77850. Could one or both of you try changing them back?
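
To make the division-by-zero hypothesis concrete, here is a minimal sketch. It is not Marlin's calc_timer_interval(); it only assumes that an interval of this kind is obtained by dividing a timer frequency by a step rate. The names `TIMER_RATE_HZ`, `MIN_STEP_RATE` and `timer_interval_sketch` are hypothetical and chosen for illustration. If the step rate passed in ever collapsed to 0, the division would fault unless the rate is clamped first.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical constants for illustration only; not taken from Marlin.
constexpr uint32_t TIMER_RATE_HZ = 2000000; // assumed stepper timer frequency
constexpr uint32_t MIN_STEP_RATE = 1;       // assumed lower clamp, in steps/s

// Sketch: timer ticks between steps for a given step rate.
// Clamping the rate avoids dividing by zero when it collapses to 0.
uint32_t timer_interval_sketch(uint32_t step_rate) {
  const uint32_t safe_rate = std::max(step_rate, MIN_STEP_RATE);
  return TIMER_RATE_HZ / safe_rate;
}
```

With these assumed constants, a rate of 1000 steps/s gives an interval of 2000 ticks, and a rate of 0 is treated as 1 step/s instead of faulting.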

@ThomasToka
Contributor

Yes, as I said, if I revert this commit it works.

@PendulumPit

I set up the latest bugfix the other day to do a speed test on a Benchy and got through that with no issue. After I turned on LA as usual, the printer would stop and reboot. Tried again and got the same result. I was going to mention this the other day but I got busy with work.
Was trying to beat Tom's 13 minute benchy lol I failed from a layer shift :)

@thisiskeithb
Member

As for the stepper, there is some "eechy" noise shortly before the reboot

My Prusa Bear with a BTT002 (STM32F4) has been printing all day with bugfix-2.1.x (724ba4b) without issue, but the Biqu Hurakan with Manta M4P (STM32G0) freezes & reboots in the same spot consistently with a similar "eechy" symptom after ~4-5 layers. Both are running Linear Advance & Input Shaping is disabled.

After reverting ca77850, the reboots stop.

@tombrazier
Contributor

Could someone test #25557?

@ThomasToka
Contributor

Printing. Result will follow in 20 min.

@ThomasToka
Contributor

Confirmed. Works again.

@ThomasToka
Contributor

Now trying a print with the follow-up a3c4f25.

Result in a few minutes.

@ThomasToka
Contributor

Nope, it crashed again at the same position.

@tombrazier
Contributor

Thanks. So it appears that it's not a division by zero. I would really like to understand what is going on before proceeding.

I don't have time now to try and narrow it down, but if anyone wants to play, it would be good to know exactly which of the four calls to calc_timer_interval() is the one that is causing the error. And if anyone is able to insert a SERIAL_ECHOLNPGM to report the offending value that is passed to calc_timer_interval(), that would be very helpful.

@ThomasToka
Contributor

Can you give me a matching SERIAL_ECHOLNPGM so I can try?

@tombrazier
Contributor

For the call to calc_timer_interval() on around line 2260 it would be

SERIAL_ECHOLNPGM("step_rate ", (acc_step_rate + la_step_rate) >> current_block->la_scaling);

That will print out on the serial terminal, assuming you're using something like OctoPrint or Pronterface.

@tombrazier
Contributor

And you can keep adding pairs of strings and values. e.g.

SERIAL_ECHOLNPGM("step_rate ", (acc_step_rate + la_step_rate) >> current_block->la_scaling, " asr ", acc_step_rate, " lsr ", la_step_rate, " las ", current_block->la_scaling);
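
For anyone who wants to see what such a line would look like before flashing, here is a desktop-only sketch of the same label/value pattern. It does not use Marlin's macro; `echo_pairs` and `format_step_debug` are hypothetical stand-ins that write to a C++ stream, and the variable names simply mirror the ones in the call above.

```cpp
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// Base case: no pairs left, so finish the line.
inline void echo_pairs(std::ostream &out) { out << '\n'; }

// Write one label/value pair, then recurse on the remaining pairs.
template <typename V, typename... Rest>
void echo_pairs(std::ostream &out, const char *label, V value, Rest... rest) {
  out << label << value;
  echo_pairs(out, rest...);
}

// Build the same report as the SERIAL_ECHOLNPGM call, but into a string.
std::string format_step_debug(uint32_t acc_step_rate, uint32_t la_step_rate,
                              uint8_t la_scaling) {
  std::ostringstream out;
  echo_pairs(out,
             "step_rate ", (acc_step_rate + la_step_rate) >> la_scaling,
             " asr ", acc_step_rate,
             " lsr ", la_step_rate,
             " las ", static_cast<unsigned>(la_scaling)); // avoid printing as a char
  return out.str();
}
```

For example, `format_step_debug(3000, 1000, 1)` yields the line `step_rate 2000 asr 3000 lsr 1000 las 1`.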

@ThomasToka
Contributor

Can you give me examples for all 4? I will then do the tests and post. I don't know exactly what you need at the four points.
But I can invest the time to debug.

@tombrazier
Contributor

Just copy the expression that is passed to calc_timer_interval() and, if it makes sense, also the variables that make up that expression. Sorry to be difficult, I just really need to be doing something else today.

@ThomasToka
Contributor

ThomasToka commented Mar 24, 2023

Sorry, I am not that good with serial operations, but I will help as best I can.

SERIAL_ECHOLNPGM("step_rate ", (acc_step_rate + la_step_rate) >> current_block->la_scaling, " asr ", acc_step_rate, " lsr ", la_step_rate, " las ", current_block->la_scaling);

la_interval = calc_timer_interval((acc_step_rate + la_step_rate) >> current_block->la_scaling);
SERIAL_ECHOLNPGM("step_rate2 ", (reverse_e ? la_step_rate - step_rate : step_rate - la_step_rate) >> current_block->la_scaling, " sr ", step_rate, " lsr ", la_step_rate, " las ", current_block->la_scaling);
 
la_interval = calc_timer_interval((reverse_e ? la_step_rate - step_rate : step_rate - la_step_rate) >> current_block->la_scaling);
SERIAL_ECHOLNPGM("step_rate3 ", (current_block->initial_rate + la_step_rate) >> current_block->la_scaling, " lsr ", la_step_rate, " las ", current_block->la_scaling);

la_interval = calc_timer_interval((current_block->initial_rate + la_step_rate) >> current_block->la_scaling);

Like this, or after the la_interval assignment?
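
One way to keep the call sites apart, sketched here under assumptions: evaluate the argument once, record it together with a call-site tag, and only then forward it. `interval_stub`, `traced_interval` and `last_report` are hypothetical names, and the stub is not Marlin's interval function. Recording before forwarding means the value is captured even if the call itself never returns, which is one argument for placing the echo ahead of the la_interval assignment. On real firmware, printing from the stepper interrupt may itself disturb timing, so treat this as a desktop illustration only.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical stand-in for the real interval function; not Marlin's code.
uint32_t interval_stub(uint32_t step_rate) {
  return step_rate ? 2000000u / step_rate : 0xFFFFFFFFu;
}

// Holds the most recent report so the caller can print or inspect it.
std::string last_report;

// Evaluate the argument once, record it with a call-site tag, then forward it.
uint32_t traced_interval(const char *site, uint32_t step_rate) {
  char buf[64];
  std::snprintf(buf, sizeof(buf), "%s step_rate %lu", site,
                static_cast<unsigned long>(step_rate));
  last_report = buf; // recorded before the call that might hang
  return interval_stub(step_rate);
}
```

A call such as `traced_interval("accel", 4000)` leaves `accel step_rate 4000` in `last_report` and returns 500 with the stub's assumed 2 MHz timer.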

@vovodroid
Contributor

More than 1 hour print on BTT SKR 2 seems to be OK.

@dwzg
Contributor

dwzg commented Apr 1, 2023

Haven't had a crash again with the most recent bugfix branch. Even a print that would always trigger the crash at a specific point is now running fine on my SKR E3 Turbo.

@PendulumPit

PendulumPit commented Apr 2, 2023

For anyone following this issue / experienced crashes, please download bugfix-2.1.x to test with the latest code and let us know if you're still having this issue.

Be sure to attach a ZIP file containing your Configuration.h, Configuration_adv.h, and G-code in your reply.

I will give it a go. I had the issue when testing awhile back. Thanks to the Team for your hard work. 👍

@SpannMagoo

SpannMagoo commented Apr 2, 2023

Crashed right when it's about to do the purge line. Creality 4.2.7, Ender 3
Configuration_4.2.7.zip
ringing_tower_56m_0.20mm_200C_PLA-2.0.zip

@thisiskeithb
Member

thisiskeithb commented Apr 2, 2023

Crashed right when it's about to do the purge line. Creality 4.2.7, Ender 3 Configuration_4.2.7.zip

Can you attach the g-code you were printing with? Which PIO environment did you use?

@SpannMagoo

Posted in same post. Used vscode, Auto Build Marlin.

@tombrazier
Contributor

@SpannMagoo Can you check whether it still happens in my new test branch?

https://github.com/tombrazier/Marlin/tree/lafreeze2

@SpannMagoo

SpannMagoo commented Apr 2, 2023

OK, ran the newest branch as requested. I ran 3 tests and it failed on the 2nd test at the purge line. See below for the configs and the .gcode
Configuration_LatestPR4.2.7.zip
ringing_tower_56m_0.20mm_200C_PLA-3.0.zip

@SpannMagoo

Also just noticed it only does the first 3 points when doing the G29 and then stops.

@tombrazier
Contributor

tombrazier commented Apr 4, 2023

Additional info: @SpannMagoo tells me that when the hang occurs, it is an M112 halt, complete with a screen saying "M112 Shutdown". This is not the same bug.

[Edit: looking at the code, it appears that the only way you can get there is if M112 is sent to the printer.]

@SpannMagoo

Additional info: @SpannMagoo tells me that when the hang occurs, it is an M112 halt, complete with a screen saying "M112 Shutdown". This is not the same bug.

[Edit: looking at the code, it appears that the only way you can get there is if M112 is sent to the printer.]

This is interesting because I did not have the pi plugged in the last time.

@tombrazier
Contributor

@SpannMagoo reports (offline conversation) that the M112 bug began after the 18 March edition of bugfix. There were a lot of merges on 18 March, so depending on exactly when the snapshot that SpannMagoo downloaded was taken, there may be a lot of other changes that could be related to this bug.

I think this probably warrants a new bug report.

As for this one, i.e. #25553, no one else is reporting anything here.

@thisiskeithb
Member

I think this probably warrants a new bug report.

@SpannMagoo: Please open a new bug report for your issue.

@ThomasToka
Contributor

I have not had the time or capacity to test very much.

But: I did a fresh compile of bugfix-2.1.x downloaded 2 days ago, enabled LA as always, and did not suffer from this problem.

Where I am affected is if I activate this commit in my fork of an Ender 3 S1 port: ca77850

The follow-up fixes for it do not work either. If I revert this commit, all is good.

I have had maybe 25 testers for my firmware fork in the last 4 weeks, all of them on an Ender 3 S1 Pro or Plus. Only 2 of them were also affected by this commit. Reverting it fixed the halts for both.

But it may really be something other than what is described here. On my printer there was no "eechy" sound or anything like it. It just halted.

@cybercatnet

I am using the latest build, and on the fourth layer it starts printing slowly until the print stops completely and the extruder motor starts making noises. The program stops working and does not respond.

@tombrazier
Contributor

@cybercatnet Could you elaborate a bit please? What do you mean by "the latest build"? And could you post your gcode? And is this repeatable?

@cybercatnet

@cybercatnet Could you elaborate a bit please? What do you mean by "the latest build"? And could you post your gcode? And is this repeatable?

@tombrazier

The build commit 7369a6a

Example G-Code cap_large.zip

@tombrazier
Contributor

Next question: can I have your config please. And what printer / mainboard do you have?

And can you check whether the freeze still happens with my branch here.

@tombrazier
Contributor

@ThomasToka What actually is the difference between your fork and upstream Marlin? Looking at your branch MARLIN-E3S1PRO-FORK-BYTT it appears to be identical.

And could you post your config and platformio.ini please (I realise you might already have done so above but I want the latest and don't have the capacity to search the conversation history).

And do you know whether the reported bugs are on F1, F4 or both?

@ThomasToka
Contributor

@tombrazier I rebuilt my sources last week on top of the current upstream sources and ported the Creality code. I will release the commits in the next 30 days because I have to clean up a bit before a public release. With these newer sources the error has not occurred once.

It only occurred for two users, both with an Ender 3 S1 Pro or Plus and Creality mainboards. The build in which it occurred was the one I used before this. It was based on the synman release of the Creality port: https://github.com/synman/Marlin/tree/bugfix-2.1.x It is reproducible if you add the mentioned commit to it: ca77850

@ThomasToka
Contributor

ThomasToka commented Apr 13, 2023

Configs:
configuartions.zip
Platformio:
platformio.zip

@cybercatnet

cybercatnet commented Apr 19, 2023

Next question: can I have your config please. And what printer / mainboard do you have?

And can you check whether the freeze still happens with my branch here.

@tombrazier, sorry for the delay. I tried the version in your lafreeze2 branch: the crash problem remains the same and it fails at the same moment, but in addition the motors brake at different times during printing (this did not happen in the Master branch). I hope it helps; if you need anything else, let me know.

Printer P3Steel, Arduino Mega2560
Config lafreeze2.zip
Config commit 7369a6a.zip

@thinkyhead
Member

@cybercatnet — Are you still seeing such issues with the current bugfix-2.1.x? Is there some earlier point in time between the release of 2.1.2 and today where the problem does not exist?

@cybercatnet

cybercatnet commented May 17, 2023

@cybercatnet — Are you still seeing such issues with the current bugfix-2.1.x? Is there some earlier point in time between the release of 2.1.2 and today where the problem does not exist?

@thinkyhead
Testing the build at commit 3136435.
It is not crashing, but the motors stop for some milliseconds, around 250 ms, at random moments (short travel times?).
I don't know how to find the commit number in the files for the build where the problem does not exist; the download date was March 7, 21:23.
Config commit 3136435.zip

@github-actions

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked and limited conversation to collaborators Jul 16, 2023