FW can't recover from DAI Xrun #645
Comments
@keyonjie @slawblauciak Could this be the same issue as https://github.com/thesofproject/sof/issues/506?
@mengdonglin This Xrun is initiated by the FW, so it is different from the one injected from the host.
@keyonjie What pipeline are you using? Is it all HDA endpoints, or does the same thing happen on non-HDA endpoints like BYT? @mengdonglin can you confirm for BYT, to rule out generic code and help point to the HDA logic?
@lgirdwood Both playback and capture are I2S/SSP endpoints.
@keyonjie OK, it looks like the HDA host GW DMA becomes out of sync with the DW DMA.
Upon investigation, I've found the cause of the issue (Yorp, HDMI playback). When the firmware encounters an xrun, it resets all buffers and restarts DMA, assuming the hardware DMA has been reset as well. In this case, however, the HDA DMA hardware buffer pointers remain untouched and keep their previous state, which causes a mismatch between the FW's software pointers and the hardware pointers. The driver should reset the hardware DAI HDA DMA read/write pointers on xrun.
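To illustrate the mismatch described in the comment above, here is a minimal sketch in C. It is a toy model only: `struct ring_model` and all function names are hypothetical and are not taken from the SOF firmware or the kernel driver. It assumes a single ring buffer whose hardware position and firmware-side position normally advance together.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one DMA ring buffer; not SOF's real structures. */
struct ring_model {
    uint32_t size;   /* buffer size in bytes */
    uint32_t hw_pos; /* position the DMA hardware is actually at */
    uint32_t sw_pos; /* position the firmware believes the hardware is at */
};

/* Advance both positions, as a normal copy of `bytes` would. */
static void ring_copy(struct ring_model *r, uint32_t bytes)
{
    assert(r->size > 0);
    r->hw_pos = (r->hw_pos + bytes) % r->size;
    r->sw_pos = (r->sw_pos + bytes) % r->size;
}

/* Xrun recovery as described above: only the firmware state is reset. */
static void xrun_reset_sw_only(struct ring_model *r)
{
    r->sw_pos = 0;
}

/* Recovery that also rewinds the (modelled) hardware position. */
static void xrun_reset_sw_and_hw(struct ring_model *r)
{
    r->sw_pos = 0;
    r->hw_pos = 0;
}

/* Byte offset between the hardware and firmware views; 0 means in sync. */
static uint32_t ring_skew(const struct ring_model *r)
{
    return (r->hw_pos + r->size - r->sw_pos) % r->size;
}
```

In this model, calling `xrun_reset_sw_only()` after some data has been copied leaves a non-zero `ring_skew()` that later copies never remove, whereas `xrun_reset_sw_and_hw()` brings both views back to the same position.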
@slawblauciak Do we have an easy way to disable Xrun recovery for debugging purposes? E.g. reset or stop the pipeline and wait for the free cmd.
@keyonjie Unfortunately, no.
@keyonjie @slawblauciak Let me add a container for these flags in the ABI update.
If so, the FW should only reset the hardware buffer pointers, then send an IPC to the driver and wait for synchronization before scheduling again. After the driver has handled (and acked) the Xrun IPC, the FW can try to schedule the next copy. Does this make sense?
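The handshake proposed in the comment above can be sketched as a small state machine in C. This is an assumption-laden illustration, not SOF code: the state names, `struct xrun_ctx`, and the boolean flags standing in for the hardware pointer reset and the IPC are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical recovery states; the names are illustrative, not SOF's. */
enum xrun_state {
    XRUN_NONE,     /* normal operation, copies may be scheduled */
    XRUN_WAIT_ACK  /* pointers reset, IPC sent, waiting for the driver */
};

struct xrun_ctx {
    enum xrun_state state;
    bool hw_ptrs_reset; /* stands in for the real hardware pointer reset */
    bool ipc_sent;      /* stands in for the Xrun notification IPC */
};

/* FW side: on xrun, reset hardware pointers, notify the driver, then wait. */
static void xrun_detected(struct xrun_ctx *c)
{
    assert(c);
    c->hw_ptrs_reset = true;
    c->ipc_sent = true;
    c->state = XRUN_WAIT_ACK;
}

/* Called when the driver has handled and acked the Xrun IPC. */
static void xrun_driver_acked(struct xrun_ctx *c)
{
    assert(c);
    if (c->state == XRUN_WAIT_ACK)
        c->state = XRUN_NONE;
}

/* The scheduler may run the next copy only outside of recovery. */
static bool can_schedule_copy(const struct xrun_ctx *c)
{
    return c->state == XRUN_NONE;
}
```

The point of the sketch is the ordering: no copy is scheduled between `xrun_detected()` and `xrun_driver_acked()`, so the FW and the driver resume from an agreed state.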
@bardliao Could you help with resetting the DMA when an XRUN happens, per the firmware team's request, to debug this issue?
@slawblauciak @keyonjie How about we just return SNDRV_PCM_POS_XRUN from the .pcm_pointer function and let ALSA do everything for us?
@bardliao Is it going to reset the hardware DMA pointers, though? If it will, that's fine, I suppose.
@slawblauciak ALSA will basically do stop -> prepare -> start. I think we don't reset the hardware DMA during this process, but we can do it. Also, I think we should disable all the XRUN handling in the FW if we decide to let ALSA handle it.
@bardliao Sounds good. Also, I'm sure the driver resets the pointers on either start or stop, so if ALSA calls it externally, I'm fairly certain it will happen either way. The issue was that the FW was doing exactly that INTERNALLY, stop -> prepare -> start, independently of the driver.
@slawblauciak I tried returning SNDRV_PCM_POS_XRUN on the kernel side, but it doesn't work; please see #506. ALSA's XRUN handling causes an XRUN in the FW, so it becomes an infinite loop. I think we should first figure out why ALSA's XRUN handling causes the FW's XRUN. Also, when I ran the xrun_injection test today, the FW was sometimes broken afterwards. That is a serious issue.
@bardliao So you're saying ALSA's XRUN handling (which is just resetting and starting the stream again, right?) stresses the system enough to cause an underrun on the host -> DSP transfer? That is indeed a problem.
@slawblauciak Yes, I think so. We can disable the pipeline XRUN handling first and then look into this issue.
This is a duplicate of issue #752 reported by Google.
Yes, it's fixed via #755, closing it now.
On APL, I found this issue while debugging thesofproject/linux#266. When doing simultaneous playback and capture, an Xrun sometimes happens and we can't recover from it; the DSP even panics, and no further IPC is sent to the driver.
How to reproduce:
Run repeatedly: aplay -Dhw:0,1 test.wav -d 5 & arecord -Dhw:0,1 -f dat -c 2 test9.wav -d 5
dmesg:
sof-logger error trace:
sof-logger trace: