Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FW can't recover from DAI Xrun #645

Closed
keyonjie opened this issue Nov 28, 2018 · 21 comments
Closed

FW can't recover from DAI Xrun #645

keyonjie opened this issue Nov 28, 2018 · 21 comments
Assignees
Labels
APL Applies to Apollolake platform bug Something isn't working as expected duplicate This issue or pull request already exists

Comments

@keyonjie
Copy link
Contributor

On APL, I found this issue during debugging thesofproject/linux#266, when doing simultanious playback and capture, sometimes Xrun will happen, and we can't recover from Xrun, and DSP even panic, no any further IPC is sent to driver.

How to reproduce:
repeat run: aplay -Dhw:0,1 test.wav -d 5 & arecord -Dhw:0,1 -f dat -c 2 test9.wav -d 5

dmesg:

[ 1167.993423] sof-audio sof-audio: pcm: hw params stream 1 dir 0
[ 1167.993431] sof-audio sof-audio: generating page table for 00000000fbe4404b size 0xff00 pages 16
[ 1167.993459] sof-audio sof-audio: period_bytes:0xfc0
[ 1167.993463] sof-audio sof-audio: periods:16
[ 1167.993475] sof-audio sof-audio: stream_tag 1
[ 1167.993519] sof-audio sof-audio: ipc tx: 0x60010000: GLB_STREAM_MSG: PCM_PARAMS
[ 1167.993756] sof-audio sof-audio: ipc tx succeeded: 0x60010000: GLB_STREAM_MSG: PCM_PARAMS
[ 1167.993840] sof-audio sof-audio: rate_min: 48000 rate_max: 48000
[ 1167.993847] sof-audio sof-audio: channels_min: 2 channels_max: 2
[ 1167.993853] sof-audio sof-audio: rate_min: 48000 rate_max: 48000
[ 1167.993858] sof-audio sof-audio: channels_min: 2 channels_max: 2
[ 1167.993863] sof-audio sof-audio: rate_min: 48000 rate_max: 48000
[ 1167.993867] sof-audio sof-audio: channels_min: 2 channels_max: 2
[ 1167.993874] sof-audio sof-audio: pcm: hw params stream 1 dir 1
[ 1167.993881] sof-audio sof-audio: generating page table for 000000004e635382 size 0xff00 pages 16
[ 1167.993903] sof-audio sof-audio: period_bytes:0xfc0
[ 1167.993907] sof-audio sof-audio: periods:16
[ 1167.993919] sof-audio sof-audio: stream_tag 2
[ 1167.993957] sof-audio sof-audio: ipc tx: 0x60010000: GLB_STREAM_MSG: PCM_PARAMS
[ 1167.994214] sof-audio sof-audio: ipc tx succeeded: 0x60010000: GLB_STREAM_MSG: PCM_PARAMS
[ 1167.994433] sof-audio sof-audio: pcm: trigger stream 1 dir 1 cmd 1
[ 1167.994460] sof-audio sof-audio: ipc tx: 0x60040000: GLB_STREAM_MSG: TRIG_START
[ 1167.994582] sof-audio sof-audio: pcm: trigger stream 1 dir 0 cmd 1
[ 1167.994710] sof-audio sof-audio: ipc tx succeeded: 0x60040000: GLB_STREAM_MSG: TRIG_START
[ 1167.994749] sof-audio sof-audio: ipc tx: 0x60040000: GLB_STREAM_MSG: TRIG_START
[ 1167.994961] sof-audio sof-audio: ipc tx succeeded: 0x60040000: GLB_STREAM_MSG: TRIG_START
[ 1170.020072] sof-audio sof-audio: pcm: trigger stream 1 dir 1 cmd 0
[ 1170.020112] sof-audio sof-audio: ipc tx: 0x60050000: GLB_STREAM_MSG: TRIG_STOP
[ 1170.339926] sof-audio sof-audio: error: ipc timed out for 0x60050000 size 0xc
[ 1170.339945] sof-audio sof-audio: HDA_DSP_REG_HIPCCTL:0x3
[ 1170.339955] sof-audio sof-audio: status: fw entered - code 00000005
[ 1170.339982] sof-audio sof-audio: error: unexpected fault 0x00000000 trace 0x00004000
[ 1170.339989] sof-audio sof-audio: error: waking up any trace sleepers
[ 1170.340021]  Port1: ASoC: trigger FE failed -110
[ 1170.340082] sof-audio sof-audio: pcm: free stream 1 dir 1
[ 1170.340133] sof-audio sof-audio: error: trace IO error
[ 1170.659925] sof-audio sof-audio: error: ipc timed out for 0x60030000 size 0xc
[ 1170.659944] sof-audio sof-audio: HDA_DSP_REG_HIPCCTL:0x3
[ 1170.659954] sof-audio sof-audio: status: fw entered - code 00000005
[ 1170.659981] sof-audio sof-audio: error: unexpected fault 0x00000000 trace 0x00004000
[ 1170.659988] sof-audio sof-audio: error: waking up any trace sleepers
[ 1170.660178] sof-audio sof-audio: pcm: close stream 1 dir 1
[ 1174.244044] sof-audio sof-audio: pcm: trigger stream 1 dir 0 cmd 0
[ 1174.563937] sof-audio sof-audio: error: ipc timed out for 0x60050000 size 0xc

sof-logger error trace:

0      1          DAI          44637303.177083      1983.802124                dai.c:114 	dai_buffer_process() error: Insufficient free bytes for next period. comp_overrun()
0      1          DAI          44637319.322917        16.145834                dai.c:98  	dai_buffer_process() error: Insufficient bytes for next period. comp_underrun()
0      1          DAI          44639303.125000      1983.802124                dai.c:114 	dai_buffer_process() error: Insufficient free bytes for next period. comp_overrun()
0      1          DAI          44639319.375000        16.250000                dai.c:98  	dai_buffer_process() error: Insufficient bytes for next period. comp_underrun()
0      1         PIPE 3.18     44640223.541667       904.166687           pipeline.c:1221	pipeline_xrun_recover()
0      1          DAI          44640388.125000       164.583328                dai.c:98  	dai_buffer_process() error: Insufficient bytes for next period. comp_underrun()
0      1         PIPE 2.12     44640442.291667        54.166668           pipeline.c:1221	pipeline_xrun_recover()

sof-logger trace:

    0      2         PIPE 4.23    1156285694.583333        10.729167           pipeline.c:818   pipeline_params()
    0      2         HOST         1156285698.958333         4.375000               host.c:446   host-params
    0      2          DMA         1156285706.041667         7.083333            hda-dma.c:432   hda-dmac: 4 channel 1 -> get
    0      2          DMA         1156285710.729167         4.687500            hda-dma.c:659   hda-dmac: 4 channel 1 -> config
    0      2       VOLUME         1156285716.510417         5.781250             volume.c:234   par
    0      2          DAI         1156285720.312500         3.802083                dai.c:382   dai_params()
    0      2          DMA         1156285724.218750         3.906250             dw-dma.c:332   dw-dma: 0 channel 1 request
    0      2         PIPE 4.23    1156285733.072917         8.854167           pipeline.c:609   pipeline_prepare()
    0      2         HOST         1156285736.822917         3.750000               host.c:553   pre
    0      2       VOLUME         1156285740.833333         4.010417             volume.c:495   pre
    0      2          DAI         1156285746.145833         5.312500                dai.c:442   dai_prepare()
    0      2          DMA         1156285750.677083         4.531250             dw-dma.c:679   dw-dma: 0 channel 1 config
    0      2          IPC         1156285764.531250        13.854167            apl-ipc.c:148   ipc: msg rx -> 0x60010000, cmd done
    0      2          IPC         1156285947.187500       182.656250            apl-ipc.c:70    ipc: irq dipct 0xe0040000 dipcie 0x0 dipcctl 0x3
    0      2          IPC         1156285959.687500        12.500000            apl-ipc.c:114   ipc: msg rx -> 0x60040000, doing cmd
    0      2          IPC         1156285964.843750         5.156250            handler.c:413   ipc: comp 12 -> trigger cmd 4
    0      2         PIPE 3.17    1156285973.333333         8.489583           pipeline.c:768   pipeline_trigger()
    0      2         HOST         1156285977.604167         4.270833               host.c:277   trg
    0      2          DMA         1156285981.770833         4.166667            hda-dma.c:502   hda-dmac: 5 channel 0 -> start
    0      2          DMA         1156285986.041667         4.270833            hda-dma.c:335   hda-dmac: 5 channel 0 -> enable
    0      2       VOLUME         1156286028.281250        42.239582             volume.c:429   trg
    0      2          DAI         1156286032.552083         4.270833                dai.c:551   dai_comp_trigger(), command = 1
    0      2          DAI         1156286036.302083         3.750000                dai.c:559   dai_comp_trigger(), START
    0      2          DMA         1156286041.406250         5.104167             dw-dma.c:418   dw-dma: 0 channel 0 start
    0      2          DMA         1156286047.187500         5.781250             dw-dma.c:1217  dw-dma-int-register
    0      2          SSP         1156286057.187500        10.000000                ssp.c:824   tri
    0      2          SSP         1156286061.770833         4.583333                ssp.c:764   sta
    0      2          IPC         1156286070.781250         9.010417            apl-ipc.c:148   ipc: msg rx -> 0x60040000, cmd done
    0      2         HOST         1156286084.010417        13.229167               host.c:668   Bea
    0      2          IPC         1156286221.979167       137.968750            apl-ipc.c:70    ipc: irq dipct 0xe0040000 dipcie 0x0 dipcctl 0x3
    0      2          IPC         1156286234.270833        12.291667            apl-ipc.c:114   ipc: msg rx -> 0x60040000, doing cmd
    0      2          IPC         1156286239.531250         5.260417            handler.c:413   ipc: comp 18 -> trigger cmd 4
    0      2         PIPE 4.23    1156286249.895833        10.364583           pipeline.c:768   pipeline_trigger()
    0      2         HOST         1156286254.010417         4.114583               host.c:277   trg
    0      2          DMA         1156286257.916667         3.906250            hda-dma.c:502   hda-dmac: 4 channel 1 -> start
    0      2          DMA         1156286262.187500         4.270833            hda-dma.c:335   hda-dmac: 4 channel 1 -> enable
    0      2       VOLUME         1156286301.666667        39.479168             volume.c:429   trg
    0      2          DAI         1156286305.260417         3.593750                dai.c:551   dai_comp_trigger(), command = 1
    0      2          DAI         1156286308.541667         3.281250                dai.c:559   dai_comp_trigger(), START
    0      2          DMA         1156286312.500000         3.958333             dw-dma.c:418   dw-dma: 0 channel 1 start
    0      2          DMA         1156286317.708333         5.208333             dw-dma.c:1217  dw-dma-int-register
    0      2          SSP         1156286327.343750         9.635417                ssp.c:824   tri
    0      2          SSP         1156286331.718750         4.375000                ssp.c:764   sta
    0      2          IPC         1156286338.593750         6.875000            apl-ipc.c:148   ipc: msg rx -> 0x60040000, cmd done
    0      2          IPC         1156286449.791667       111.197914            apl-ipc.c:173   ipc: msg tx -> 0x90020000
    0      2          IPC         1156286558.229167       108.437500            apl-ipc.c:70    ipc: irq dipct 0x0 dipcie 0x40000000 dipcctl 0x3
    0      2          IPC         1156786467.291667    499909.062500            apl-ipc.c:173   ipc: msg tx -> 0x90020000
    0      2          IPC         1156786847.812500       380.520844            apl-ipc.c:70    ipc: irq dipct 0x0 dipcie 0x40000000 dipcctl 0x3
    0      2          IPC         1157286467.239583    499619.437500            apl-ipc.c:173   ipc: msg tx -> 0x90020000
    0      2          IPC         1157286844.739583       377.500000            apl-ipc.c:70    ipc: irq dipct 0x0 dipcie 0x40000000 dipcctl 0x3
    0      2          IPC         1157786467.187500    499622.437500            apl-ipc.c:173   ipc: msg tx -> 0x90020000
    0      2          IPC         1157786842.500000       375.312500            apl-ipc.c:70    ipc: irq dipct 0x0 dipcie 0x40000000 dipcctl 0x3
    0      2          IPC         1158286467.187500    499624.687500            apl-ipc.c:173   ipc: msg tx -> 0x90020000
    0      2          IPC         1158286841.197917       374.010406            apl-ipc.c:70    ipc: irq dipct 0x0 dipcie 0x40000000 dipcctl 0x3
    0      2          IPC         1158786467.187500    499626.000000            apl-ipc.c:173   ipc: msg tx -> 0x90020000
    0      2          IPC         1158786847.291667       380.104156            apl-ipc.c:70    ipc: irq dipct 0x0 dipcie 0x40000000 dipcctl 0x3
    0      2          IPC         1159286467.187500    499619.906250            apl-ipc.c:173   ipc: msg tx -> 0x90020000
    0      2          IPC         1159286840.833333       373.645844            apl-ipc.c:70    ipc: irq dipct 0x0 dipcie 0x40000000 dipcctl 0x3
    0      2          IPC         1159786467.291667    499626.468750            apl-ipc.c:173   ipc: msg tx -> 0x90020000
    0      2          IPC         1159786845.208333       377.916656            apl-ipc.c:70    ipc: irq dipct 0x0 dipcie 0x40000000 dipcctl 0x3
    0      2          IPC         1160286467.239583    499622.031250            apl-ipc.c:173   ipc: msg tx -> 0x90020000
    0      2          IPC         1160286845.364583       378.125000            apl-ipc.c:70    ipc: irq dipct 0x0 dipcie 0x40000000 dipcctl 0x3
@mengdonglin mengdonglin added the bug Something isn't working as expected label Nov 28, 2018
@mengdonglin
Copy link
Collaborator

@mengdonglin mengdonglin added the APL Applies to Apollolake platform label Nov 28, 2018
@keyonjie
Copy link
Contributor Author

@mengdonglin this Xrun is initiated by FW so it's different with that injected from host.

@lgirdwood
Copy link
Member

@keyonjie what pipeline are you using ? Is it all HDA endpoints, or does the same thing happend on non HDA endpoints like BYT. @mengdonglin can you confirm for BYT to rule out generic code and help point to HDA logic.

@keyonjie
Copy link
Contributor Author

@lgirdwood both playback and capture are I2S/SSP endpoints.

@lgirdwood
Copy link
Member

@keyonjie ok, looks like HDA host GW DMA becomes out of sync with DW DMA.

@slawblauciak
Copy link
Collaborator

slawblauciak commented Dec 6, 2018

Upon investigation I've found the cause of the issue. (Yorp, HDMI playback). When the firmware encounters an xrun it resets all buffers and restarts DMA. It assumes the hardware DMA has been reset, however in this case, HDA DMA hardware buffer pointers remain untouched, they preserve their status from before, as a result, causing a mismatch between FW's software pointers and hardware pointers.

The driver should reset the hardware DAI HDA DMA read/write pointers on xrun.
Possibly DW DMA as well, as I haven't had the setup for reproducing the problem on that.
@ranj063 may I ask you for assistance on this matter?

@keyonjie
Copy link
Contributor Author

keyonjie commented Dec 6, 2018

@slawblauciak do we have easy way to disable Xrun recover for debugging purpose? e.g. reset or stop the pipeline and waiting for free cmd.

@slawblauciak
Copy link
Collaborator

@keyonjie unfortunately no

@lgirdwood
Copy link
Member

@keyonjie @slawblauciak let me add a container for these flags in ABI update

@keyonjie
Copy link
Contributor Author

keyonjie commented Dec 7, 2018

Upon investigation I've found the cause of the issue. (Yorp, HDMI playback). When the firmware encounters an xrun it resets all buffers and restarts DMA. It assumes the hardware DMA has been reset, however in this case, HDA DMA hardware buffer pointers remain untouched, they preserve their status from before, as a result, causing a mismatch between FW's software pointers and hardware pointers.

The driver should reset the hardware DAI HDA DMA read/write pointers on xrun.
Possibly DW DMA as well, as I haven't had the setup for reproducing the problem on that.
@ranj063 may I ask you for assistance on this matter?

if so, FW should only reset hardware buffer pointers and then sending an IPC to driver and waiting for synchronization before schedule again. after driver handled the Xrun IPC(and acked), FW can try to schedule next copy.

does this make sense?

@mengdonglin
Copy link
Collaborator

@bardliao Could you help on resetting DMA when XRUN happens per firmware team request to debug this issue?

@bardliao
Copy link
Collaborator

@slawblauciak @keyonjie How about we just retrun SNDRV_PCM_POS_XRUN on the .pcm_pointer function and let ALSA do everything for us?

@slawblauciak
Copy link
Collaborator

@bardliao is it going to reset hardware DMA pointers though? If it will, that's fine I suppose.

@bardliao
Copy link
Collaborator

@slawblauciak ALSA basically will stop -> prepare -> start. I think we don't reset hardware DMA during this process, but we can do it. Also, I think we should disable all the XRUN handling on FW, if we decide to let ALSA handle it.

@slawblauciak
Copy link
Collaborator

@bardliao sounds good. Also, I'm sure the driver resets the pointers on either start or stop, so if ALSA calls it externally I'm fairly certain it will happen either way. The issue was that FW was doing exactly that INTERNALLY, stop -> prepare -> start, independently from the driver.

@bardliao
Copy link
Collaborator

@slawblauciak I tried to return SNDRV_PCM_POS_XRUN on kernel side but it doesn't work. Please see #506 The ALSA's XRUN handling will cause FW's XRUN. So it become a infinite loop. So I think we can figure out why ALSA's XRUN handling will cause FW's XRUN first. Also, when I run the xrun_injection test today, sometimes FW will broken after it. That is an serious issue.

@slawblauciak
Copy link
Collaborator

@bardliao so, you're saying alsa's XRUN handling (which is just resetting and starting the stream again, right?) stresses the system enough to cause an underrun on the host -> DSP transfer? It is indeed a problem.

@bardliao
Copy link
Collaborator

@slawblauciak Yes, I think so. We can disable the pipeline XRUN handling first, and look into this issue.

@mengdonglin
Copy link
Collaborator

This is a duplicate of issue #752 reported by Google.

@mengdonglin mengdonglin added the duplicate This issue or pull request already exists label Dec 19, 2018
@wenqingfu
Copy link

fixed by #755? @keyonjie to confirm and close

@keyonjie
Copy link
Contributor Author

yes, it's fixed via #755 , closing it now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
APL Applies to Apollolake platform bug Something isn't working as expected duplicate This issue or pull request already exists
Projects
None yet
Development

No branches or pull requests

8 participants