fap transcribe fails on Windows when the models FunASR depends on have not been downloaded locally #26

Nick-bit233 · 2024-07-13T08:35:46Z

As the title says: on Windows I want to use FunASR to transcribe and label audio. I installed funasr and modelscope with pip, but running the fap transcribe command fails:

(FishAudio) PS D:\DLWorkshop\audio-preprocess> fap transcribe .\all_data\sliced\ --model-type funasr --recursive
2024-07-13 16:23:40.875 | INFO     | fish_audio_preprocess.cli.transcribe:transcribe:72 - Using paraformer-zh model for funasr as default
2024-07-13 16:23:40.892 | INFO     | fish_audio_preprocess.cli.transcribe:transcribe:80 - Using 2 workers for processing
2024-07-13 16:23:40.892 | INFO     | fish_audio_preprocess.cli.transcribe:transcribe:81 - Transcribing audio files in .\all_data\sliced\
2024-07-13 16:23:46.168 | INFO     | fish_audio_preprocess.utils.transcribe:batch_transcribe:40 - Loading paraformer-zh model for zh transcription
2024-07-13 16:23:46.168 | INFO     | fish_audio_preprocess.utils.transcribe:batch_transcribe:40 - Loading paraformer-zh model for zh transcription
You are using the latest version of funasr-1.1.0
You are using the latest version of funasr-1.1.0
2024-07-13 16:23:46,993 - modelscope - WARNING - Using the master branch is fragile, please use it with caution!
2024-07-13 16:23:46,993 - modelscope - INFO - Use user-specified model revision: master
2024-07-13 16:23:47,045 - modelscope - WARNING - Using the master branch is fragile, please use it with caution!
2024-07-13 16:23:47,045 - modelscope - INFO - Use user-specified model revision: master
2024-07-13 16:23:52,031 - modelscope - WARNING - Using the master branch is fragile, please use it with caution!
2024-07-13 16:23:52,031 - modelscope - INFO - Use user-specified model revision: master
2024-07-13 16:23:52,199 - modelscope - WARNING - Using the master branch is fragile, please use it with caution!
2024-07-13 16:23:52,199 - modelscope - INFO - Use user-specified model revision: master
Downloading: 100%|█████████████████████| 10.6k/10.6k [00:00<00:00, 29.9kB/s]
Download: iic/speech_fsmn_vad_zh-cn-16k-common-pytorch failed!: [WinError 32] 另一个程序正在使用此文件，进程无法访问。: 'C:\\****\\._____temp\\iic\\speech_fsmn_vad_zh-cn-16k-common-pytorch\\README.md'
Downloading: 100%|█████████████████████| 10.6k/10.6k [00:00<00:00, 30.3kB/s]
2024-07-13 16:23:52,760 - modelscope - ERROR - File C****\._____temp\iic\speech_fsmn_vad_zh-cn-16k-common-pytorch\README.md integrity check failed, expected sha256 signature is 991885cf850e1629de7ce0624d83916e45791cba8049f0e4899477e7837f4f5e, actual is 
concurrent.futures.process._RemoteTraceback: 
"""
Traceback (most recent call last):
  File "C:\Users\****\.conda\envs\FishAudio\lib\concurrent\futures\process.py", line 246, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "D:\DLWorkshop\audio-preprocess\fish_audio_preprocess\utils\transcribe.py", line 41, in batch_transcribe
    model = AutoModel(
  File "C:\Users\****\.conda\envs\FishAudio\lib\site-packages\funasr\auto\auto_model.py", line 134, in __init__
    vad_model, vad_kwargs = self.build_model(**vad_kwargs)
  File "C:\Users\****\.conda\envs\FishAudio\lib\site-packages\funasr\auto\auto_model.py", line 218, in build_model
    assert model_class is not None, f'{kwargs["model"]} is not registered'
AssertionError: fsmn-vad is not registered

After a quick investigation, I found that fap transcribe starts 2 workers by default:

# cli\transcribe.py line 19
@click.option(
    "--num-workers",
    help="Number of workers to use for processing, defaults to 2",
    default=2,
    show_default=True,
    type=int,
)

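As a result, if the models have not already been downloaded locally, both FunASR worker processes try to download the model files into the same modelscope ._____temp folder at the same time, and Windows raises [WinError 32] (the file is in use by another process and cannot be accessed). The VAD model download then fails its integrity check, which is presumably why the run ends with "AssertionError: fsmn-vad is not registered".
With the --num-workers 1 option added, the problem no longer occurs.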

(FishAudio) PS D:\DLWorkshop\audio-preprocess> fap transcribe .\all_data\sliced\ --model-type funasr --recursive --num-workers 1
2024-07-13 16:25:13.418 | INFO     | fish_audio_preprocess.cli.transcribe:transcribe:72 - Using paraformer-zh model for funasr as default
2024-07-13 16:25:13.433 | INFO     | fish_audio_preprocess.cli.transcribe:transcribe:80 - Using 1 workers for processing
2024-07-13 16:25:13.433 | INFO     | fish_audio_preprocess.cli.transcribe:transcribe:81 - Transcribing audio files in .\all_data\sliced\
2024-07-13 16:25:17.453 | INFO     | fish_audio_preprocess.utils.transcribe:batch_transcribe:40 - Loading paraformer-zh model for zh transcription
You are using the latest version of funasr-1.1.0
2024-07-13 16:25:18,440 - modelscope - WARNING - Using the master branch is fragile, please use it with caution!
2024-07-13 16:25:18,440 - modelscope - INFO - Use user-specified model revision: master
2024-07-13 16:25:22,084 - modelscope - WARNING - Using the master branch is fragile, please use it with caution!
2024-07-13 16:25:22,084 - modelscope - INFO - Use user-specified model revision: master
Downloading: 100%|████████████████| 10.6k/10.6k [00:00<00:00, 29.2kB/s]
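
If keeping two workers matters, one possible alternative to --num-workers 1 is to populate the model cache once in the parent process before any worker starts. The sketch below only illustrates that ordering and is not the project's actual code: run_after_warmup, warm_up, worker, and chunks are hypothetical names, and warm_up stands for anything that triggers the download once (for example, constructing the FunASR AutoModel a single time and discarding it).

```python
from concurrent.futures import ProcessPoolExecutor


def run_after_warmup(warm_up, worker, chunks, num_workers=2,
                     executor_cls=ProcessPoolExecutor):
    """Run warm_up once in the parent, then fan the chunks out to workers.

    warm_up      -- hypothetical callable that downloads/caches the models
    worker       -- callable applied to each chunk (must be picklable when
                    used with ProcessPoolExecutor, i.e. a top-level function)
    chunks       -- iterable of work items, one per task
    executor_cls -- injectable so the ordering can be tried with threads
    """
    # Only one process touches the download directory at this point,
    # so there is nobody to collide with.
    warm_up()

    # Workers start after the cache is populated and should find the
    # files locally instead of downloading them again.
    with executor_cls(max_workers=num_workers) as pool:
        return list(pool.map(worker, chunks))
```

The cost of this approach is that the parent process loads the model once just to fill the cache. It also assumes that a second load from a populated cache does not rewrite the same temporary files.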

Suggestion: change the default --num-workers of fap transcribe to 1, or add code that resolves the download conflict between workers.
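
For the second option, one way to resolve the conflict might be to serialize the model load across workers with a lock file, so that the first worker downloads while the others wait and then load from the cache. This is a minimal sketch using only the standard library. The names download_lock and load_model_serialized are hypothetical, build_model is whatever callable constructs the model in the worker, and the lock path is an assumption the caller has to choose.

```python
import os
import time
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def download_lock(lock_path, poll_interval=0.2, timeout=600.0):
    """Cross-process lock based on exclusive creation of a lock file."""
    lock_path = Path(lock_path)
    lock_path.parent.mkdir(parents=True, exist_ok=True)
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so only
            # one process at a time can get past this call.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() > deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll_interval)
    try:
        yield
    finally:
        # Close before removing: Windows refuses to delete an open file.
        os.close(fd)
        os.remove(lock_path)


def load_model_serialized(build_model, lock_path):
    """Call build_model() while holding the lock and return its result."""
    with download_lock(lock_path):
        return build_model()
```

A known weakness of this sketch is that a worker killed while holding the lock leaves the lock file behind, and it must then be deleted by hand. The timeout only keeps the other workers from waiting forever.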

sk0777 mentioned this issue Sep 15, 2024

https://github.com/fishaudio/audio-preprocess/issues/26#issue-2406800780 #29
