fap transcribe fails on Windows when the models FunASR depends on have not been downloaded locally #26

Open
Nick-bit233 opened this issue Jul 13, 2024 · 0 comments


As the title says: on Windows I wanted to use FunASR for labeling, so I installed funasr and modelscope with pip, but running the fap transcribe command fails:

(FishAudio) PS D:\DLWorkshop\audio-preprocess> fap transcribe .\all_data\sliced\ --model-type funasr --recursive
2024-07-13 16:23:40.875 | INFO     | fish_audio_preprocess.cli.transcribe:transcribe:72 - Using paraformer-zh model for funasr as default
2024-07-13 16:23:40.892 | INFO     | fish_audio_preprocess.cli.transcribe:transcribe:80 - Using 2 workers for processing
2024-07-13 16:23:40.892 | INFO     | fish_audio_preprocess.cli.transcribe:transcribe:81 - Transcribing audio files in .\all_data\sliced\
2024-07-13 16:23:46.168 | INFO     | fish_audio_preprocess.utils.transcribe:batch_transcribe:40 - Loading paraformer-zh model for zh transcription
2024-07-13 16:23:46.168 | INFO     | fish_audio_preprocess.utils.transcribe:batch_transcribe:40 - Loading paraformer-zh model for zh transcription
You are using the latest version of funasr-1.1.0
You are using the latest version of funasr-1.1.0
2024-07-13 16:23:46,993 - modelscope - WARNING - Using the master branch is fragile, please use it with caution!
2024-07-13 16:23:46,993 - modelscope - INFO - Use user-specified model revision: master
2024-07-13 16:23:47,045 - modelscope - WARNING - Using the master branch is fragile, please use it with caution!
2024-07-13 16:23:47,045 - modelscope - INFO - Use user-specified model revision: master
2024-07-13 16:23:52,031 - modelscope - WARNING - Using the master branch is fragile, please use it with caution!
2024-07-13 16:23:52,031 - modelscope - INFO - Use user-specified model revision: master
2024-07-13 16:23:52,199 - modelscope - WARNING - Using the master branch is fragile, please use it with caution!
2024-07-13 16:23:52,199 - modelscope - INFO - Use user-specified model revision: master
Downloading: 100%|█████████████████████| 10.6k/10.6k [00:00<00:00, 29.9kB/s]
Download: iic/speech_fsmn_vad_zh-cn-16k-common-pytorch failed!: [WinError 32] 另一个程序正在使用此文件,进程无法访问。: 'C:\\****\\._____temp\\iic\\speech_fsmn_vad_zh-cn-16k-common-pytorch\\README.md'
Downloading: 100%|█████████████████████| 10.6k/10.6k [00:00<00:00, 30.3kB/s]
2024-07-13 16:23:52,760 - modelscope - ERROR - File C****\._____temp\iic\speech_fsmn_vad_zh-cn-16k-common-pytorch\README.md integrity check failed, expected sha256 signature is 991885cf850e1629de7ce0624d83916e45791cba8049f0e4899477e7837f4f5e, actual is 
concurrent.futures.process._RemoteTraceback: 
"""
Traceback (most recent call last):
  File "C:\Users\****\.conda\envs\FishAudio\lib\concurrent\futures\process.py", line 246, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "D:\DLWorkshop\audio-preprocess\fish_audio_preprocess\utils\transcribe.py", line 41, in batch_transcribe
    model = AutoModel(
  File "C:\Users\****\.conda\envs\FishAudio\lib\site-packages\funasr\auto\auto_model.py", line 134, in __init__
    vad_model, vad_kwargs = self.build_model(**vad_kwargs)
  File "C:\Users\****\.conda\envs\FishAudio\lib\site-packages\funasr\auto\auto_model.py", line 218, in build_model
    assert model_class is not None, f'{kwargs["model"]} is not registered'
AssertionError: fsmn-vad is not registered

After a quick investigation, I found that fap transcribe starts 2 workers by default:

# cli\transcribe.py line 19
@click.option(
    "--num-workers",
    help="Number of workers to use for processing, defaults to 2",
    default=2,
    show_default=True,
    type=int,
)

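As a result, if the model has not already been downloaded locally, the FunASR worker processes try to download the model files into the same modelscope ._____temp folder at the same time. On Windows this raises [WinError 32] (the process cannot access the file because it is being used by another process); the download fails and model loading ends with the AssertionError shown above.
With the --num-workers 1 argument added, the problem no longer occurs: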

(FishAudio) PS D:\DLWorkshop\audio-preprocess> fap transcribe .\all_data\sliced\ --model-type funasr --recursive --num-workers 1
2024-07-13 16:25:13.418 | INFO     | fish_audio_preprocess.cli.transcribe:transcribe:72 - Using paraformer-zh model for funasr as default
2024-07-13 16:25:13.433 | INFO     | fish_audio_preprocess.cli.transcribe:transcribe:80 - Using 1 workers for processing
2024-07-13 16:25:13.433 | INFO     | fish_audio_preprocess.cli.transcribe:transcribe:81 - Transcribing audio files in .\all_data\sliced\
2024-07-13 16:25:17.453 | INFO     | fish_audio_preprocess.utils.transcribe:batch_transcribe:40 - Loading paraformer-zh model for zh transcription
You are using the latest version of funasr-1.1.0
2024-07-13 16:25:18,440 - modelscope - WARNING - Using the master branch is fragile, please use it with caution!
2024-07-13 16:25:18,440 - modelscope - INFO - Use user-specified model revision: master
2024-07-13 16:25:22,084 - modelscope - WARNING - Using the master branch is fragile, please use it with caution!
2024-07-13 16:25:22,084 - modelscope - INFO - Use user-specified model revision: master
Downloading: 100%|████████████████| 10.6k/10.6k [00:00<00:00, 29.2kB/s] 

Suggestion: change the default --num-workers of fap transcribe to 1, or add code that resolves the download conflict between workers.
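
One possible way to resolve the conflict in code is to let only one worker load (and therefore download) the model at a time. The following is a minimal sketch of that idea using only the standard library, not the project's actual fix. The names exclusive_lock and load_model_serialized and the lock file path are hypothetical and do not exist in fish_audio_preprocess, funasr, or modelscope. load_model stands for a zero-argument callable that would wrap the AutoModel(...) call in batch_transcribe.

```python
import os
import time
from contextlib import contextmanager


@contextmanager
def exclusive_lock(lock_path, poll_interval=0.2, timeout=600.0):
    """Hold a simple cross-process lock backed by a lock file (hypothetical helper).

    os.O_CREAT | os.O_EXCL makes the open call fail if the file already
    exists, so only one process at a time can create it.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            # Another worker holds the lock: wait and retry until the timeout.
            if time.monotonic() > deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll_interval)
    try:
        yield
    finally:
        # Close before removing; Windows cannot delete a file that is still open.
        os.close(fd)
        os.remove(lock_path)


def load_model_serialized(load_model, lock_path):
    """Run load_model() while holding the lock (hypothetical helper).

    The first worker to take the lock performs the download. Later workers
    wait, then run load_model() themselves, by which point the files are
    assumed to be in the local cache already.
    """
    with exclusive_lock(lock_path):
        return load_model()
```

In a worker this might be called as load_model_serialized(lambda: AutoModel(...), lock_path), with lock_path pointing at a location shared by all workers. The trade-offs are that model loading is serialized across workers, not only the download, and that a worker that crashes while holding the lock leaves a stale lock file behind. Loading the model once in the parent process before the workers start would be a simpler alternative, assuming the workers then find the cached files.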
