Long-form audio speaker diarization OOM in clustering #7912

Closed
remenberl opened this issue Nov 20, 2023 · 7 comments
Labels: bug, stale


remenberl commented Nov 20, 2023

Hi,

Thanks for the recent development of long-form audio speaker diarization in #7737. I recently ran it on a 4-hour-long audio file and observed an OOM in system RAM (not VRAM).

Steps/Code to reproduce bug
The issue can be reproduced with this audio file: https://podwise-hh.s3.us-west-1.amazonaws.com/0ef8d4f5beb504fae9f12272a83db030ab29c92f4e033eb86bfefe3d1668c7cf.m4a

My telephonic config file is just the default, with the clustering/msdd part pasted below:

  clustering:
    parameters:
      oracle_num_speakers: False
      max_num_speakers: 8
      enhanced_count_thres: 80
      max_rp_threshold: 0.25
      sparse_search_volume: 30
      maj_vote_spk_count: False 
      chunk_cluster_count: 50
      embeddings_per_chunk: 10000
  msdd_model:
    model_path: diar_msdd_telephonic
    parameters:
      use_speaker_model_from_ckpt: True 
      infer_batch_size: 25
      sigmoid_threshold: [0.7] 
      seq_eval_mode: False
      split_infer: True
      diar_window_length: 50
      overlap_infer_spk_limit: 5

Expected behavior
The job is killed shortly after the last "Extracting embeddings for Diarization" message is printed. At that point memory usage climbs quickly from under 20GB in the earlier steps to nearly 64GB. For reference, here are the last lines printed before the process was killed.

[NeMo I 2023-11-19 20:54:29 clustering_diarizer:343] Extracting embeddings for Diarization
[NeMo I 2023-11-19 20:54:29 collections:445] Filtered duration for loading collection is  0.00 hours.
[NeMo I 2023-11-19 20:54:29 collections:446] Dataset loaded with 52949 items, total duration of  7.25 hours.
[NeMo I 2023-11-19 20:54:29 collections:448] # 52949 files loaded accounting to # 1 labels
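
For a rough sense of scale, the sketch below estimates how much memory a dense N x N affinity matrix would need for the 52949 segments reported in the log, compared with the `embeddings_per_chunk: 10000` setting from the config. This is a back-of-the-envelope calculation only. It assumes a clustering step that materializes a full float32 matrix over all segments at once, which nothing in this thread confirms, and the helper names `dense_matrix_bytes` and `to_gib` are hypothetical.

```python
def dense_matrix_bytes(num_embeddings: int, bytes_per_value: int = 4) -> int:
    """Bytes needed for a dense num_embeddings x num_embeddings matrix.

    bytes_per_value defaults to 4 (float32); this is an assumption,
    not something taken from the NeMo source.
    """
    return num_embeddings * num_embeddings * bytes_per_value


def to_gib(num_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return num_bytes / 2**30


# 52949 comes from the "Dataset loaded with 52949 items" log line,
# 10000 from embeddings_per_chunk in the pasted config.
for label, n in [("all segments at once", 52949), ("one chunk", 10000)]:
    print(f"{label}: {to_gib(dense_matrix_bytes(n)):.2f} GiB")
```

Under these assumptions a single full matrix is on the order of 10 GiB, so a handful of intermediate copies could plausibly explain a jump of tens of GB, while a 10000-embedding chunk stays well under 1 GiB.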

Environment overview

  • Environment location: Ubuntu 22.04 with 64GB RAM, 4090 GPU
  • Method of NeMo install: python -m pip install git+https://github.com/NVIDIA/NeMo.git@main#egg=nemo_toolkit[asr]

Environment details

  • OS version: Ubuntu 22.04
  • PyTorch version: '2.0.1+cu117'
  • Python version: 3.10


remenberl added the bug label Nov 20, 2023
nithinraok (Collaborator) commented

Thanks for raising this issue. Could you please attach the full log here?

remenberl (Author) commented Nov 21, 2023

Attached the log from running NeuralDiarizer.

nemo.log

erikqu commented Dec 5, 2023

I'm also having this issue: the run hangs at this step, with settings similar to the OP's. Versions: NeMo 1.21.0, Python 3.10.

[NeMo I 2023-12-05 18:46:27 collections:302] Dataset loaded with 43 items, total duration of  0.60 hours.
[NeMo I 2023-12-05 18:46:27 collections:304] # 43 files loaded accounting to # 1 labels
vad: 100%|██████████| 43/43 [00:07<00:00,  6.06it/s]
[NeMo I 2023-12-05 18:46:34 clustering_diarizer:250] Generating predictions with overlapping input segments

tango4j (Collaborator) commented Dec 6, 2023

@remenberl
Thank you for uploading the samples. I will test it and get back to you.
Judging from the log you shared, I suspect 64GB of RAM is not enough to diarize 4 hours of audio offline.
I will confirm the RAM requirement for this sample after running it myself.
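
One way to put a number on the RAM requirement is to read the peak resident set size of the Python process after the run. The sketch below uses only the standard library `resource` module (Unix only). The helper name `peak_rss_gib` is hypothetical, and where exactly to call it in a diarization script is left as an assumption.

```python
import resource
import sys


def peak_rss_gib() -> float:
    """Peak resident set size of the current process, in GiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    bytes_per_unit = 1 if sys.platform == "darwin" else 1024
    return peak * bytes_per_unit / 2**30


# Assumption: the diarization call would run here, before the measurement.
print(f"peak RSS so far: {peak_rss_gib():.2f} GiB")
```

Note that `RUSAGE_SELF` covers only the current process, so memory held by separate worker processes would not show up in this figure.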

erikqu commented Dec 13, 2023

This worked for me after building from source and setting num_workers to 0.

github-actions bot (Contributor) commented

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions bot added the stale label Jan 13, 2024

github-actions bot (Contributor) commented

This issue was closed because it has been inactive for 7 days since being marked as stale.

github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) Jan 21, 2024