-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NeMo + Lhotse integration #7880
Conversation
jenkins |
ff22273
to
5f11fdb
Compare
This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days. |
from nemo.collections.asr.data.audio_to_text_lhotse import LhotseSpeechToTextBpeDataset | ||
from nemo.collections.common.data.lhotse import get_lhotse_dataloader_from_config | ||
|
||
return get_lhotse_dataloader_from_config( |
Check notice
Code scanning / CodeQL
Commented-out code Note
0f9ed2f
to
36a7bda
Compare
36a7bda
to
0f9ed2f
Compare
355f5dc
to
abe3fcb
Compare
abe3fcb
to
1f23eee
Compare
Signed-off-by: Piotr Żelasko <[email protected]>
3394c67
to
252fd43
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall it looks quite good to me. There are minor changes required
- Change all copyright header in your PR to 2024
- Move all hard dependency imports to the top of the file and not inside functions or classes
- Move all soft dependency imports under try catch pattern, see numba and apex for reference in NeMo
Final review @VahidooX - from my side this is ok to merge with above fixes
@@ -82,13 +82,15 @@ RUN INSTALL_MSG=$(/bin/bash /tmp/torchaudio_build/scripts/installers/install_tor | |||
|
|||
# install nemo dependencies | |||
WORKDIR /tmp/nemo | |||
ENV LHOTSE_REQUIRE_TORCHAUDIO=0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it prevents Lhotse from pulling/checking torchaudio.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the user need to set it when not using the dockers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only if they can't have torchaudio installed for some reason. IIUC the issue with torchaudio mostly stems from upstream docker images not having it pre-built and the need to build from source when building NeMo containers (which may fail for a number of reasons). But outside these docker images it's straightforward to pip/conda install torchaudio alongside torch. WDYT?
Dockerfile
Outdated
COPY requirements . | ||
RUN for f in $(ls requirements*.txt); do pip3 install --disable-pip-version-check --no-cache-dir -r $f; done | ||
|
||
# install flash attention | ||
RUN pip install flash-attn | ||
# install numba for latest containers | ||
RUN pip install numba>=0.57.1 | ||
RUN pip install pyloudnorm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove from here, put in requirements for asr.txt
docs/source/asr/datasets.rst
Outdated
model.train_ds.shuffle=true # optional | ||
|
||
# Lhotse dataloading related arguments | ||
+model.train_ds.use_lhotse=True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use ++ to be safer (in case key already exists in config)
} | ||
|
||
def __init__(self, tokenizer): | ||
from lhotse.dataset import AudioSamples |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put these imports in try catch at the top, after the nemo imports. See other examples of import guards.
IMO its better to know what imports are there in a file at the top rather than hunt down imports in file.
""" | ||
|
||
def __init__(self, tokenizer): | ||
from nemo.collections.common.tokenizers.aggregate_tokenizer import AggregateTokenizer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these imported here? They arent problematic.
@@ -90,6 +90,17 @@ def __init__(self, cfg: DictConfig, trainer=None): | |||
) | |||
|
|||
def _setup_dataloader_from_config(self, config: Optional[Dict]): | |||
if config.get("use_lhotse"): | |||
from nemo.collections.asr.data.audio_to_text_lhotse import LhotseSpeechToTextBpeDataset |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put these imports at the top of the file. If they are guarded properly there's no issue in importing the file.
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
from .cutset import read_cutset_from_config |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No relative imports - please use absolute imports throughout NeMo codebase
""" | ||
logging.info("We will be using a Lhotse DataLoader.") | ||
|
||
from lhotse import CutSet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Import at top with try catch, check inside function if HAVE_LHOTSE
|
||
try: | ||
from lhotse.lazy import ImitatesDict | ||
except ImportError: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
catch both ImportError and ModuleNotFoundError (some python versions raise the latter)
Signed-off-by: Piotr Żelasko <[email protected]>
e3507bc
to
0c31f27
Compare
@titu1994 all suggestions implemented |
Signed-off-by: Piotr Żelasko <[email protected]>
…sko/nemo into feature/lhotse-integration
Signed-off-by: Piotr Żelasko <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks great, just need @VahidooX's final signoff
Need to pass jenkins though |
jenkins |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lgtm generally, just some minor comments!
@@ -82,13 +82,15 @@ RUN INSTALL_MSG=$(/bin/bash /tmp/torchaudio_build/scripts/installers/install_tor | |||
|
|||
# install nemo dependencies | |||
WORKDIR /tmp/nemo | |||
ENV LHOTSE_REQUIRE_TORCHAUDIO=0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the user need to set it when not using the dockers?
return self._tokenizer(text) | ||
|
||
|
||
def _identity(x): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where do you use this function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, not needed anymore. I removed it.
Signed-off-by: Piotr Żelasko <[email protected]>
jenkins |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lgtm, thanks for the awesome work !
Thanks guys! Such a joy to click merge after all this work 😀 |
* Lhotse integration squashed PR Signed-off-by: Piotr Żelasko <[email protected]> * Code review - Som Signed-off-by: Piotr Żelasko <[email protected]> * Update copyright headers to 2024 Signed-off-by: Piotr Żelasko <[email protected]> * Fix NLP imports Signed-off-by: Piotr Żelasko <[email protected]> * Code review - Vahid Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]>
* Lhotse integration squashed PR Signed-off-by: Piotr Żelasko <[email protected]> * Code review - Som Signed-off-by: Piotr Żelasko <[email protected]> * Update copyright headers to 2024 Signed-off-by: Piotr Żelasko <[email protected]> * Fix NLP imports Signed-off-by: Piotr Żelasko <[email protected]> * Code review - Vahid Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: Sasha Meister <[email protected]>
* Lhotse integration squashed PR Signed-off-by: Piotr Żelasko <[email protected]> * Code review - Som Signed-off-by: Piotr Żelasko <[email protected]> * Update copyright headers to 2024 Signed-off-by: Piotr Żelasko <[email protected]> * Fix NLP imports Signed-off-by: Piotr Żelasko <[email protected]> * Code review - Vahid Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: Pablo Garay <[email protected]>
* Lhotse integration squashed PR Signed-off-by: Piotr Żelasko <[email protected]> * Code review - Som Signed-off-by: Piotr Żelasko <[email protected]> * Update copyright headers to 2024 Signed-off-by: Piotr Żelasko <[email protected]> * Fix NLP imports Signed-off-by: Piotr Żelasko <[email protected]> * Code review - Vahid Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]>
What does this PR do ?
This PR adds an option to leverage Lhotse for dataloading in NeMo.
Collection: Currently ASR only, with clear path to extend into other collections.
Changelog
Usage
model.{train,validation,test}_ds
sections can be extended with additional fieldsuse_lhotse: True/False
andlhotse: {...}
to enable this.Before your PR is "Ready for review"
Pre checks:
PR Type:
If you haven't finished some of the above items you can still open "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.
Additional Information