Meeting Notes

Kelly Davis edited this page Aug 10, 2020 · 767 revisions

DeepSpeech Weekly

Agenda (10/08/2020)

  • Discussion

    • STT 0.8.1?
    • STT 1.0.0
  • Review of on-going work

    • Abhishek
      • Research on handling the markups
    • Eren
      • Discourse and GitHub issues
      • Glow TTS experiments
      • Merging multi-speaker model
    • Alex
      • CI cleanup
      • Dealing with TensorFlow bug
      • Renaming!!!
    • Reuben (PTO)
    • Tilman
      • Evaluation of newest augmentation runs
      • LibriVox data set
      • Documentation!
    • Kelly
      • Mozilla Machine Learning WebSite
      • Bergamot
        • Preparing finances for the auditor
          • Following up on financial consultant's requested changes
          • Following up on Mozilla EU finance chief's requested changes
          • Preparing time-sheets for myself for the financial consultant, Mozilla EU finance chief, auditor, and EU
          • Preparing receipts for all Bergamot purchases for the last 18 months for the financial consultant, Mozilla EU finance chief, auditor, and EU
        • Getting legal/finance to (finally) pay the graphic designer
      • Te Hiku Media
        • Writing the SoW for the contract legal setup to accept payment for STT/TTS work
      • OpenWRT
        • Following up with legal and BizDev on work with OpenWRT
        • Meeting with BizDev and OpenWRT tomorrow
    • Rosana
      • Mozilla Voice WebSite development
      • Product research

Agenda (03/08/2020)

  • Discussion

    • STT 0.8.0!
    • STT 1.0.0
  • Review of on-going work

    • Abhishek
      • Research on handling the markups
    • Eren
      • Discourse and GitHub issues
      • Glow TTS experiments
      • Documentation + Posts for 0.1.0
    • Alex
      • CI cleanup
      • Dealing with TensorFlow bug
    • Reuben
      • Test of renaming
      • Updating microphone streaming example
    • Tilman
      • Evaluation of newest augmentation runs
      • Bug fixes in Snakepit client
      • Documentation!
    • Kelly
      • Mozilla Machine Learning WebSite
      • W3C Workshop on Web and Machine Learning
        • Slides for Virtual W3C Workshop on Web and Machine Learning lecture
        • Recording video of lecture for the W3C Workshop on Web and Machine Learning
      • Bergamot
        • Preparing finances for the auditor
          • Getting our financial consultant to review the finances
          • Getting the Mozilla EU finance chief to review the finances
          • Following up on the inevitable changes requested
        • Getting legal/finance to (finally) pay the graphic designer
      • Te Hiku Media
        • Perfecting the contract legal setup to accept payment for STT/TTS work
      • OpenWRT
        • Following up with legal and BizDev on work with OpenWRT
        • Setting up meeting with BizDev and OpenWRT
    • Rosana
      • Mozilla Voice WebSite development
      • Product research

Agenda (27/07/2020)

  • Discussion

  • Review of on-going work

    • Abhishek
      • Initial Integration of Outbound Translation
    • Eren
      • Posts for the TTS 0.1.0 release
      • Normalizing flow
      • Documentation
    • Alex
      • msys2 fix
      • Started training on 1040h of French
      • 0.8.0-alpha.7 merging
    • Reuben
      • Android/TFLite tutorial
      • Release announcement post
    • Tilman (PTO)
      • ...
    • Kelly
      • Foundation for the National Institutes of Health (FNIH)
        • Trying to set up a new meeting with FNIH
      • OpenWRT
        • Following up with legal and BizDev on work with OpenWRT
      • Bergamot
        • Milestone for the end of this month
      • SIFIS*HOME H2020 Grant
        • Continuing the process of getting over the legal hurdles of the grant
    • Rosana
      • Mozilla Voice WebSite development
      • Product research

Agenda (20/07/2020)

  • Discussion

    • STT 0.8.0!
      • iOS CI!
      • Remaining iOS issues
      • .NET Can not import DeepSpeech NuGet package
  • Review of on-going work

    • Abhishek
      • Tested Bergamot Rest server integration with Firefox on Mac machine
      • Global Data Privacy training
      • Initial investigation of word alignment information returned by Bergamot rest-server
    • Eren
      • Implementing parallel wavegan
      • Trained a parallel wavegan model (High quality but a bit slow)
      • Refactoring TTS repo to be more Python-like
    • Alex
      • Butter Fuss!
      • Review iOS PR
      • cuDNN mystery
      • Finished Docker work!
    • Reuben
      • iOS CI!
      • Writing Android app + DS TFLite + microphone streaming tutorial for 1.0
    • Tilman
      • Training 1.0.0 model
      • Evaluation 1.0.0 trainings
      • Common Voice imports
    • Kelly
      • Reworking for the ∞th time developer OKRs with David on his request
      • Foundation for the National Institutes of Health (FNIH)
        • Reviewing proposal by FNIH to join a grant working on collecting/opening dysarthria speech
        • Trying to set up a new meeting with FNIH
      • LANGEQ-2020 Proposal
        • Talked with Bangor about a LANGEQ-2020 coalition
        • Started the process of applying for a LANGEQ-2020 grant to fund CV + DS work
        • Started on writing the proposal
      • OpenWRT
        • Met with OpenWRT to discuss Deep Speech integration
        • Following up with legal and BizDev on work with OpenWRT
        • Trying to read the tea leaves of Mozilla history with OpenWRT
      • Bergamot
        • Packaging server + model(s) as macOS application!
        • Creating demo for Bergamot 18th Month Review
        • Creating Demo and Integration Presentation for Bergamot 18th Month Review
        • Creating Dissemination and Exploitation Presentation for Bergamot 18th Month Review
        • Writing Bergamot Periodic Report for Bergamot 18th Month Review
      • SIFIS*HOME H2020 Grant
        • Continuing the process of getting over the legal hurdles of the grant
    • Rosana (PTO)
      • Mozilla Voice WebSite development
      • Product research

Agenda (13/07/2020)

  • Discussion

    • STT 0.8.0!
      • iOS CI
      • Docker support, building and publishing
  • Review of on-going work

    • Abhishek
      • Make machine translation from Firefox work with the quality estimation code
    • Eren
      • Implementing parallel wavegan
      • Training a (parallel wavegan) model
      • Working on tflite export of models
      • Working with contributors on multi-voice model
    • Alex (Can't Make Meeting)
      • Docker support, building and publishing
    • Reuben
      • iOS CI
      • 0.8.0 documentation
    • Tilman (PTO)
      • Training 1.0.0 model
    • Kelly
      • Reworking for the ∞th time developer OKRs with David on his request
      • Foundation for the National Institutes of Health (FNIH)
        • Met with FNIH to discuss inclusion of dysarthria speech into Common Voice
        • Reviewing proposal by FNIH to join a grant working on collecting/opening dysarthria speech
      • LANGEQ-2020 Proposal
        • Talked with Bangor about a LANGEQ-2020 coalition
        • Started the process of applying for a LANGEQ-2020 grant to fund CV + DS work
      • OpenWRT
        • Met with OpenWRT to discuss Deep Speech integration
        • Following up with legal and BizDev on work with OpenWRT
        • Trying to read the tea leaves of Mozilla history with OpenWRT
        • Setting up a meeting with Kathy on Mozilla's history with OpenWRT
        • Setting up a meeting with David on Mozilla's history with OpenWRT
      • Bergamot
        • Packaging server + model(s) as macOS application
        • Creating 8-bit models from the pre-trained student models
        • Updating the website with pointers to released NMT models
      • SIFIS*HOME H2020 Grant
        • Continuing the process of getting over the legal hurdles of the grant
    • Rosana
      • FxR UX for model integration done
      • Mozilla Voice WebSite development
      • Product research

Agenda (06/07/2020)

  • Discussion

    • STT 0.8.0
  • Review of on-going work

    • Abhishek
      • Make machine translation from Firefox work with the quality estimation code
    • Eren (PTO)
    • Alex
      • KaiOS xpcshell work
      • Support on Discourse
      • Initial delegate support
      • Docker support, building and publishing
      • Updating deepspeech-server to 0.7.X
      • Prep for Ministry of Finance meeting tomorrow
    • Reuben
      • Training Mandarin model
      • Inference time measurements for UTF-8 on laptop + phone
      • Making new master alpha
      • iOS target
      • 0.8.0 documentation
    • Tilman
      • DSAlign documentation
      • Upgrading DSAlign to DeepSpeech 0.7.X
      • Testing transcribe.py
      • Fixing importers alphabet problem
      • CV imports
    • Kelly
      • Bergamot
        • Packaging server + model(s) as macOS application
        • Starting integration of student teacher scripts
        • Starting integration of 8-bit scripts
        • Creating 8-bit models from the pre-trained student models
        • Updating the website with pointers to released NMT models
      • W3C Workshop on Web and Machine Learning
        • Reviewing workshop proposals
      • NVIDIA + DSAlign's LibriVox data set
        • Syncing with NVIDIA on LibriVox release + press
        • Syncing with legal on license for LibriVox release
      • SIFIS*HOME H2020 Grant
        • Continuing the process of getting over the legal hurdles of the grant
    • Rosana
      • FxR UX for model integration done
      • Mozilla Voice WebSite development
      • Product research

Agenda (29/06/2020)

  • Discussion

    • STT 1.0.0 training
  • Review of on-going work

    • Rosana
      • Working on product opportunities research
      • Working on model backend host
    • Abhishek
      • Reproducing Firefox's machine translation workflow with newer marian-server
      • Make machine translation from Firefox work with the quality estimation code
    • Eren (PTO)
    • Alex
      • TF submodule
      • KaiOS xpcshell work
    • Reuben
      • C++ generate_scorer_package
      • MSYS2 issue
      • iOS shared library signing
      • Training Mandarin model
    • Tilman
      • Restarting trainings
      • DSAlign documentation
      • Upgrading DSAlign to DeepSpeech 0.7.X
    • Kelly
      • Bergamot
        • Addressing review comments on deliverable D6.2 Mozilla Cluster Integration
        • Addressing review comments on deliverable D7.3 First Dissemination Report
        • Starting integration of student teacher scripts
        • Starting integration of 8-bit scripts
        • Creating 8-bit models from the pre-trained student models
      • W3C Workshop on Web and Machine Learning
        • Recruiting speakers for the workshop
      • NVIDIA + DSAlign's LibriVox data set
        • Syncing with NVIDIA on LibriVox release + press
        • Syncing with legal on license for LibriVox release
      • SIFIS*HOME H2020 Grant
        • Getting CA signed
        • Continuing the process of getting over the legal hurdles of the grant

Agenda (22/06/2020)

  • Discussion

    • 1.0.0 timeline
    • STT 1.0.0 training
  • Review of on-going work

    • Abhishek
      • Reproducing Firefox's machine translation workflow with older marian-server
      • Make machine translation from Firefox work with new marian-server
    • Eren
      • MelGAN's training
      • PWGAN implementation
      • FastSpeech implementation
      • Writing 'Double Decoder Consistency' blog post
    • Alex
      • Dockerfile updating
      • KaiOS xpcshell continuation
    • Reuben
      • Build libdeepspeech.so for iOS with TF 2.2
      • Training Mandarin model
    • Tilman
      • Benchmarking augmentation
      • Other minor fixes
    • Kelly
      • Mozilla Voice web site reviews + comms
      • Preparing to give presentation for ET at Mozilla weekly meeting
      • Deep Speech 1.0.0
        • Reviewing for the Nth time the Mozilla Voice website texts
        • Met with Alex over Comms planning
        • Reviewing issues for 1.0.0 project on GitHub
      • Bergamot
        • Wrote Deliverable D7.3 First Dissemination Report
        • Dealing with reviews of Deliverable D6.2 Mozilla Cluster Integration
        • Starting integration of student teacher scripts
        • Starting integration of 8-bit scripts
      • SIFIS*HOME H2020 Grant
        • Getting CA signed
        • Continuing the process of getting over the legal hurdles of the grant

Agenda (15/06/2020)

  • Discussion

    • Rosana
    • Te Hiku Media contracted research
    • STT 0.8.0
    • STT 1.0.0 + TTS 0.1.0 what's to be done
  • Review of on-going work

    • Abhishek
      • Reviewing Marian Seq2Seq framework
      • Bergamot project plan clarifications
      • Reading "Visualizing A Neural Machine Translation Model"
    • Eren
      • Make decoder and attention masking optional
      • Getting Travis happy
      • Multi-GPU training to vocoder module
      • Writing 'Double Decoder Consistency' blog post
    • Alex
      • Training with Mineco dataset
      • Discourse support
      • Dockerfile split
      • TF 2.2 rebase
      • Common Voice interviews
    • Reuben
      • Helping Andre with some training
      • Build libdeepspeech.so for iOS with TF 2.2
      • Training Mandarin model
    • Tilman
      • Implementing time limits for time-stretch augmentation
      • Other minor fixes
    • Kelly
      • Mozilla Voice web site reviews
      • Bergamot
        • Writing Deliverable D7.3 First Dissemination Report
        • Starting integration of student teacher scripts
        • Starting integration of 8-bit scripts
        • Started setting up training pipeline for some more efficient models
        • Handling feedback on Deliverable 6.2 Mozilla Cluster Integration
      • SIFIS*HOME H2020 Grant
        • Getting CA signed
        • Continuing the process of getting over the legal hurdles of the grant

Agenda (08/06/2020)

  • Discussion

  • Review of on-going work

    • Abhishek
      • Read paper: "Attention is all you need"
      • Went through the Bergamot Project plan
      • Reviewing Marian Seq2Seq framework
    • Eren
      • Implementing a Vocoder module for TTS
      • Training models
      • Learning rate scheduling
      • Implementing forward TTS models
      • Writing 'Double Decoder Consistency' blog post
    • Alex
      • TF 2.2 update
      • Learning auditwheel
      • Examining KaiOS status
    • Reuben
      • Adding read-only metrics
      • Making new 0.8 alpha to test PyPI training tests
      • Fixed throwing away of last uneven batch
      • Flip order of VERSION and GRAPH_VERSION symlinks
      • Fixing SWIG wrapper memory leak in decoder package
      • Benchmarking UTF-8 decoding for Alex's grant documentation
    • Tilman
      • Implemented and tested new caching approach
    • Kelly
      • Preparing presentation for Sean on the ML work
      • Bergamot
        • Writing Deliverable 6.2 Mozilla Cluster Integration
        • Writing Deliverable D7.3 First Dissemination Report
        • Reviewing UX/UI work from Edinburgh
        • Starting setup of student teacher training runs
        • Started setup of the 8-bit training runs
        • Determining if the quality estimation (QA) work package has models/code
        • Integrating QA model into Firefox, if QA model/code exists
        • Creating work plan for implementation of UI specifications
      • SIFIS*HOME H2020 Grant
        • Continuing the process of getting over the legal hurdles of the grant

Agenda (25/05/2020)

  • Discussion

    • Naming!
  • Review of on-going work

    • Eren
      • Release TTS v 0.0.2!
      • Fix isinf bug
      • Creating encoder module
      • Write a wiki entry for converting Torch to TF
      • Implementing a Vocoder module for TTS
      • Implement MelGAN
      • Writing 'Double Decoder Consistency' blog post
    • Alex
      • Trying to run emulator outside of B2G build tree
      • Dumping tflite matrices when using NEON / threads
    • Reuben
      • Training a new Mandarin model with latest training code + new validation data
      • Testing Mandarin language models with old and new models
      • 0.8 stuff, docs, TypeScript, PR reviews
    • Tilman
      • Examined augmentation runs
      • Re-factoring augmentation cmd-line handling
      • Looking into where cores are dumped on the cluster
    • Kelly
      • Reading "A Survey of Monte Carlo Tree Search Methods"
      • Preparing presentation for Adam the new COO on the ML work
      • SIFIS*HOME H2020 Grant
        • Working with Ericsson on CA amendments
        • Continuing the process of getting over the legal hurdles of the grant
      • Bergamot
        • Financial planning for the second period
        • Determining if the quality estimation (QA) work package has models/code to estimate quality
        • Integrating QA model into Firefox, if QA model/code exists
        • Review of UI work from work package one
        • Creating work plan for implementation of UI specifications
        • Running several test training runs
        • Started setting up training pipeline for some more efficient models
        • Preparing for the new hire

Agenda (18/05/2020)

  • Discussion

    • Naming
      • When is this due?
      • Who's doing the legal review?
      • Is branding helping with this?
      • Can we intro new packages that point to the old ones?
  • Review of on-going work

    • Eren
      • Checking out ICLR TTS papers
      • Fixing "Attention dies out after 18k iterations quite randomly" issues
      • Automatize TTS
      • Reproduction of MelGAN
    • Alex
      • Making the patches for WiFi on Gonk
    • Reuben
      • Improving error handling for scorer loading
    • Tilman
      • Started augmentation runs
      • Augmentation PR
    • Kelly
      • Bergamot
        • Preparing financial statements for the new FTE's for the EU finance group
        • Determining if the quality estimation (QA) work package has models/code to estimate quality
        • Integrating QA model into Firefox, if QA model/code exists
        • Review of UI work from work package one
        • Creating work plan for implementation of UI specifications
        • Running several test training runs
        • Started setting up training pipeline for some more efficient models
        • Preparing for the new hires
      • SIFIS-HOME H2020 Grant
        • Reviewing Ericsson markup of the CA agreement for IP vs FOSS license issues
        • Organizing next all-partner CA meeting
        • Continuing the process of getting over the legal hurdles of the grant
      • Work on organizing the W3C Web + ML conference
      • Reading "A Survey of Monte Carlo Tree Search Methods" [doing]

Agenda (11/05/2020)

  • Discussion

    • 0.7.1
    • Snakepit and submodules
  • Review of on-going work

    • Eren
      • Checking out ICLR TTS papers
      • Fixing "Attention dies out after 18k iterations quite randomly" issues
      • Automatize TTS
      • Reproduction of MelGAN
    • Alex
      • Nodejs/armv7 breakage
      • TF 2.2 for STT runtime (Allows threading in TFLite)
      • Starting KaiOS Work
    • Reuben
      • Transcribed Mandarin validation data
      • 0.7.1 work
      • Monthly meeting with Bernardo from Iara Health
      • 0.8.0 work (Usage docs, LM docs, NuGet docs, Java docs)
    • Tilman
      • Running tests on augmentation test-training
      • Preparing augmentation PR for review
      • Considering addition of parameter scheduling
    • Kelly
      • SIFIS-HOME H2020 Grant
        • Reviewing all partners' legal markup of the CA agreement
        • Reviewing Mozilla legal's markup of the CA agreement
        • Meeting with Mozilla's legal later today on CA agreement
        • Organizing next CA meeting
        • Continuing the process of getting over the legal hurdles of the grant
      • Bergamot
        • Review of UI work from work package one
        • Creating work plan for implementation of UI specifications
        • Running several test training runs
        • Started setting up training pipeline
        • Working on Windows TC integrations
        • Preparing for the new hires
      • Reading "Bandit based Monte-Carlo Planning" by Kocsis and Szepesvári

Agenda (04/05/2020)

  • Discussion

    • KaiOS
  • Review of on-going work

    • Eren
      • Convert Torch Tacotron2 to TF
      • Fixing "Attention dies out after 18k iterations quite randomly"
      • Automatize TTS
      • Reproduction of MelGAN
      • MelGAN with PWGAN loss from scratch (4m iterations)
      • Bidirectional Decoding with r=7 in backwards decoder
      • Graves attention after fix of "Attention dies out after 18k iterations quite randomly"
    • Alex
      • Nodejs/armv7 breakage
      • TF 2.2 for STT runtime (Allows threading in TFLite)
      • Starting KaiOS Work
    • Reuben
      • Undoing concatenation of Mandarin samples for transcription
      • 0.7.1 release work
        • Fixing bug in 0.7.0 Node.JS binding
        • Addressing docs pitfalls and other 0.7.0 regressions raised on Discourse
        • Adding --candidate_transcripts flag to Python client
        • 0.7.1-alpha.1 release
        • Fixed bug due to poor error handling in DS_EnableExternalScorer
        • Fixing broken package README on NPM
    • Tilman
      • German LibriVox data set
      • NTP server sync of cluster
      • German model building
    • Kelly
      • SIFIS-HOME H2020 Grant
        • Continuing the process of getting over the legal hurdles of the grant
        • Setting up next meeting of partners' legal officers to discuss Coalition Agreement template
      • Bergamot H2020 Grant
        • Running several test training runs
        • Started setting up training pipeline
        • Downloading training data to the server
        • Working on Windows TC integrations
      • Attended meeting on organization of W3C Web & Machine Learning workshop as a virtual conference

Agenda (27/04/2020)

  • Discussion

    • DS 0.7.0 \o/!
    • Naming
    • Sharing LibriVox Data Set on S3
  • Review of on-going work

    • Eren
      • Convert Torch Tacotron2 to TF
      • Automatize TTS
      • Reproduction of MelGAN
      • MelGAN with PWGAN loss from scratch (4m iterations)
      • Finetune MelGAN model with TTS spectrograms
      • Bidirectional Decoding with r=7 in backwards decoder
    • Alex
      • Discourse support
      • qm215 investigation
      • TF 2.2 + TFLite
    • Reuben
      • 0.7 release stuff
      • 0.8.0
        • Remove links to docs in GitHub from main README
        • Branching vs. reducing GitHub visibility
      • Upgrade paths to TF 2.x
    • Tilman
      • Tests for augmentation PR
      • Improved augmentation PR
      • German fine-tuning model (WER of 30% on current CV German test-set)
    • Kelly
      • 0.7.0 Release
      • SIFIS-HOME H2020 Grant
        • Met with Finance + Intelligentsia on Indirect Costs
        • Got approval from EC on transfer of Coordinator
        • Continuing the process of getting over the legal hurdles of the grant
        • Setting up a meeting of partners' legal officers to discuss Coalition Agreement template
      • Bergamot H2020 Grant
        • Running several test training runs
        • Started setting up training pipeline
        • Working on Windows TC integrations

Agenda (20/04/2020)

  • Discussion

    • Sharing LibriVox Data Set on S3?
    • DS 0.7.0?
      • LM parameters
      • Audio duration
      • Add Python 3.7, 3.8 CI coverage
      • Do not use m/mu ABI for Py3.8+
      • M-AILAB importer: Ensure all samples are 16 kHz
      • Add missing external scorer
      • ...
  • Review of on-going work

    • Eren

      • Implemented Tacotron2 with TF
      • Starting the process of passing weights from the PyTorch implementation to the TF implementation
      • TTS overview document
      • TF 2.0 version revisit
      • Convert Torch Tacotron2 to TF *....
    • Alex

      • Discourse support
      • Mineco data
      • Python version dance
      • Working on Kabyle model
    • Reuben

      • Delayed beam expansion to fix timing bug
      • Looking into VAD/KWS decoder extensions (arXiv:1611.09405)
      • Prototyping a tool for quick inspection/correction/tagging of samples in the cluster
    • Tilman

      • Tests regarding potential duplicate samples
      • Noise importing
      • Fix for M-AILAB importer
      • Preparing German fine-tuning
      • Packaging Brazilian Portuguese models
    • Kelly

      • Bergamot
        • Working on TC integrations
      • SIFIS-HOME H2020 Grant
        • Creating C-Level presentation on the grant
        • Giving C-Level presentation on the grant
        • Creating estimates for Roxi of "hidden costs"
        • Met with Intelligentsia to discuss the less than rapid progress of Mozilla's C-Levels
        • Meeting with legal to touch base on the SIFIS agreements
        • Trying to get bank account set up to accept grant's funds
        • Kicking off the process of getting over the legal hurdles of the grant

Agenda (06/04/2020)

  • Discussion

    • DS 0.7.0?
      • Only allow graph/layer initialization at start of training
      • PR #2876 (TypeScript support)
      • Transfer-learning docs
      • Rewrite generate_lm.py to allow usage with other languages
      • Default branch the latest stable?
      • ...
  • Review of on-going work

    • Eren

      • Implemented Tacotron2 with TF
      • Starting the process of passing weights from the PyTorch implementation to the TF implementation
    • Alex

      • TypeScript PR
      • Mentoring Kabyle model
      • Matrix integration with WebThings
      • Intent parsing integrations with WebThings
    • Reuben

      • Prototyping a tool for quick inspection/correction/tagging of samples in the cluster
      • Looking into VAD/KWS decoder extensions
    • Tilman

      • Refactoring (overlay-) augmentation code
    • Kelly

      • Bergamot
        • Reviewing applicants
        • Setting up applicant interviews
        • Giving applicant interviews
        • Working on TC integrations
      • SIFIS-HOME H2020 Grant
        • Trying to get bank account set up to accept grant's funds
        • Kicking off the process of getting over the legal hurdles of the grant

Agenda (30/3/2020)

  • Discussion

    • DS 0.7.0?
  • Review of on-going work

    • Eren

      • TTS with TF for 2.1
      • Working with Turkish companies on getting TTS data + using CV for data collection
      • Working with Turkish bank on creating a TTS data set
    • Alex

      • New hardware
      • French Ministerial Transformation Fund data
      • Getting MATRIX Voice and WebThings working together
    • Reuben

      • Prototyping a tool for quick inspection/correction/tagging of samples in the cluster
      • Mandarin data preparation
    • Tilman

      • Updating the cluster
      • Setting up .compute w/cache on worker
      • Training LM from Oscar for PT and DE
    • Kelly

      • Bergamot
        • Reviewing applicants
        • Setting up applicant interviews
        • Working on TC integrations
      • SIFIS-HOME H2020 Grant
        • Submitting Mz Denmark GmbH's unaudited financials for 2017-2019 to the EC
        • Trying to get bank account set up to accept grant's funds
        • Kicking off the process of getting over the legal hurdles of the grant
        • Prepare Annex A and B of the grant

Agenda (23/3/2020)

  • Discussion

    • How's it going?
    • DS 0.7.0?
  • Review of on-going work

    • Eren

      • German TTS voice
      • Working with Turkish TTS companies
      • TTS with TF for 2.1 (Better this time?)
    • Alex

      • Getting MATRIX Voice and WebThings working together
    • Reuben

      • Prototyping a tool for quick inspection/correction/tagging of samples in the cluster
      • QuartzNet explorations
    • Tilman

      • Experiments around UTF-8
      • Training/benchmarking SDB's
      • Working on unlabeled samples PR (Extension of 2622)
    • Kelly

      • Bergamot
        • Reviewing applicants
        • Setting up applicant interviews
        • Working on TC integrations
      • SIFIS-HOME H2020 Grant
        • Submitting Mz Denmark GmbH's unaudited financials for 2017-2019 to the EC
        • Trying to get bank account set up to accept grant's funds
        • Kicking off the process of getting over the legal hurdles of the grant
        • Prepare Annex A and B of the grant
        • Continuing with grant?
      • TTS Voice negotiations
      • Continuing discussions on IT's support of the DS server

Agenda (16/3/2020)

  • Discussion

    • How's home treating you?
  • Review of on-going work

    • Eren

      • German TTS voice
      • Boosting learners
      • Refactoring TTS audio processing
      • Implementing mean-var normalization in dev
    • Alex

      • Getting Win GPU instance ready
      • Getting MATRIX Voice and WebThings working together
    • Reuben [PTO]

    • Tilman

      • Fixed/ing some issues with Oscar LM tool
      • Training/benchmarking SDB's
      • Starting to look at noise augmentation
    • Kelly

      • Bergamot
        • Reviewing applicants
        • Setting up applicant interviews
        • Working on TC integrations
      • SIFIS-HOME H2020 Grant
        • Submitting Mz Denmark GmbH's unaudited financials for 2017-2019 to the EC
        • Trying to get bank account set up to accept grant's funds
        • Kicking off the process of getting over the legal hurdles of the grant
        • Prepare Annex A and B of the grant
      • TTS Voice negotiations
      • Internal "sell" on STT SaaS service
      • Continuing discussions on IT's support of the DS server

Agenda (9/3/2020)

  • Discussion

    • Weekly Update Test
  • Review of on-going work

    • Eren

      • German dataset
      • German model
      • German vocoder
      • Working on TTS normalization
      • Implemented guided attention + training a model with it
    • Alex

      • French model
      • Getting MATRIX Voice and WebThings working together
    • Reuben [PTO]

    • Tilman

      • Re-exporting English LibriVox to SDB
      • Creating German LM from Oscar
      • Small SDB improvements
    • Kelly

      • Reviewing Bergamot applicants again
      • Interviewing Bergamot applicants again
      • Preparing for Bergamot (remote) work week this week, was going to travel
      • Created optimizer built on optuna to optimize lm_alpha + lm_beta
      • Reading: "Concentration-of-measure Inequalities" from Lugosi
      • Continuing discussions on IT's support of the DS server
      • Met with private.ai to discuss privacy preserving training
      • Setting up TTS to read the Mozilla weekly update

Agenda (2/3/2020)

  • Discussion

    • Journal Club or Berlin ML Seminar?
      • Berlin ML seminar is on "Modelling of non-linear state space systems using deep neural network"[1]
  • Review of on-going work

    • Eren

      • MelGAN
      • Guided attention
      • German dataset
    • Alex

      • French model
      • Getting MATRIX Voice and WebThings working together
      • Generalized validate_label, i.e. locale-specific
    • Reuben [PTO]

    • Tilman

      • First test training with LibriVox SDB English export
      • Creating German LM from Oscar
      • Small SDB improvements
    • Kelly

      • Reviewing Bergamot applicants again
      • Interviewing Bergamot applicants again
      • Created optimizer built on optuna to optimize lm_alpha + lm_beta
      • Ran optimizer of lm_alpha + lm_beta on dev set + lowest loss model, got WER 5.97 on LibriSpeech clean test
      • Reading: "Concentration-of-measure Inequalities" from Lugosi
      • Continuing discussions on legal aspects of product data sets
      • Continuing discussions on IT's support of the DS server

Agenda (24/2/2020)

  • Discussion

  • Review of on-going work

    • Eren

      • Batching server
      • Transferring all audio processing to PyTorch
    • Alex

      • Making CI fast!
      • pyenv PR improvement
      • Android emulator and gradle parts
    • Reuben

      • API changes for DeepSpeech v1.0
      • Analyzing the speech proxy data dump
      • QuartzNet explorations
    • Tilman

      • Refactoring DSAlign exporter to consume less memory and run faster
    • Kelly

      • Reviewing Bergamot applicants again
      • Creating optimizer built on optuna to optimize lm_alpha + lm_beta
      • Running optimizer lm_alpha + lm_beta on lowest loss model
      • Reading: "Concentration-of-measure Inequalities" from Lugosi
      • Kick starting discussions on legal aspects of product data sets
      • Kick starting discussions on IT's support of the DS server

Agenda (17/2/2020)

  • Discussion

    • KR's
  • Review of on-going work

    • Eren

      • Punctuation
      • Dealing with the German dataset
    • Alex

      • Landing node-gyp cache
      • Trying to get local SWIG build working w/TC
      • Working on Matrix on RPi4 with DeepSpeech and WebThings
    • Reuben

      • Analyzing the speech proxy data dump
      • Landing transfer learning PR
      • Reviewing top_paths != 1 PR
    • Tilman

      • Worked on catalog combiner
      • Worked on export.py of DSAlign
    • Kelly

      • Creating optimizer built on optuna to optimize lm_alpha + lm_beta
      • Working on Mozilla ML Web Site
      • Continued training English models on the 6000's
      • Revising KR's
      • Bergamot
        • Reviewing applicants again
        • Working with graphics designer on poster/flyer design
        • Working on TC integrations

Agenda (10/2/2020)

  • Review of on-going work
    • Eren

      • Released the best LJSpeech TTS and PWGAN model
      • Writing examples for TTS
      • Working with German TTS talent to fix some examples + add data
      • Add Decouples Linear Loss to Dev branch
      • Pytorch based STFT
      • Train only with log domain mel w/o any further preprocessing
      • Train with larger discriminator as in
    • Alex

      • Landing node-gyp cache
      • Trying to get local SWIG build working w/TC
      • Working on Matrix on RPi4 with DeepSpeech and WebThings
    • Reuben

      • API stabilization for DeepSpeech v1 (Single file packaging for LM, model packaging...)
      • Analyzing the speech proxy data dump
      • Analyzing DeepSpeech errors in Firefox Voice
      • Create Mandarin LMs from OSCAR
      • Gathering and sending DanSpeech student project ideas
    • Tilman

      • SDB PR follow-up
      • Implemented SDB export in DSAlign
    • Kelly

      • Review PR 2723
      • Training English models on the 6000's
      • Benchmarking pre MFCC fix English models on the 6000's
      • Talking with new finance about the EU grant
      • Working on Mozilla India Voice Strategy
      • Working on Mozilla+GIZ Call for Proposals
      • Working on getting auditor's declaration to EU signed

Agenda (3/2/2020)

  • Discussion

    • All-Hands
    • DeepSpeech model installation UX
  • Review of on-going work

    • Eren

      • Mel-GAN experiments
      • PWGAN experiments (Train with larger discriminator, No normalization only db conversion...)
      • TTS (Pytorch based STFT, Train only with log domain mel...)
    • Alex

      • Landing CTC decoder checks
      • Rebasing French docker to current 0.7.0a1 + Running training to verify no regression
      • Playing with MATRIX Voice
    • Reuben

      • API stabilization for DeepSpeech v1 (Single file packaging for LM, model packaging...)
      • Embed beam width into model and simplified CreateModel API
      • Analyzing DeepSpeech errors in Firefox Voice
      • Create Mandarin LMs from OSCAR
      • QuartzNet explorations
      • Client library for TTS server
    • Tilman (PTO)

    • Kelly

      • Some "BizDev" for Petpooja
      • Bergamot travel/conference planning
      • Training English models on the 6000's
      • Talking with new finance about the EU grant
      • Working on getting auditor's declaration to EU signed

Agenda (20/1/2020)

  • Announcements

    • Internal Q+A
    • Objectives
    • All-Hands
      • Demos
      • Lightning talks
  • Review of on-going work

    • Eren (PTO)

    • Alex

      • Discourse/GitHub support
      • Updating WebSpeech API code
      • Cross-compiled KenLM for RPI to dynamically build on-device LM (Cool!)
    • Reuben

      • Single file packaging for LM
      • API changes as a result of single file packaging for LM
    • Tilman

      • SDB file format (Finished coding on external sorting)
    • Kelly

      • Bergamot demo (Compiling + running server on cluster)
      • Rasa demo (Getting initial version running and modifying it)
      • Getting acoustic office booth for All-Hands (Solidifying shipping date + movers)
      • Monthly Bergamot Meeting
      • Getting Intelligentsia Consultants paid
      • Getting new contract signed for auditor

Agenda (13/1/2020)

  • Announcements

    • Deep Speech 0.6.1 out!
    • Copenhagen Report
    • Hyperparameter tuning
    • Deep Speech 1.0.0
  • Review of on-going work

    • Eren

      • Fast Speech (Attention to Feedforward)
      • PWGAN experiments (CPU realtime, GPU much faster than realtime)
      • Post-Net/Decoder decoupling
      • Mel-GAN experiments
      • Forward attention and Tacotron2 experiments
      • DeepSpeech with TF2
    • Alex

      • Discourse/GitHub support
      • Cross-compiled KenLM for RPI to dynamically build on-device LM (Cool!)
    • Reuben

      • 0.6.1 out the door
      • Create Mandarin LMs from OSCAR
      • Experimenting with different LM sizes
      • Experimenting with QuartzNet
      • Training Mandarin models
      • TensorFlow 1.15 update
    • Tilman

      • Continued work on automatic sourcing of Librivox data
        • Calibre based tool to go from formatted book to plain text
        • Started a run server, currently aligned 27K hours of English!
      • SDB file format
    • Kelly

      • Getting acoustic office booth for All-Hands
      • Setting up Firefox w/Bergamot translation engine again for All-Hands demo
      • Planning for meeting Edinburgh Group in Berlin before All-Hands for demo work
      • Proof reading 1E9 interview
      • Training English models on the 6000's (Optimizing dropout, alpha, and beta)
      • Figuring out legalities of billing + hiring for the EU grant

Agenda (6/1/2020)

  • Announcements

    • Welcome back!
    • 0.6.1?
  • Review of on-going work

    • Eren

      • Fast Speech (Attention to Feedforward)
      • PWGAN experiments (CPU realtime, GPU much faster than realtime)
      • Post-Net/Decoder decoupling
      • Mel-GAN experiments
      • Forward attention and Tacotron2 experiments
      • DeepSpeech with TF2
    • Alex

      • Discourse/GitHub support
      • Cross-compiling KenLM for RPI to dynamically build on-device LM (Cool!)
    • Reuben

    • Tilman

      • Continued work on automatic sourcing of Librivox data
        • More reliable ZIM indexing/book-fetching
        • Calibre based tool to go from formatted book to plain text
        • Started a run server, currently aligned 27K hours of English!
      • Working on experimental data format
    • Kelly

      • Nothing useful, answering emails all day (Inbox was over 500 now is at 34)

Agenda (9/12/2019)

  • Announcements

    • Kelly at NeurIPS
  • Review of on-going work

    • Eren

      • Picking things back up after vacation
      • Inspecting experiments that finished over vacation
      • Working around sshfs problems by spawning a separate mount per experiment
      • Comparing vocoder architectures
      • Talking to some companies that are interested in TTS collaboration
    • Alex

      • Talked at Open Source conference
        • Got some feedback from people integrating DeepSpeech
        • Talk was well received
        • Talked to people working on integrating DeepSpeech French model for automatic video transcripts
        • Collaboration between several universities
        • Following up on conversations
      • Firefox DeepSpeech process isolation
        • Need to get in touch with maintainer
        • Check assumptions on what's doable/acceptable
    • Reuben

      • Fixing some problems that popped up in the 0.6 release
        • Moving examples off-repo
        • Publishing TFLite package on Linux/Windows/macOS
        • Using simpler README in online package listings
        • Following up on feedback from release
        • Talked to Ryan Hileman from https://talonvoice.com/
        • He mentioned that using all of Common Voice valid.tsv led to significant improvements for people with non-US accents
        • We should experiment with using all of valid (subtracting dev/test) to see if it helps
      • Experimenting with QuartzNet https://arxiv.org/abs/1910.10261
    • Tilman

      • Continued work on automatic sourcing of Librivox data
        • More reliable ZIM indexing/book-fetching
        • Calibre based tool to go from formatted book to plain text
        • Started a basic run on home box to test whole setup end-to-end
        • Modulo export step, everything seems to be working
        • Can improve throughput by parallelizing downloads
    • Kelly

      • At NeurIPS
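
A minimal sketch of the "all of valid, subtracting dev/test" experiment from Reuben's items above. It assumes each TSV has a header row and a `path` column that identifies a clip; the column name, the file names in the usage comment, and both helper functions are assumptions for illustration, not the project's importer.

```python
import csv


def load_tsv(filename):
    """Read a tab-separated file with a header row into a list of dicts."""
    with open(filename, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def train_rows_from_valid(valid_rows, dev_rows, test_rows, key="path"):
    """Return rows of the validated set whose key is in neither dev nor test."""
    held_out = {row[key] for row in dev_rows} | {row[key] for row in test_rows}
    return [row for row in valid_rows if row[key] not in held_out]


# Usage (file names are placeholders):
# train = train_rows_from_valid(load_tsv("valid.tsv"),
#                               load_tsv("dev.tsv"),
#                               load_tsv("test.tsv"))
```

Subtracting by clip keeps dev/test audio out of training, but the same speaker may still appear on both sides; whether that matters for the experiment is a separate question.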

Agenda (2/12/2019)

  • Announcements

    • Release 0.6.0 TODO's
      • Blog post [Reuben]
      • Partner email [Kelly]
      • Benchmark models [Kelly]
      • Release notes [Reuben/Kelly]
  • Review of on-going work

    • Eren (PTO)

    • Alex

      • Working on presentation for Open Source conference
      • Firefox DeepSpeech process isolation
      • Meetup on Wed w/Deep Speech + Video Streaming integration group
    • Reuben

      • Finished awesome TTS packaging
      • Blog post about DeepSpeech 0.6.0
      • Improving word timing demo
      • Somewhat stable API for TTS models
      • Obtained remainder of the OI's 1950 hours of Mandarin
    • Tilman

      • Re-Export of NPR data
      • Continued work on LibriVox sourcing code
        • More reliable ZIM indexing/book-fetching
        • Calibre based tool to go from formatted book to plain text
    • Kelly

      • IGF Conference
      • LREC 2020 paper on Common Voice
      • ACL 2020 paper on Interpreting Contextualized Representations via Static Embedding Analysis
      • Continuing talks with our voice talent over public use of her voice data
      • Getting contract for a German Voice talent signed
      • Getting Intelligentsia Consultants paid for their work on the new IoT H2020 grant
      • Preparing talk for LT4All Conference
      • Denmark GovTech-Program meeting
      • Deep Speech 0.6.0
        • Training, and benchmarking final 0.6.0 model
        • Readthedocs documentation
        • Wrote partner email + gathered partner emails

Agenda (25/11/2019)

  • Announcements

  • Review of on-going work

    • Eren (PTO)

    • Alex

      • MozIoT addon for STT
      • Updating WebSpeech API patches
      • Starting work on RemoteDataDecoder
    • Reuben

      • API for TTS models
        • Making REST server easy to use via wheel packaging
        • Testing TTS wheel on Linux
        • Writing instructions
      • Blog post about DeepSpeech 0.6.0
      • Obtained remainder of the OI's 1950 hours of Mandarin
    • Tilman

      • Improving DSAlign
        • Getting better alignment logging
        • Getting (speaker) meta-data
    • Kelly

      • Gave LPSS keynote
      • Kicking off talks with our voice talent over public use of her voice data
      • Getting first draft of contract for a German Voice talent reviewed by legal + voice talent
      • Getting Intelligentsia Consultants paid for their work on the new IoT H2020 grant
      • Working with Diane Tate on All-Hands demo admin
      • Preparing talk for LT4All Conference
      • Training, and eventually benchmarking on many data sets, models for 0.6.0

Agenda (18/11/2019)

  • Announcements

    • All-Hands demos
      • Rasa integration of STT + TTS (English) [Kelly]
      • NMT integration into Firefox (German-to-English) [Kelly]
      • WebThings Deep Speech integration (English) [Alex]
      • STT (German) [Tilman]
      • Standard TTS demo showing quality w/BERT generator (English) [Eren]
      • STT in Firefox (English + client-server) [Alex]
      • STT in MR (English) [Alex]
    • Release 0.6.0 TODO's
      • Blog post [Reuben]
      • Release notes [All]
      • Language model from new corpus [Reuben]
      • Train models using latest Common Voice [Kelly]
      • Benchmark model using latest Common Voice [Kelly/Reuben]
    • Deep Speech 1.0
  • Review of on-going work

    • Eren

      • TTS developer support
      • Updating/fixing colab example
      • Merge Zoneout and new Forward attention to dev branch
      • Check incompatibility of PWGAN results wrt ESPNet
      • New audio parameters w/PWGAN experiments
      • Use Zoneout instead of Dropout w/Forward attention and Tacotron2 experiments
      • Implement Guided Attention
    • Alex

      • Fixing French Common Voice/Sentence Collector dataset
      • Webspeech API (temp) patch for noise cancellation
      • WebThings integrations with Deep Speech
      • Spoke at Toulouse's Capitole du Libre
    • Reuben

      • Experimenting with OpenWebText corpus for LM
      • Blog post about DeepSpeech 0.6.0
      • Experiments with Mandarin data from OI (Test model quality)
      • Optimizing UTF-8 training code
    • Tilman

      • Landing transcribe.py
      • Working on alignment-automation
    • Kelly

      • Training 0.6.0 model (Low learning rates with high dropouts, smooth convergence)
      • Preparing talk for LPSS
      • Organizing CV demo for LT4All conference
      • New H2020 IoT grant legwork (Writing proposal, due November 19th)
      • New H2020 grant proposal "AIZEN" submitted
      • Flying to Taipei tomorrow

Agenda (11/11/2019)

  • Announcements

    • All-Hands demos
      • Rasa integration of STT + TTS (English) [Kelly]
      • NMT integration into Firefox (German-to-English) [Kelly]
      • WebThings Deep Speech integration (English) [Alex]
      • STT (German) [Tilman]
      • Standard TTS demo showing quality w/BERT generator (English) [Eren]
      • STT in Firefox (English + client-server) [Alex]
      • STT in MR (English) [Alex]
    • Release 0.6.0 TODO's
      • Blog post [Reuben]
      • Release notes [All]
      • Language model from new corpus [Reuben]
      • Train model using latest Common Voice [Kelly]
      • Benchmark model using latest Common Voice [Kelly/Reuben]
    • Deep Speech 1.0
  • Review of on-going work

    • Eren

      • Training a Parallel WaveGAN model
      • Forward backward decoder training
      • Looking again at Location-Relative Attention Mechanisms
    • Alex

      • Preparing for talk next week at Toulouse's Capitole du Libre
      • TFLite representative data set optimization does not optimize
    • Reuben

      • Landing UTF-8 changes (Replacing current character-based mode with UTF-8 mode)
      • Experimenting with OpenWebText corpus for LM
      • Blog post about DeepSpeech 0.6.0
      • Experiments with Mandarin data from OI (Test model quality)
      • Optimizing UTF-8 training code
    • Tilman

      • Building German LM
      • Imported all German data to server
      • Prepping for test epoch of all German data
      • Uplifting transcribe.py to work with catalog files
    • Kelly

      • Training 0.6.0 model
      • Preparing talk for LPSS
      • Bergamot WebSite (Re-working the partner page)
      • Organizing CV demo for LT4All conference
      • New H2020 IoT grant legwork (Writing proposal, due November 19th)
      • New H2020 grant proposal "AIZEN" (Writing proposal, due November 19th)

Agenda (04/11/2019)

  • Announcements

    • All-Hands demos?
    • Release 0.6.0 TODO's
      • Blog post
      • Release notes
      • Language model from new corpus?
      • Train model using latest Common Voice
      • Benchmark model using latest Common Voice
    • 1.0 Before All-Hands?
      • Downloader for models?
      • Downloader allows for 3rd parties?
      • ...
  • Review of on-going work

    • Eren

      • Evaluating different TTS architectures (N x Conv Encoder for different N)
      • Fine tuning with a small corpus (Attention fails)
      • Implementing TTS in TF 2.0 for deployment (TF2 needs more to make TTS on par with pytorch.)
      • Forward backward decoder training
      • Looking again at Location-Relative Attention Mechanisms
    • Alex

      • Taskcluster migration (Surprise!)
      • TFLite representative data set optimization does not optimize
    • Reuben

      • Landing UTF-8 changes (Replacing current character-based mode with UTF-8 mode)
      • Embedded alphabet in model!
      • Experimenting with OpenWebText corpus for LM
      • Blog post about DeepSpeech 0.6.0
      • Experiments with Mandarin data from OI (Test model quality)
      • Optimizing UTF-8 training code
    • Tilman

      • Imported all German data to server
      • Prepping for test epoch of all German data
      • Building German LM
      • Uplifting transcribe.py to work with catalog files
    • Kelly

      • New H2020 IoT grant legwork (Writing proposal, due November 19th)
      • Preparing talk for LPSS
      • Organizing CV demo for LT4All conference
      • Bergamot WebSite (Re-working the partner page)
      • New H2020 grant proposal "AIZEN" (Writing proposal, due November 19th)

Agenda (28/10/2019)

  • Announcements

    • 2020 Planning Meeting
    • All-Hands demos?
  • Review of on-going work

    • Eren

      • TTS w/TF (Slow)
      • Merging bidirectional TTS decoder
    • Alex

      • Fixed framabook and wikisource data extractors for unicode normalization
      • Fixing evaluate_tflite
      • Running a non-optimized French model for WER comparison of representative_dataset usage
    • Reuben

      • Landing UTF-8 changes
        • Merged and rebased preliminary changes
        • Replacing current character-based mode with UTF-8 mode
        • Shift byte values by 1 to keep alphabet size 256
      • Writing blog post about new things in DeepSpeech 0.6.0
      • Experiments with Mandarin data from OI
    • Tilman

      • Importer fixes
      • Importing CV-de, M-AILABS, SWC and TUDA on cluster
      • Thinking about presenting about online hard example mining in Journal Club next week
    • Kelly

      • Voice 2020 planning
        • 2 day planning meeting
        • Writing up 2020 OKR's
      • New H2020 grants legwork
        • Writing up grant for secure WoT work including on-device STT
        • Writing up grant for AIZEN, PhD "interns" at Mozilla
      • Helping security group navigate H2020 grants
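
A minimal sketch of the byte-shift idea from Reuben's UTF-8 items above. It assumes the shift is downward by one: UTF-8 text normally contains no zero byte, so bytes 1..255 can map to labels 0..254, which leaves index 255 free for the CTC blank and keeps the output layer at 256 entries. The direction of the shift, the blank index, and the function names are assumptions, not the project's actual code.

```python
BLANK = 255  # assumed index of the CTC blank; the byte labels occupy 0..254


def text_to_labels(text):
    """Encode text as UTF-8 and shift every byte down by one."""
    data = text.encode("utf-8")
    if 0 in data:
        raise ValueError("NUL bytes cannot be represented after the shift")
    return [byte - 1 for byte in data]


def labels_to_text(labels):
    """Undo the shift and decode the bytes back to text."""
    return bytes(label + 1 for label in labels).decode("utf-8")


if __name__ == "__main__":
    labels = text_to_labels("你好")
    print(labels, labels_to_text(labels))
```

Because the labels are bytes rather than characters, a decoder built this way can emit sequences that are not valid UTF-8, so `labels_to_text` may raise `UnicodeDecodeError` on arbitrary model output.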

Agenda (21/10/2019)

  • Announcements

    • Some nodes' GPUs are falling off the PCIe bus (2 or 3 GPUs)
  • Review of on-going work

    • Eren

      • TF2 implementation of DeepSpeech
        • Started training on multi-GPU (on local machine)
        • Then will try the cluster
      • Forward-backwards attention still training
      • Fixed some workstation problems
      • Will continue experiments on TTS
      • Train Tacotron with x-vectors
        • Not good quality
    • Alex

      • Fixed client memory leak last week
        • Valgrind is completely happy with our code now
      • Using feeding code to extract a representative dataset for TFLite quantization
      • Submitted and got accepted to talk at an event about Common Voice and DeepSpeech. Event is on 16/11/2019.
      • Contacted by French company interested in using DeepSpeech in their products for people with a hearing impairment.
      • Benchmarking different batch sizes with CuDNN RNN and French data.
      • Disabling early stopping and training for longer helps the accuracy of the French model (WER 9.5% w/ 15 epochs -> 7.5% w/ 20 epochs).
    • Reuben

      • Experiments with Mandarin data from OI
        • Re-importing from scratch with all checks and clean-up passes in a single script to make sure nothing is left out.
        • Then OI can use the script to check the data as it's provided by the vendor.
      • Adapting scorer clean-up patches to replace current character-level code paths with codepoint-level logic + utf-8 instead
      • Re-testing utf-8 checkpoints with test/export graph fix (need to re-test baseline)
    • Tilman

      • Cluster set-up last week went smoothly
      • Today realized some GPUs are dropping off the PCIe bus
        • Doing some stress tests
        • Counterintuitively things seem more stable under stress, but need more testing
      • Did some binary search for max batch size on new machines, seems like we can get 180 on the Q6000
      • Talked to lawyers about German copyright(-equivalent) law
      • Thinking about presenting about online hard example mining in Journal Club next week
    • Kelly (did not attend)
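
A minimal sketch of the binary search for the maximum batch size mentioned in Tilman's items above. The `fits` callback is hypothetical: in practice it would run a training step at the given batch size and report whether it completed without running out of GPU memory. The search assumes that if a batch size fits, every smaller one fits too.

```python
def max_batch_size(fits, low=1, high=1024):
    """Largest n in [low, high] for which fits(n) is True, or 0 if none fits.

    Assumes monotonicity: if n fits, every smaller batch size fits too.
    """
    best = 0
    while low <= high:
        mid = (low + high) // 2
        if fits(mid):
            best = mid       # mid works; look for something larger
            low = mid + 1
        else:
            high = mid - 1   # mid fails; look for something smaller
    return best


if __name__ == "__main__":
    # Toy stand-in for a real memory check.
    print(max_batch_size(lambda n: n <= 180))
```

With an upper bound of 1024 this needs at most 11 probes, which matters when each probe is a real training step.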

Agenda (14/10/2019)

  • Announcements

    • At Meta-Forum Mozilla won the Meta Seal of Recognition
    • Server installation this week!
    • Power going out in the Berlin office Wed/Thur nights, but not in the server rooms!
  • Review of on-going work

    • Eren

      • Train Tacotron with x-vectors
      • Implementing different TTS architectures (Nx conv encoder)
      • Implementing TTS in TF 2.0
    • Alex

      • WS API Patch work with A and R
      • WS API 2 out of 3 positive reviews!
      • PR for exposing LM parameters
      • Creating a French model v0.3
      • Experimenting with metadata for tflite, e.g. graph version
    • Reuben

      • Experiments with Mandarin data from OI
      • Adapting scorer clean-up patches to replace current character-level code paths with codepoint-level logic + utf-8 instead
      • Testing 5127 checkpoint against other datasets
      • Re-testing utf-8 checkpoints with test/export graph fix
      • pt-BR training runs after reducing sentence duplication in the dataset
      • Importing CV 2.0 English in the cluster
      • Improving how native client handles sample rates
    • Tilman

      • Spoken Wikipedia data/importer
      • Preparing statistics about LibriVox
    • Kelly

      • EU 9 month review
      • Server installation
      • Recommendation for Rishi
      • Letter of support for EAR proposal
      • SIFIS-Home H2020 grant proposal
      • AIZEN H2020 grant proposal
      • Preparing for 2020 planning meeting at MV next week
      • Writing ApS Termination Report for NMT H2020 grant

Agenda (07/10/2019)

  • Announcements

  • Review of on-going work

    • Eren traveling to conference

    • Alex

      • Refactoring of documentation to RST format
      • Looking into memory leaks reported by Carlos
      • Landed M-AILABS importer
      • Working on French model with M-AILABS dataset
    • Reuben

      • Creating Mandarin LM from Wikipedia using CV scripts
        • LM leads to very marginal improvement in CER
        • Latin alphabet being present in LM vocabulary biases decoder towards beams with Latin characters because they get scored first
        • Building the LM from the same data but excluding Latin characters to see how big the impact is
        • Depending on results need to think about how to make the model handle Latin alphabets for things like brand names, country names, etc.
      • Trying (unsuccessfully) to replicate the weirdly low validation loss from 5127 run (compared to test loss)
    • Tilman

      • First NPR training run!
        • Investigating weird results from 5127 run: high test WER, high test loss compared to validation
      • Writing German importers for
        • "Spoken Wikipedia Corpora" <- focusing on this one, kind of complicated to import
        • "German Distant Speech Corpus (TUDA)"
        • ...
    • Kelly Traveling to Meta Forum

Agenda (30/09/2019)

  • Announcements

    • Register for All-Hands!
  • Review of on-going work

    • Eren

      • Working on multi-speaker model (Got 900 speakers working, females work better)
      • Enabled using multiple datasets for TTS
      • Training Tacotron2 with x-vectors
      • Looking to move to TF2.0
    • Alex

      • Trying to work around the duplication problems in French Common Voice (Duplicates hurt WER a lot)
      • Integrating DeepSpeech into WebThings for on-device STT
      • Evaluating M-AILABS dataset for French (Looking good for STT use)
      • Taskcluster: Fixing disk space issue on macOS
    • Reuben

      • Tackled the UTF-8 decoder head-on
      • UTF-8 close to grapheme performance
      • UTF-8 Mandarin training runs with Magicdata
      • Creating Mandarin LM from Wikipedia using CV scripts
    • Tilman

      • First NPR training run!
      • Writing German importers for
        • "Spoken Wikipedia Corpora"
        • "German Distant Speech Corpus (TUDA)"
        • ...
    • Kelly

      • ICLR 2020
        • Finished submission with Rishi
      • LPSS 2020
        • Writing submission
        • Preparing plenary
      • Preparing for Meta Forum 2019 - European Language Grid conference
        • Got poster printed
        • Preparing talk
      • Jugend hackt
        • Preparing talk for Friday
      • EU NMT grant contractual work
        • Sent grant amendment (Aps to GmbH) to EU waiting on them to approve
        • Wrote document for deliverable D6.1, code for Firefox integration, and video of it working.
        • Monthly finance report/time cards
      • New H2020 Grant
        • Writing grant
        • Organizing meetup of coalition
        • Organizing funds for Intelligentsia to help with grant writing
      • Branding "Kickoff" meeting for TTS and STT
      • 100k hours engineering sync up meeting

Agenda (23/09/2019)

  • Announcements

    • London ReWork
    • Interspeech
    • Meta-Forum
    • New server delivered
  • Review of on-going work

    • Eren

      • Implement Duration Predictor
      • Train Duration Predictor
      • Compute phoneme alignments with attention
      • Try multi-head attention
    • Alex

      • Unified documentation!
      • Examples now run in PR's!
      • RPi4 works in realtime!
    • Reuben

      • Baseline Mandarin grapheme runs
      • Testing data augmentation
      • Augmentation documentation
    • Tilman

      • First NPR forced alignment run done!
      • Fixing some data problems
    • Kelly

      • ICLR 2020
        • Writing/revising submission with Rishi
      • LPSS 2020
        • Writing submission
      • Preparing for Meta Forum 2019 - European Language Grid conference
        • Getting poster printed
        • Preparing talk
      • EU NMT grant contractual work
        • Got C. Beard to sign the GA Declaration + Accession Form
        • Got the Opinion Letter of the Leaving Beneficiary signed by the ApS
      • New H2020 Grant

Agenda (09/09/2019)

  • Announcements

    • N/A?
  • Review of on-going work

    • Eren (OOO)

      • Multi-speaker embedding
      • Multi-headed attention test
      • Implement Duration Predictor
    • Alex (PTO)

    • Reuben

      • Baseline Mandarin grapheme runs
      • Testing data augmentation
      • Data augmentation PR
      • API cleanup
    • Tilman

      • First NPR forced alignment run done!
    • Kelly

      • New servers on order should be built in 3 weeks
      • First version of Firefox build with Edinburgh NMT addition
      • Preparing for Meta Forum 2019 - European Language Grid conference
        • Designing poster
        • Purchasing poster holder
        • Getting poster printed
      • EU NMT grant contractual work
        • Getting C. Beard to sign the GA Declaration + Accession Form
        • Getting the Opinion Letter of the Leaving Beneficiary signed by the ApS

Agenda (02/09/2019)

  • Announcements

  • Review of on-going work

    • Eren

      • Working on speaker vocoder, trying to overfit test set
      • Got multi-speaker working well
    • Alex (PTO)

    • Reuben

      • Baseline Mandarin grapheme runs
      • Testing data augmentation
      • Profiling native client
    • Tilman

      • Forced alignment
      • Started testing with a subset of the NPR data
      • Almost finished transcribing NPR data
    • Kelly

      • New servers on order should be built in 3 weeks
      • Working with OI over Mandarin data set mitigation plan
      • Debugging German Firefox build with Edinburgh NMT addition
      • Reviewing rasa's approach to conversational AI
        • Read "Rasa: Open Source Language Understanding and Dialog Management"
        • Read "Few-Shot Generalization Across Dialog Tasks"
        • Went through getting started tutorial
        • Going through STT/TTS tutorial
      • EU NMT grant contractual work
        • Getting C. Beard to sign the GA Declaration + Accession Form
        • Getting the Opinion Letter of the Leaving Beneficiary signed by the ApS

Agenda (26/08/2019)

  • Announcements

  • Review of on-going work

    • Eren

      • Testing new non-linearity
      • Got server set up and sent out to others for DE TTS
      • Working on getting VCTK data set working
      • Talking to Vestel on possible partnership
    • Alex (did not attend)

    • Reuben

      • TTS C/C++ API
      • Baseline Mandarin grapheme runs
      • Decoder refactor
    • Tilman

      • Forced alignment
      • Working on edge cases from alignment algorithm to deal with weaknesses in STT model
      • Finished the README for aligner and some related polish work
      • Parallelizing DeepSpeech transcription process
      • Started testing with a subset of the NPR data
    • Kelly

      • Catching up on email
      • Celtic Language Technology Workshop Keynote
      • The Next Web/Common Voice interview
      • EU NMT grant contractual work
        • Finding out how the contract should be changed
        • Becoming LEAR for GmbH
        • Writing letter for ApS to leave grant
        • Gathering documents required for the GmbH to sign to enter the grant
      • Getting Firefox build setup to integrate the Edinburgh's NMT engine server-side

Agenda (19/08/2019)

  • Announcements

  • Review of on-going work

    • Eren

      • Merging gradual training (gradual decrease of r-value during training)
      • In the dev branch, waiting for tests before merging into master
      • Started work on German dataset again
      • Decided to have a demo model for Telekom
      • If demo goes well we'll be better able to record a better German dataset
    • Alex (did not attend)

    • Reuben

      • TTS C/C++ API
      • Baseline Mandarin grapheme runs
      • Decoder refactor
    • Tilman

      • Forced alignment
      • Working on edge cases from alignment algorithm to deal with weaknesses in STT model
      • Finished the README for aligner and some related polish work
      • Parallelizing DeepSpeech transcription process
      • Then will start testing with a subset of the NPR data
    • Kelly (Celtic Language Technology Workshop)
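
A minimal sketch of the gradual training idea from Eren's items above, assuming "r-value" means the decoder reduction factor (frames predicted per decoder step) and that it is lowered at fixed training steps. The schedule values and the function name are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right

# Hypothetical schedule: (first training step, r). Values are illustrative only.
SCHEDULE = [(0, 7), (10_000, 5), (50_000, 3), (130_000, 2)]


def r_for_step(step, schedule=SCHEDULE):
    """Return the r-value of the last schedule entry whose start step is <= step."""
    starts = [start for start, _ in schedule]
    index = bisect_right(starts, step) - 1
    if index < 0:
        raise ValueError("step precedes the first schedule entry")
    return schedule[index][1]


if __name__ == "__main__":
    for step in (0, 9_999, 10_000, 200_000):
        print(step, r_for_step(step))
```

Starting with a large r makes each decoder step cover more frames, which tends to be easier to train; lowering it later trades that ease for finer output resolution.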

Agenda (12/08/2019)

  • Announcements

    • Journal Club on PTO, conflicts with ET Monthly All Hands
    • Register for a conference if you want:
  • Review of on-going work

    • Eren

      • Just back from PTO
    • Alex

      • Just back from PTO
    • Reuben

      • TTS C/C++ API
      • Baseline Mandarin grapheme runs
    • Tilman (PTO)

    • Kelly

      • Working w/Google W3C on modifications to the Web Speech API
      • Working on SNAFU ApS vs GmbH
        • Got PIC number validated
        • In the process of getting LEAR appointment
      • Prepping for Celtic Language Technology Workshop
      • New H2020 grants legwork
      • Ordering new worker servers

Agenda (05/08/2019)

  • Announcements

    • Journal Club
      • Rishi Walking backwards on Sesame Street - An evaluation of context independent word vectors derived from context dependent ones
    • Register for a conference if you want:
  • Review of on-going work

    • Eren (PTO)

    • Alex (PTO)

    • Reuben

      • TTS C/C++ API
      • Landed CuDNN RNN PR
      • Fixed TF 1.14 regression
      • Baseline Mandarin grapheme runs
    • Tilman

      • Forced alignment
        • Writing README for repo
        • Finishing up alpha version
        • Recursive split/descent approach for remaining short/bad-matches
    • Kelly

      • ACL/NMT/NLP for Conversational AI conferences
      • NMT Project WebSite updates
      • NMT Project Coalition Meeting
      • Working w/Google W3C on modifications to the Web Speech API
      • Working on SNAFU ApS vs GmbH
        • Got PIC number validated
        • In the process of getting LEAR appointment
          • LEAR appointment letter
          • Declaration of Consent
          • Legal Representative identity document
          • LEAR identity document
      • Prepping for Celtic Language Technology Workshop
      • New H2020 grants legwork
      • Ordering new worker servers (Waiting for Godot to sign)
      • Reviewing CV's/Interviewing for grant positions (On hold due to ApS vs GmbH SNAFU)

Agenda (15/07/2019)

  • Announcements

    • Journal Club volunteers?
  • Review of on-going work

    • Eren

      • Train TTS in Libri-TTS with 300 speakers
      • Implemented Deep Griffin–Lim via 1903.03971
      • Try to predict phase from amplitude spectrogram
      • Added speaker embedding with Thomas
    • Reuben

      • Apartments!
      • Setup of new laptop
      • Preparing CuDNN RNN PR
      • Fixing TF 1.14 regression
      • Baseline Mandarin grapheme run
    • Alex

      • Improving macOS build time
      • Fixing a Firefox crash on Windows
      • Firefox
        • Implementing json model descriptor
        • Implementing model download from about:deepspeech
        • Implemented libdeepspeech download from about:deepspeech
    • Tilman

      • Forced alignment
        • Writing README for repo
        • Finishing up alpha version
    • Kelly

      • Submitted grant w/Te Hiku Media to NZ government
      • Working on SNAFU ApS vs GmbH
        • Getting VAT Extract from Bundeszentralamt für Steuern
        • Got registration extract for the GmbH from Handelsregister in Berlin
        • Filled out FEL private entity form for GmbH
        • Got signature from Chris Beard on FEL Form
        • Starting process of getting LEAR appointment
      • New H2020 grant legwork
      • Ordering new worker servers
      • Reviewing CV's/Interviewing for grant positions (On hold due to ApS vs GmbH SNAFU)

Agenda (08/07/2019)

  • Announcements

    • Journal Club volunteers?
    • ICTurkey in Istanbul
  • Review of on-going work

    • Eren

      • ICTurkey in Istanbul
      • Train TTS in Libri-TTS with 300 speakers
      • Improving on Griffin–Lim with 1903.03971
      • Work on optimizing WaveRNN for inference
      • Starting on pruning work, reading research
      • Adding speaker embedding with Thomas
    • Reuben

      • Apartments!
      • TF 1.14 update
      • Baseline Mandarin grapheme run
      • Rebasing CuDNN RNN branch to PR after 1.14 lands
    • Alex

      • Fixing TC problems with npm
      • Updated nodejs versions
      • Fixed nodejs destructor crashes
      • Firefox
        • Implementing json model descriptor
        • Implementing model download from about:deepspeech
        • Implemented libdeepspeech download from about:deepspeech
    • Tilman

      • Forced alignment
    • Kelly

      • Met with Viamo + GIZ on possible use of DS in Viamo's many IVR installs
      • Working on Te Hiku Media grant
      • Working on SNAFU ApS vs GmbH
      • New H2020 grant legwork
      • Ordering new worker servers
      • Reviewing CV's/Interviewing for grant positions on hold due to SNAFU (ApS vs GmbH)

Agenda (01/07/2019)

  • Announcements

    • Rishi is going to present at this week's Journal Club!
  • Discussion

    • No topics?
  • Review of on-going work

    • Eren

      • Work on optimizing WaveRNN for inference
      • Trying to train Libri TTS w/300 speakers
        • Created working model w/24 kHz sampling
        • Working on model w/16 kHz sampling
      • Starting on pruning work, reading research
      • Working on adding phase prediction net to Griffin–Lim
      • Adding speaker embedding with Thomas
      • Going to Istanbul for H2020 meeting
    • Reuben (Moving)

      • Moving!
    • Alex

      • Investigating nodejs 9x issues
      • Cleaning French Common Voice data/sentences
    • Tilman (PTO)

      • Forced alignment
    • Kelly

      • Telekom NDA signed!
      • Ordering new worker servers
      • Reviewing CV's/Interviewing for grant positions on hold due to SNAFU (ApS vs GmbH)
      • Working on SNAFU, met with auditor, financial advisor, and meeting with legal later today
      • Finished NMT First Dissemination Plan, turned in to EC.
      • Finished NMT Data Plan, turned in to EC.

Agenda (24/06/2019)

  • Announcements

    • Deep Speech reached 100k downloads
    • All-Hands
      • On-Device WebSpeech API demo at all-hands well received
      • DeepSpeech + androidspeech merged into an experimental Firefox Reality branch
    • 0.5.1 and v0.6.0-alpha.0 "Out the door"
    • Starting work on another German Government grant working with DFKI
    • Eren is going to present at this week's Journal Club! (Really this time!?)
  • Discussion

    • Torrent models? (Statistics on model downloads?)
  • Review of on-going work

    • Eren

      • Commandline TTS tools
      • Working on universal vocoder
      • Training with global style tokens
      • About to train with global style tokens
      • Got funding to attend the H2020 conference in Istanbul
      • Got German TTS working for Telekom demo
    • Reuben

      • Moving!
      • Fixing macOS task failures
      • Working on testing loss normalization
    • Alex

      • Updating ds-srv to 0.5.1
      • Looking at performance of DS on RPi 4
      • Cleaning French Common Voice data/sentences
    • Tilman (PTO)

      • Forced alignment
    • Kelly

      • Ordering new worker servers
      • Got NDA back from Telekom now our lawyers are looking over their changes
      • Reviewing CV's of applicants wanting to join the NMT project
      • Interviewing applicants wanting to join the NMT project
      • Dealing with legal SNAFU on EU grant with respect to the office the purchases are made from
      • Dealing with legal SNAFU on EU grant with respect to the office the employees are hired into
      • Working on NMT First Dissemination Plan, due to the EU June 30th

Agenda (31/05/2019)

  • Announcements

    • 0.5.0 ("Decoder optimizations" PR and done?)
    • Rishi at NAACL this week presenting a paper SPARSE: Structured Prediction using Argument-Relative Structured Encoding
    • Starting work on another H2020 grant working with other EU partners (CNR, RISE, Luminem, Intel, PoliTO) dealing with secure IoT where we'd be lead partner
    • Eren is going to present at next week's Journal Club
  • Review of on-going work

    • Eren

      • Training Tacotron with the "Voice of Mozilla"
      • Looking for better German dataset
      • Working on Tacotron to merge it with WaveRNN
    • Reuben (PTO)

      • Decoder optimizations PR
      • TF and TFLite Refactor
    • Alex

      • Deep Speech (TFLite) on device with Firefox
        • Initial integration
        • Integration with proper threading
        • Turning on sandbox
        • TC Green!
      • Working on French model
      • Cleaning French Common Voice data/sentences
    • Tilman

      • Continued noise1-only training
      • Deep Speech (TFLite) on Firefox Reality
        • Implementing streamable downloading of models
        • Testing streamable downloading of models
    • Kelly

      • Sent out NDA to Telekom
      • Ordering new worker servers
      • ParaCrawl based models legal issues
      • Reviewing CV's of applicants wanting to join the NMT project
      • Working on NMT First Dissemination Plan, due to the EU June 30th
      • Working on NMT Data Management Plan, due to the EU June 30th

Agenda (31/05/2019)

  • Announcements

    • Rishi starts this week
  • Review of on-going work

    • Eren

      • First results from Tacotron 2 on "Voice of Mozilla" data set
      • Training Tacotron with the "Voice of Mozilla"
    • Reuben

      • Aishell runs (UTF-8 branch promising)
      • Other Mandarin importers
      • Writing model packaging requirements doc
    • Alex

      • Deep Speech (TFLite) on device with Firefox
        • Initial integration
        • Integration with proper threading
        • Turning on sandbox
      • Working on French model
      • Cleaning French Common Voice data/sentences
    • Tilman

      • Continued noise1-only training
      • Deep Speech (TFLite) on Firefox Reality
        • Implementing streamable downloading of models
        • Testing streamable downloading of models
    • Kelly

      • Prepared talk for re*work Boston
      • Gave re*work Boston talk on DS+CV
      • ParaCrawl based models legal issues
      • NDA for Telekom
      • Looking for TTS German training data
      • Reviewing CV's of applicants wanting to join the NMT project
      • Obtaining new worker servers
      • Working on NMT First Dissemination Plan

Agenda (20/05/2019)

  • Announcements

    • Rishi starts again next week
  • Review of on-going work

    • Eren

      • Examination of the "Voice of Mozilla" data set
      • Training various models with the "Voice of Mozilla" (Seems to not work well.)
      • Training Merlin
      • Applied to H2020 pair making conference, looking for others to work with
    • Reuben

      • Aishell runs (UTF-8 branch promising)
      • Other Mandarin importers
      • Reviewing streaming decoder PR 2121
    • Alex

      • Deep Speech (TFLite) on device with Firefox
        • Initial integration
        • Integration with proper threading
      • Working on French model
      • Interview with French newspaper tomorrow
      • Meeting in 1 week w/French government on CV
    • Tilman

      • Cleaning up snakepit user code
      • Deep Speech (TFLite) on Firefox Reality
      • Getting Android dev env setup for FxR + DS
    • Kelly

      • Preparing talk for re*work Boston
      • Re-Engaged with Marketing for naming/branding of STT & TTS
      • Writing up Rishi's new Onboarding Plan
      • ParaCrawl based models legal issues
      • NDA for Telekom
      • Looking for TTS German training data
      • Reviewing CV's of applicants wanting to join the NMT project
      • Obtaining new worker servers
      • Working on NMT First Dissemination Plan
      • Met with OI on Data Commons

Agenda (13/05/2019)

  • Announcements

    • Journal Club Volunteers
  • Review of on-going work

    • Eren

      • Aligning with Gentle
      • Testing word based TTS
      • Training various models with the "Voice of Mozilla" (Seems to not work well.)
      • Reducing vocoder size, pruning + arch changes
      • Discourse TTS
    • Reuben

      • Immigration bureaucracy
      • Aishell runs (UTF-8 branch promising)
      • Fixed pre-processing of data (Spaces incorrect)
      • Other Mandarin importers
    • Alex

      • PSU Replacement
      • Demangled symbols
      • Updated to newest SWIG
      • Removed Python 2.7 support
      • Deep Speech (TFLite) on device with Firefox
    • Tilman

      • Noise training
      • Fixing image on cluster
      • Cleaning up snakepit user code
      • Adding port forwarding to cluster snakepit
      • Running test of standard 0.5.0 training run against the noise test set
    • Kelly

      • Met with OI on sentence collection lessons
      • Met with spoken.io on possible collaboration
      • Preparing talk for re*work Boston
      • Working on internal mana page for ML
      • Working on NMT First Dissemination Plan
      • Reviewing Deep Speech PR 2111

Agenda (06/05/2019)

  • Announcements

    • Kelly at re:publica
    • Serious Firefox incident over the weekend, recommend watching the project meeting
  • Review of on-going work

    • Eren

      • Continuing training on Mozilla voice dataset
        • Comparing different runs to find settings for next run
      • Reading papers to investigate different architectures (alternative to Tacotron)
      • Collaborating with Thomas Werkmeister on Global Style Tokens implementation
      • Will apply for German govt. grant
      • Created Discourse category for TTS
      • Mozilla dataset voice collection done (25h)
        • For now data is not releasable, only for internal use
    • Reuben

      • AISHELL baseline grapheme training run
        • UTF8 run got similar performance to German CV2 runs
      • Test epochs as well as native client can't handle super large alphabets due to caching of logits
        • Uses too much memory
        • Test epoch needs to be pipelined instead of caching all logits in memory
        • Native client would need to have a streaming decoder
    • Alex

      • LinguaLibre/TrainingSpeech/French CV2 training run
        • Fixing importers
      • Training inside Docker
      • Creating a training workflow directly from a Common Voice release to be able to iterate quickly when there's new data
      • Creating report on our work for last year for French govt. grant
    • Tilman

      • Noise augmentation runs
        • Halved LR runs not looking so promising
      • Snakepit large file FS operations (upload/download) with continue
        • Should release soon
    • Kelly (did not attend, at re:publica)
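
Reuben's point above about test epochs can be sketched as follows: rather than computing and caching logits for the whole test set before decoding, each batch is decoded as soon as its logits are available, so only one batch of logits is held at a time. This is a minimal sketch with stand-in functions; `compute_logits`, `decode`, and the toy data are hypothetical placeholders, not DeepSpeech APIs.

```python
def compute_logits(batch):
    # Stand-in for the acoustic model forward pass: one "logit row" per sample.
    return [[float(x)] * 3 for x in batch]


def decode(logits):
    # Stand-in for the decoder: collapses each logit row to a single value.
    return [sum(row) for row in logits]


def cached_test_epoch(batches):
    # Memory grows with the whole test set: every batch's logits are kept.
    all_logits = [compute_logits(b) for b in batches]
    return [t for logits in all_logits for t in decode(logits)]


def pipelined_test_epoch(batches):
    # Only one batch of logits is alive at a time; results are yielded as they come.
    for batch in batches:
        yield from decode(compute_logits(batch))
```

Both functions produce the same transcripts in the same order; the pipelined version trades the ability to revisit logits later for a memory footprint that no longer depends on test set size or alphabet size times the number of samples.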

Agenda (29/04/2019)

  • Announcements

    • Live access to TensorBoard?
  • Review of on-going work

    • Eren

      • Training on Mozilla voice dataset, various models. We've about 20 hours of (clean) data
        • Overfitting, data too clean
        • Using WaveRNN
        • Looking at pruning for future on-device use cases
    • Reuben

      • UTF-8 (German training on all German CV data)
      • May integrate "Portuguese streaming patch"
      • AISHELL tests
    • Alex

      • Meta-data exposed to all bindings!
      • Added NodeJS 12 support
      • Docker for fr training
      • WebSpeech API backed by on-device Deep Speech
    • Tilman

      • Back from PTO
      • Matrix run with no-noise, noise 1, and noise 2
      • Continuing no-noise on noise 1 w/more epochs
      • Augmented run with lower learning rate
      • Snakepit features [pit exec + pit (big) file transfer]
    • Kelly

      • 2-byte run w/ low LR
      • Working on getting more info on MR requirements
      • Working on a validation plan for when we obtain "MT" mandarin data
      • Met with legal on Telekom partnership
      • Translating Hindi data from Latin script to Devanagari script
      • Getting H2020 positions into GreenHouse and posted on Mozilla's jobs site
      • Deriving MOS scores from the latest TTS tests
      • 0.5.0 v1 training run
      • Preparing talk for re:publica
      • Preparing talk for re*work Boston
      • Preparing talk for LPSS

Agenda (15/04/2019)

  • Announcements

    • Alex: PTO
    • David Bryant: In Berlin this week
  • Review of on-going work

    • Eren

      • Start training on Mozilla voice dataset, various models. We've about 20 hours of (clean) data
      • Release for LJSpeech and WaveRNN models
      • Trying experiments with forward attention
      • Trying to create music with Tacotron!
    • Josh

      • UTF-8 (English as Multi-Byte idea)
      • Writing NSF grant proposal
      • Writing Mozilla Fellow grant proposal
    • Reuben

      • UTF-8 (German training on all German CV data)
        • Fixing importer to not throw away data
        • Remove punctuation from text
        • Longer runs
      • May integrate "Portuguese streaming patch"
    • Alex (PTO)

    • Tilman

      • Working on GPT-2 ideas/demo
      • Augmentation work training continuing
      • pit added various commands + bug fixes
    • Kelly

      • Kick-off meeting EU grant audit + payroll firm
      • Working on getting more info on MR requirements
      • Working on a validation plan for when we obtain "MT" mandarin data
      • Debugging the n-byte runs
      • H2020 grant poster design
      • Reading Neural Ordinary Differential Equations
      • Journal Club Presentation
      • Automatic Summarization demo
      • Small platform STT demo (VW Demo)

Agenda (08/04/2019)

  • Announcements

    • TBD
  • Review of on-going work

    • Eren

      • Start training on Mozilla voice dataset, various models. We've 15+ hours of (too clean!) data
      • Met with Telekom + Amazon on TTS, possible joint work with Telekom pending NDA
      • Working on new release for LJSpeech for Tacotron 2
      • Created demo of current voice reading "The Pocket Article" for MOS test against other commercial engines
      • Trying experiments with forward attention
      • Trying to create music with Tacotron
    • Josh

      • UTF-8 (English as Multi-Byte idea)
      • Writing NSF grant proposal
      • Writing Mozilla Fellow grant proposal
    • Reuben

      • Update Deep Speech to newer TF Data API's, TF MFCC's
      • Trained w/Deep Speech on newer TF Data API's, TF MFCC's, 1.13
      • Add version info to exported graphs
      • Adding to extended info, transcript probability
      • Fixing crash bug, no way to properly deallocate transcripts
      • UTF-8 (German training on all German CV data)
    • Alex

      • Working on WebSpeech API in Firefox (DeepSpeech server on GCP)
      • Windows Python bindings!
      • Add Windows Python packages to upload tasks
      • Selective registration and/or limiting CUDA compute capability to 3.5 to limit distro size
    • Tilman

      • Augmentation work continuing (Fixed bug in voice-corpus-tool)
      • Train, dev, test split of noise data
      • Fixing issue #2020
      • Working on GPT-2 ideas/demo
      • Fixing Common Voice 2.0 importer
    • Kelly

      • Kick-off meeting EU grant audit + payroll firm this Friday
      • Training initial Hindi models using GramVaani (100 hr) data set
      • Met with GIZ to work on structure of long term partnership
      • Working on getting more info on MR requirements (Meeting w/Janice tomorrow)

Agenda (18/03/2019)

  • Announcements

    • Kyrgyz Voice Technology Hackathon
  • Review of on-going work

    • Eren

      • Refactored BN changes
      • Collecting/reviewing voice talent data batch 21-22 (9.8 hours)
      • Training Nancy with T2+WaveRNN (Dropout hack seems to work a bit)
      • Working on WaveRNN precision w/Gaussian and Gaussians
      • Start training Tacotron on Mozilla voice dataset (we've 10 hours of (too clean!) data)
    • Josh

    • Reuben

      • TF 1.13 (Infra problems maybe solved!)
      • Updating Deep Speech to newer TF Data API's, TF MFCC's
      • Training w/Deep Speech on newer TF Data API's, TF MFCC's, 1.13
    • Alex

      • Working on WebSpeech API in Firefox
      • Deep Speech windows taskcluster support
      • Integrated Deep Speech into Firefox Reality browser!
    • Tilman

      • Augmentation work continuing
      • Fixing Common Voice 2.0 importer
      • Reading GPT-1 and GPT-2 papers
      • Journal club presentation on GPT papers
      • Reviewing PR 1919
    • Kelly

      • Meet w/IT to discuss WebSpeech API for Firefox DeepSpeech Backend Deployment
      • Reviewing PR 1919
      • Finally got permission to hire for the EU grant from Chris Beard
      • Negotiating contract with EU grant's audit + payroll firm (data protection terms)
      • Getting DS product requirements from MR
      • Hindi importer for GramVaani

Agenda (11/03/2019)

  • Announcements

  • Review of on-going work

    • Eren

      • Replace prenet dropout with BN for test time consistency
      • BN Leading to much improved models (Notice the breathing!)
      • Check next batch of recordings
      • Organize new batches from voice talent
      • Trained WaveRNN on Tacotron2 specs
      • Created pocket article with WaveRNN
      • Extract 10bit audio for WaveRNN training
      • Train WaveRNN 10bit
      • Implement state transfer for multiple sentence synthesis
      • Try Tacotron1 with BN prenet
    • Josh

    • Reuben

      • Immigration bureaucracy!
      • TF 1.13
      • Updating Deep Speech to newer TF API's
    • Alex

      • Experimenting with some model download logic in mozillaspeechlibrary/MR browser
      • WebSpeech API in Firefox
      • Deep Speech windows taskcluster support
      • Integrating Deep Speech into Firefox Reality browser
    • Tilman

      • Fixed and restarted augmentation script
      • Fixed and deployed fix for apt-daily service problem on workers
      • Started reading GPT-1 and GPT-2 papers
      • Coding on pit exec
    • Kelly

      • Met with Goethe & Aaron.ai (DS + CV partner?)
      • Met with Mycroft
      • Writing Hindi importer (Handed off to absin1)
      • Administrative tasks for EU grant for NMT
        • Unsticking payment to finance management firm
        • CASA ticket for new auditor + payroll provider
        • Obtained new auditor + payroll provider contract
      • Starting the process of obtaining new worker servers
        • Two quotes from Server Bau, 8xTitan RTX and 8xRTX 2080TI
        • Two quotes from BOXX, 8xTitan RTX and 8xRTX 2080TI

Agenda (25/02/2019)

  • Announcements

    • Kigali Hackathon Blog Post
    • VW Progress
    • New Nodes (Model size question)
    • Update on Cluster status
  • Review of on-going work

    • Eren

      • Train new master with Nancy
      • Transfer learning for Mozilla voice
      • Check batch 6-8 recordings
      • Organize new batches from voice talent
      • Enable process based multi-GPU training for TTS on cluster
      • Train Tacotron2 on LJSpeech
    • Josh

      • Got UTF-8 working on cluster for Slovenian
      • Working out alphabet.txt and LM issues for zh
    • Reuben

      • Immigration bureaucracy!
    • Alex

      • Experimenting with some model download logic in mozillaspeechlibrary/MR browser
      • Identifying broken sentences on Common Voice
      • Validated 3600+ french sentences on Sentence Collector tool
    • Tilman

      • Some fixes on the cluster (proxy, apt settings)
      • Working on job-individual pairing of workers with their daemon
    • Kelly

      • Wrote CorporaCreator PR to remove all sentences with digits
      • Wrote CorporaCreator PR to sync documentation with removal of all sentences with digits
      • Wrote letter of support for National Library of Wales project to use DS + CV for transcribing their Welsh holdings
      • Re-negotiated completion date of Voice Talent's contract
      • Supplied re-formatted sentences to Voice Talent w/out phonetic spelling to speed their workflow
      • Sent DS TFLite demo to VW
      • Wrote up instructions for VW on using the DS TFLite demo
      • Walked VW through installation of the demo
      • Agreed to joint Mozilla + GIZ presentation at re:publica
      • Talked to OI on details of the simplified Chinese Mandarin text corpus
      • Reading BERT for Journal club
      • Interviewing payroll providers + obtaining quotes

Agenda (11/02/2019)

  • Announcements

    • Kigali Hackathon
    • HD failure on cluster (Fixed. Thanks to Tilman!)
    • Update on Cluster status
  • Review of on-going work

    • Eren

      • Creating corpus for TTS voice talent
      • Released TTS on LJSpeech!
        • Smart init for RNN
        • Queuing of frame length
    • Josh (US Holiday)

      • Background reading on Chinese ASR
      • Waiting for cluster for UTF8 runs
    • Reuben

      • Getting streaming decoder re-based + working
        • Implies API changes
        • Implies beam issues of intermediate vs final decode
      • Starting DS tests on the cluster
    • Alex (Did not attend)

    • Tilman

      • Updated server to newest snakepit
      • Need to figure out CUDA TF version questions on cluster
      • Fixed HD!
    • Kelly

      • EU Grant Finance + Management signed 3 year contract
      • Got finance approval for 2 new headcounts for EU grant
      • Interviewing payroll + audit providers
      • Talked with VW on DS use
      • Talked with OI + VW on Mandarin data

Agenda (11/02/2019)

  • Announcements

    • Kelly at Kigali Hackathon
    • HD failure on cluster
  • Review of on-going work

    • Eren

      • Creating corpus for TTS voice talent
        • Different corpora from open TTS datasets + Common Voice
        • Trying to find optimal set of sentences to give to talent
      • Trained TTS on LJSpeech
        • Trying phonemes instead of graphemes
        • It led to overfitting
      • Started writing a wiki page for TTS repo with information on training/dataset quality
      • Second meeting with voice talent on Thursday, to try out a new batch with fewer weird words
    • Josh

      • Background reading on Chinese ASR
      • Waiting for cluster for UTF8 runs
    • Reuben

      • Getting CuDNN RNN working on DeepSpeech + TF 1.13 (2x faster training)
        • Benchmarking clients with changes to model
        • Need to verify speedup on cluster machines
    • Alex (Did not attend)

    • Tilman

      • Updated server to newest snakepit
      • Pretty much done, now figuring out networking capabilities for multi-node jobs
        • When running multi-worker jobs, when jobs are stopped the whole head node can crash, taking down all jobs
        • LXD API does not help with preventing it
        • Trying to find a working alternative for creating virtual networks between the nodes
        • Other alternative is to drop network isolation between jobs for now
        • Will try some experiments with Josh and Eren's jobs
      • Beeping sound in server room, one of the hard drive bays was showing a failure
        • Took out failing drive and ordered a replacement
        • Replacement comes in on Wednesday
    • Kelly (Did not attend)

Agenda (04/02/2019)

  • Announcements

    • Common Voice + Deep Speech Hackathon in Kigali next week
    • Common Voice work week in Berlin
  • Review of on-going work

    • Eren

      • Started work on M-AI Labs dataset
        • Getting en-UK working (Data set too noisy)
        • Audiobooks with different speakers, en-UK is single speaker (Data set too noisy)
        • Different voice style for different books, so not very good for TTS
      • Training on German dataset
        • Same problems as en-UK
      • Preparing for a small general release of TTS
        • Blog post about TTS changes
      • Switching vocoder when cluster is updated
      • Creating corpus for TTS voice talent
    • Josh

      • MAML + Bytes are all you need....
      • Settling back in
    • Reuben

      • Bytes for Deep Speech
      • Pytorch deployment (Looking at jit and the like)
      • Automation of Windows builds
      • Arch exploration (Bottleneck at 3rd layer)
    • Alex (PTO)

      • Working on Java/Android support
        • PR is ready for review
        • Has Android 7.0 and 8.1 APKs
        • Prepares things for publishing on Maven
        • Getting help from Android Components team
        • PR also adds Android tests (running in x86 emulator)
        • Looking into making tests faster on AWS
      • Postponed cleaning up SWIG generated types
      • Built and ran mozillaspeech library with DeepSpeech in Firefox Reality
    • Tilman

      • Updating server to newest snakepit
    • Kelly

      • PR's for CorporaCreator
      • Prepping for Common Voice + Deep Speech Hackathon in Kigali
      • Common Voice work week this week in Berlin
      • TTS Voice talent starts this week

Agenda (28/01/2019)

  • Announcements

    • EU funds in the bank for NMT project
    • Welsh Language Technology Conference
    • MAML + Bytes Paper at InterSpeech (Deadline: Mar 29) or ACL (Deadline: Mar 4)?
  • Review of on-going work

    • Eren (PTO)

      • Started work on M-AI Labs dataset
        • Getting en-UK working
        • Audiobooks with different speakers, en-UK is single speaker
        • Different voice style for different books, so not very good for TTS
      • Training on German dataset
        • Same problems as en-UK
      • Preparing for a small general release of TTS
        • Blog post about TTS changes
    • Josh

      • MAML + Bytes are all you need....
    • Reuben

      • Bytes for Deep Speech
    • Alex (PTO)

      • Working on Java/Android support
        • PR is ready for review
        • Has Android 7.0 and 8.1 APKs
        • Prepares things for publishing on Maven
        • Getting help from Android Components team
        • PR also adds Android tests (running in x86 emulator)
        • Looking into making tests faster on AWS
      • Postponed cleaning up SWIG generated types
      • Built and ran mozillaspeech library with DeepSpeech in Firefox Reality
    • Tilman

      • Final stages of refactoring Snakepit
        • Code is pretty much complete, now debugging
        • Debugging sequelize queries
        • Added in queries to filter jobs
        • Added ability to add job info to display
        • Added continuation
        • Importing old job data into DB
      • Looking for material for Journal Club presentation
    • Kelly

      • PR's for CorporaCreator
        • Fixing de quotes
        • langs flag
        • isset correction
        • Reviewing XOR user splitting method of train,dev,test
      • Attended + Presented at Welsh Language Technology Conference
      • Attended MoFo meeting on GIZ partnership
      • Sync Meeting with Andreas Boven's team looking at new form factors for Firefox

Agenda (21/01/2019)

  • Announcements

    • Kelly at Browser Translation Kick-Off meeting
  • Review of on-going work

    • Eren

      • Started work on M-AI Labs dataset
        • Getting en-UK working
        • Audiobooks with different speakers, en-UK is single speaker
        • Different voice style for different books, so not very good for TTS
      • Training on German dataset
        • Same problems as en-UK
      • Preparing for a small general release of TTS
        • Blog post about TTS changes
    • Josh

      • ICML paper
        • German finished but performs very badly
        • Maybe due to language model deficiencies
        • Interpretability of different layers
        • Ran alphabet shuffling experiments but can't yet make sense of it
        • Finishing paper text today
    • Reuben

      • Transfer learning experiments on transferring DeepSpeech to speech/non-speech classification
      • Journal club presentation on "Bytes are all you need"
      • TTS Pytorch JIT in C++ (on hold)
    • Alex

      • Working on Java/Android support
        • PR is ready for review
        • Has Android 7.0 and 8.1 APKs
        • Prepares things for publishing on Maven
        • Getting help from Android Components team
        • PR also adds Android tests (running in x86 emulator)
        • Looking into making tests faster on AWS
      • Postponed cleaning up SWIG generated types
      • Built and ran mozillaspeech library with DeepSpeech in Firefox Reality
    • Tilman

      • Final stages of refactoring Snakepit
        • Code is pretty much complete, now debugging
        • Debugging sequelize queries
      • Looking for material for Journal Club presentation
    • Kelly (Traveling - Browser Translation Kickoff Meeting)

      • Could not attend

Agenda (07/01/2019)

  • Announcements

    • ICML paper
    • Hindi Data Set
    • Common Voice alpha data set release
    • Deep Speech release 0.4.0 and 0.4.1!
  • Review of on-going work

    • Eren

      • Phoneme based training
      • Starting on AI Labs data sets
      • Selected Voice123 voices to contract
      • Trained another network with TWEB dataset
      • Created demo read article for Firefox Listen MOS study
    • Josh (Traveling)

      • ICML paper
      • Multi-Task learning
      • Common Voice data set release
    • Reuben (PTO)

      • TTS Pytorch JIT in C++
      • "Bytes are all you need"
    • Alex (PTO)

      • Maven integration for Android
      • Testing for Android on Task Cluster
      • Deep Speech release 0.4.0 and 0.4.1
    • Tilman

      • Snakepit LXD integration
      • Snakepit MySQL integration
    • Kelly

      • ICML paper
      • Common Voice alpha data set release
      • Next Deep Speech release 0.4.0 and 0.4.1
      • H2020 Stuff (Hiring Financial manager, 2 Headcounts, payroll service + getting EU funds + logo design + Domain Name + WebSite)
      • Building many Probing and Trie language models for v0.5.0 to benchmark
      • Benchmarking Probing and Trie language models for v0.5.0

Agenda (07/01/2019)

  • Announcements

    • ICML papers
    • Hindi Data Set
    • Common Voice data set release
    • Next Deep Speech release
      • Trained the new model (Best model yet! 8.26% WER on Librispeech clean test)
      • Release notes updated
  • Review of on-going work

    • Eren

      • Phoneme based training
      • Wrote a Google Colab notebook for training TTS
      • Replaced dropout with RReLU
      • Trained another network with TWEB dataset
    • Josh

      • ICML papers
      • Multi-Task learning
      • Common Voice data set release
    • Reuben

      • Next Deep Speech release
      • TTS Pytorch JIT in C++
      • "Bytes are all you need"
    • Alex

      • Reading Email! :-)
      • Next Deep Speech release
      • Maven integration for Android
      • Testing for Android on Task Cluster
    • Tilman

      • Snakepit LXD integration
      • Snakepit MySQL integration
      • Common Voice data set release
    • Kelly

      • ICML papers
      • H2020 Stuff
      • Next Deep Speech release
      • Common Voice data set release
      • Recruiting H2020 project + financial manager
      • Building many Probing and Trie language models for v0.5.0 to benchmark
      • Benchmarking Probing and Trie language models for v0.5.0
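
For readers unfamiliar with the WER figure quoted above (8.26% on Librispeech clean test), the following is a minimal sketch of how word error rate is commonly computed: the word-level edit distance between reference and hypothesis, divided by the number of reference words. The function name and the whitespace tokenization are assumptions made for illustration; this is not the evaluation code the team used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration only; tokenization is a plain
    whitespace split, which is an assumption.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

A corpus-level figure is usually obtained by summing edit distances and reference lengths over all utterances before dividing, rather than averaging per-utterance rates.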

Agenda (10/12/2018)

  • Announcements

    • Demos!
    • Presentations!
    • Next Deep Speech release
      • Train new model
      • Release notes
  • Review of on-going work

    • Eren

      • TTS distributed training
      • Updating to Pytorch 1.0.0
    • Josh

      • NAACL paper
      • Multi-Task learning
      • Common Voice data set release
    • Reuben

      • Windows PR
      • Embedding of meta-data
      • Training for next Deep Speech release
      • Issue 1744 (Increase mfcc step size)
    • Alex (Sick)

      • Integrating Deep Speech into Firefox Reality
      • Next Deep Speech release
    • Tilman

      • Snakepit LXD integration
      • Common Voice data set release
    • Kelly

      • Common Voice data set release
      • Recruiting H2020 project + financial manager
      • Building many Probing and Trie language models for v0.4.0 to benchmark
      • Benchmarking Probing and Trie language models for v0.4.0

Agenda (19/11/2018)

  • Announcements

    • Demos
      • TTS [Eren]
      • Snakepit [Tilman]
      • STT Streaming [Reuben]
      • AutomaticSummarization [Kelly]
      • STT in Firefox Reality [Alex]
    • Demo Hardware
      • Bluetooth Speaker for Tilman? [Kelly]
      • Bluetooth Headphones for Eren [Kelly]
      • Bluetooth Headphones + Microphone for Reuben [Kelly]
      • USB-C Headset with Mic for Alex [Kelly]
    • Presentations
      • TTS [Eren]
      • STT [Reuben]
      • Snakepit [Tilman]
      • AutomaticSummarization [Kelly]
      • STT in Firefox Reality? [Alex]
    • Nancy Export [Done]
    • Fisher Re-Export [Done]
    • Switchboard Export [Doing]
  • Review of on-going work

    • Eren

      • End-to-end Tacotron + WaveRNN training
      • Exploring info bottleneck of Tacotron
      • Switching parts of Tacotron to Tacotron2 to find problems
    • Josh

      • Learning pit
      • DS Training on Chuvash
      • Data massaging for Chuvash + other languages
    • Reuben

      • Expose ctcdecode to Python
      • Snapdragon 835 port of Deep Speech
    • Alex

      • Investigating TFLite
      • Snapdragon 835 port of Deep Speech
      • Investigating TFLite on Android using NNAPI
      • Integrating Deep Speech into Firefox Reality
      • Investigating TFLite benchmarking (Pixel2 right now at half-realtime)
    • Tilman

      • Starting training off augmented data sets HDF5 files
      • Exploring LXD integration
    • Kelly

      • Recruiting H2020 project + financial manager
      • Reviewing H2020 Coalition Agreement for Mozilla specific issues
      • De-Risking and risky H2020 tasks (Project Management, Financial Management....)
      • Building many Probing and Trie language models for v0.4.0 to benchmark
      • Benchmarking Probing and Trie language models for v0.4.0

Agenda (19/11/2018)

  • Announcements

    • Demos
      • TTS [Eren]
      • Esper? [Reuben]
      • Snakepit [Tilman]
      • STT Streaming [Reuben]
      • STT Tatar, Kyrgyz...? [Josh]
      • STT in Firefox Reality [Alex]
    • Failures of TTS cluster based training
    • Fisher re-export
    • Pipsqueak's home?
  • Review of on-going work

    • Eren

      • End-to-end Tacotron + WaveRNN training
      • Exploring info bottleneck of Tacotron
      • Switching parts of Tacotron to Tacotron2 to find problems
    • Josh

      • Learning pit
      • DS Training on Chuvash
      • Data massaging for Chuvash + other languages
    • Reuben

      • Expose ctcdecode to Python
      • Snapdragon 835 port of Deep Speech
    • Alex

      • Investigating TFLite
      • Snapdragon 835 port of Deep Speech
      • Investigating TFLite on Android using NNAPI
      • Integrating Deep Speech into Firefox Reality
      • Investigating TFLite benchmarking (Pixel2 right now at half-realtime)
    • Tilman

      • Starting training off augmented data sets HDF5 files
      • Exploring LXD integration
    • Kelly

      • Recruiting H2020 project + financial manager
      • Reviewing H2020 Coalition Agreement for Mozilla specific issues
      • De-Risking and risky H2020 tasks (Project Management, Financial Management....)
      • Building many Probing and Trie language models for v0.4.0 to benchmark
      • Benchmarking Probing and Trie language models for v0.4.0

Agenda (12/11/2018)

  • Announcements

    • Pipsqueak's home?
    • Josh at Indiana University this week to work on NAACL-HLT Deep Speech paper
  • Review of on-going work

    • Eren

      • End-to-end Tacotron + WaveRNN training
      • Exploring info bottleneck of Tacotron
      • Switching parts of Tacotron to Tacotron2 to find problems
    • Josh

      • Learning pit
      • DS Training on Chuvash
      • Data massaging for Chuvash + other languages
    • Reuben

      • Expose ctcdecode to Python
      • Snapdragon 835 port of Deep Speech
    • Alex

      • Investigating TFLite
      • Snapdragon 835 port of Deep Speech
      • Investigating TFLite on Android using NNAPI
      • Integrating Deep Speech into Firefox Reality
      • Investigating TFLite benchmarking (Pixel2 right now at half-realtime)
    • Tilman

      • Starting training off augmented data sets HDF5 files
      • Exploring LXD integration
    • Kelly

      • Recruiting H2020 project + financial manager
      • Reviewing H2020 Coalition Agreement for Mozilla specific issues
      • De-Risking and risky H2020 tasks (Project Management, Financial Management....)
      • Building many Probing and Trie language models for v0.4.0 to benchmark
      • Benchmarking Probing and Trie language models for v0.4.0

Agenda (5/11/2018)

  • Review of on-going work

    • Eren

      • End-to-end Tacotron + WaveRNN training
      • Exploring info bottleneck of Tacotron
      • Switching parts of Tacotron to Tacotron2 to find problems
    • Josh

      • Learning pit
      • DS Training on Kyrgyz
      • Data massaging for Kyrgyz + other languages
    • Reuben

      • Expose ctcdecode to Python and use it in evaluate.py
      • Snapdragon 835 port of Deep Speech
        • Running into many Op bugs
        • Starting simple CNN STT runs to test RNN alternatives
        • Starting exploring alternatives for broken Ops
    • Alex

      • Investigating TFLite
      • Upgrade of OSX Build Infra
      • Updated TF to newest version
      • Investigating TFLite benchmarking (Pixel2 right now at half-realtime)
      • Investigating TFLite on Android using NNAPI (Some ops not supported)
      • Reviewing "Expose ctcdecode to Python and use it in evaluate.py" PR
    • Tilman

      • Starting training off augmented data sets HDF5 files
      • Exploring LXD integration
    • Kelly

      • Creating/Reviewing 3 year/2019 Language Based Assistants Plan
      • Reviewing "Expose ctcdecode to Python and use it in evaluate.py" PR
      • Recruiting H2020 project + financial manager
      • Reviewing H2020 Grant Agreement for Mozilla specific issues
      • De-Risking and risky H2020 tasks (Project Management, Financial Management....)
      • Building many Probing and Trie language models for v0.4.0 to benchmark
      • Benchmarking Probing and Trie language models for v0.4.0

Agenda (15/10/2018)

  • Announcements

    • v0.3.0 Release!
    • MLPerf wants to use Deep Speech (Wants quantized weights)
    • Josh Meyer joins us today as an intern!
  • Review of on-going work

    • Josh

      • Orientation
      • Journal Club Presentation
    • Eren

      • Tacotron + WaveRNN
      • Starting on Tacotron2 implementation (Alignment seems to fail)
      • Switching parts of Tacotron to Tacotron2 to find problems
    • Alex

      • Investigating TFLite
      • Upgrade of OSX Build Infra
      • Updated TF to newest version
      • Investigating TFLite benchmarking (Pixel2 right now at half-realtime)
      • Investigating TFLite on Android using NNAPI (Some ops not supported)
    • Tilman

      • Starting training off augmented data sets HDF5 files
      • Backing out file system
      • Exploring LXD
    • Reuben

      • New CTC algorithm implementation in native client
      • Python binding of CTC algorithm
      • Snapdragon 835 port of Deep Speech
        • Running into many Op bugs
        • Starting simple CNN STT runs to test RNN alternatives
        • Starting exploring alternatives for broken Ops
    • Kelly

      • Recruiting H2020 project + financial manager
      • Reviewing H2020 Grant Agreement for Mozilla specific issues
      • De-Risking and risky H2020 tasks (Project Management, Financial Management....)
      • Building many Probing and Trie language models for v0.4.0 to benchmark
      • Benchmarking Probing and Trie language models for v0.4.0
  • Announcements

    • v0.3.0 Release
      • Create release notes
      • Updating README.md's
      • Optimization of lm_weight and valid_word_count_weight
      • Update lm_weight and valid_word_count_weight in repo
      • Find performance for separate LS clean, LS other, and CV (Nice to Have)
      • Testing stuff by hand (Checkpoint, model, Issue 1645 [0.1.X, w/o LM...]....)
  • Review of on-going work

    • Eren

      • Wrote caching data loader
      • Wrote data loader compatible with common TTS data sets
      • Integrated generic data loader with branches
      • Starting on Tacotron2 implementation as Tacotron seems the bottleneck
    • Alex

      • French STT starting with English model (Transfer learning)
      • Investigating TF profiler
      • Preparing to simplify build steps on OS X
    • Tilman

      • Generating augmented data sets HDF5 files
      • Starting training off augmented data sets HDF5 files
    • Reuben

      • Setting up TCN runs
      • Snapdragon 835 port of Deep Speech
        • Running into many Op bugs
        • Starting simple CNN STT runs to test RNN alternatives
        • Starting exploring alternatives for broken Ops
      • Optimization of lm_weight and valid_word_count_weight
    • Kelly

      • Recruiting H2020 project + financial manager
      • Editing H2020 grant to clarify Mozilla deliverables
      • Creating Common Voice slides for "Jugend hackt" on Wednesday
      • Reviewing H2020 Annotated Model Grant Agreement for Mozilla specific issues
      • De-Risking and risky H2020 tasks (Project Management, Financial Management....)
      • Building many Probing and Trie language models for v0.4.0 to benchmark
      • Benchmarking Probing and Trie language models for v0.4.0 with
        • LibriSpeech Other
        • LibriSpeech Clean
        • Common Voice
      • Update lm_weight and valid_word_count_weight in repo

Agenda (08/10/2018)

  • Announcements

    • v0.3.0 Release
      • Create release notes
      • Updating README.md's
      • Optimization of lm_weight and valid_word_count_weight
      • Update lm_weight and valid_word_count_weight in repo
      • Find performance for separate LS clean, LS other, and CV (Nice to Have)
      • Testing stuff by hand (Checkpoint, model....)
    • Server update, wait on test of LS clean + LS other
  • Review of on-going work

    • Eren

      • Long training run of WaveRNN (Training to convergence)
      • Training NVIDIA WaveNet
        • Fast at inference, slow at training
        • Can benefit from the cluster
        • Training beyond stop token overfitting as this happens early
        • Found trick, place space before punctuation to improve pronunciation
        • Got a license for Nancy TTS corpus (High quality recording with little echo and background noise)
    • Alex

      • French STT starting with English model (Transfer learning)
      • Trying to switch to gcc7.2 for armv7/aarch64
      • Investigating TF profiler
    • Tilman

      • httpfs + snakepit integration
      • Voice augmentation tooling
        • Generates HDF5 files
        • Put in relative path handling
        • Added pre-computed duration tag to CSV
        • Optimizing so that things complete in a reasonable time
        • Added artifact creation from down/up sampling, formats...
        • Fixing thread pool bug in python (Subprocess of thread in pool hangs)
    • Reuben

      • Snapdragon 835 port of Deep Speech
        • Running into many Op bugs
        • Starting simple CNN STT runs to test RNN alternatives
        • Starting exploring alternatives for broken Ops
      • Optimization of lm_weight and valid_word_count_weight
      • Update lm_weight and valid_word_count_weight in repo

    • Kelly

      • Recruiting H2020 project + financial manager
      • Editing H2020 grant to clarify Mozilla deliverables
      • Creating QBR slides for STT, TTS, NMT, Summarization, Common Voice...
      • Reviewing H2020 Annotated Model Grant Agreement for Mozilla specific issues
      • De-Risking and risky H2020 tasks (Project Management, Financial Management....)
      • All-Hands mic, amp, mixer... setup
      • Building many Probing and Trie language models for v0.4.0 to benchmark
      • Benchmarking Probing and Trie language models for v0.4.0 with
        • LibriSpeech Other
        • LibriSpeech Clean
        • Common Voice
      • Writing blog post for Common Voice landing in Amazon's Public Data sets
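
Eren's trick above (placing a space before punctuation to improve pronunciation) can be sketched as a small text-normalization step applied before the text reaches the TTS front end. The function name, the set of punctuation marks, and the regular expression are assumptions for illustration; the notes do not record how the trick was actually implemented.

```python
import re

# Assumed set of punctuation marks; the notes do not say which ones were used.
_PUNCTUATION = re.compile(r"\s*([.,!?;:])")


def space_before_punctuation(text: str) -> str:
    """Ensure exactly one space precedes each punctuation mark.

    Hypothetical helper: any whitespace already in front of the mark is
    collapsed, so applying the function twice gives the same result.
    """
    return _PUNCTUATION.sub(r" \1", text)


if __name__ == "__main__":
    print(space_before_punctuation("Hello, world!"))
```

This naive version also splits abbreviations and decimal numbers (for example "3.14"), so a real pipeline would likely need to exclude those cases.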

Agenda (01/10/2018)

  • Announcements

    • v0.2.1 Release
      • Create release notes
      • Updating README.md's
      • Optimization of lm_weight and valid_word_count_weight
      • Update lm_weight and valid_word_count_weight in repo
      • Find performance for separate LS clean, LS other, and CV (Nice to Have)
      • Testing stuff by hand (Checkpoint, model....)
    • Cluster instabilities?
  • Review of on-going work

    • Eren

      • WaveRNN vocoder
      • Combined the techniques that gave positive results
      • Trained model
      • Released model
      • In the background FFTNet
      • Attention scaling experiments 5!
    • Alex

      • Moving to Tensorflow 1.11
      • Fixed non-deterministic output with the streaming model
      • Fixed intermittent test failure on prod models for NodeJS and Python
      • Enforced same sox options as libsox for C++ client
      • Optimizing trie loading
    • Tilman

      • httpfs + snakepit integration
      • snakepit changes[1]
      • Voice augmentation tooling
      • Cluster crashes
    • Reuben

      • Snapdragon 835 port of Deep Speech using Qualcomm's SDK (TFlite experiments)
      • Optimization of lm_weight and valid_word_count_weight
      • Update lm_weight and valid_word_count_weight in repo
    • Kelly

      • Recruiting H2020 project + financial manager
      • Editing H2020 grant to clarify Mozilla deliverables
      • Creating QBR slides for STT, TTS, NMT, Summarization, Common Voice...
      • Reviewing H2020 Annotated Model Grant Agreement for Mozilla specific issues
      • De-Risking and risky H2020 tasks (Project Management, Financial Management....)
      • All-Hands mic, amp, mixer... setup
      • Building many Probing and Trie language models for v0.4.0 to benchmark
      • Benchmarking Probing and Trie language models for v0.4.0 with
        • LibriSpeech Other
        • LibriSpeech Clean
        • Common Voice
      • Writing blog post for Common Voice landing in Amazon's Public Data sets
      • Setting up AMI's on Amazon for blog post for Common Voice landing in Amazon's Public Data sets

Agenda (24/09/2018)

  • Announcement

    • v0.2.0 Released!
    • Streaming blog post going out!
  • Review of on-going work

    • Eren (PTO)

      • WaveNet vocoder
      • Binary convergence loss
      • Larger Attention filters
      • Softmax predictions
      • Loc-alignment with only average history
      • Bahdanau attention
    • Alex (Conference)

      • Moving to Tensorflow 1.11rc's
      • Hosting a conference in the Paris office
    • Tilman

      • httpfs + snakepit integration
    • Reuben

      • Snapdragon 835 port of Deep Speech using Qualcomm's SDK (TFlite experiments)
    • Kelly

      • All-Hands mic, amp, mixer... setup
      • Building many Probing and Trie language models for v0.3.0 to benchmark
      • Benchmarking Probing and Trie language models for v0.3.0 with
        • LibriSpeech Other
        • LibriSpeech Clean
        • Common Voice
      • Writing blog post for Common Voice landing in Amazon's Public Data sets
      • Setting up AMI's on Amazon for blog post for Common Voice landing in Amazon's Public Data sets
      • Purchasing Mandarin data sets (LDC and other options)
      • Creating Mandarin data sets (MTurk and other options)

Agenda (17/09/2018)

  • Announcement

    • Streaming blog post going out tomorrow
    • v0.2.0 Release
      • Updating README.md's
      • Testing of rounded model
      • Optimization of lm_weight and valid_word_count_weight
      • Update lm_weight and valid_word_count_weight in repo
      • Update lm_weight and valid_word_count_weight in blog post
      • Find performance for separate LS clean, LS other, and CV (Nice to Have)
      • Getting PyGithub to upload
      • Finalize blog post (Link to original 10% blog post)
      • Finalize release notes (Linking to new API, Mention feature caching increases RAM needed...)
      • Merge Feature caching PR
      • Merge Language Models PR
      • Testing stuff by hand (Checkpoint, model....)
  • Review of on-going work

    • Eren

      • Binary convergence loss
      • Larger Attention filters
      • Softmax predictions
      • Loc-alignment with only average history
      • Bahdanau attention
    • Alex

      • Discourse/github support
      • TF Master merge
      • Tool to push to GitHub
      • Auto upload of assets to GitHub for release
      • TFLite port of Deep Speech on to Pixel 2
    • Tilman

      • httpfs + snakepit integration
      • Adding "pit ls" and "pit cp" to ls and cp files from the server \o/!
      • Working on freesound.org samples (down/up sampling, collecting metadata tags in csv....)
    • Reuben

      • Streaming blog post
      • Creating 0.2.0 release model
      • Rounding 0.2.0 release model
      • Benchmarking 0.2.0 release model
      • Benchmarking 0.2.0 rounded release model
      • Merging quantized trie language model
      • Snapdragon 835 port of Deep Speech using Qualcomm's SDK
    • Kelly

      • All-Hands mic, amp, mixer... setup
      • Administrative tasks for Horizon 2020 Grant
      • NSF + Visa + IP + Other HR Hell
      • Helping with 0.2.0 release tasks
      • Obtaining Mandarin data sets (THCHS-30, MAT-2000Com, MAT-2500ExtV-Com, TCC-300Com...)
      • Purchasing Mandarin data sets (LDC and other options)
      • Creating Mandarin data sets (MTurk and other options)
      • Writing blog post for Common Voice landing in Amazon's Public Data sets
      • Setting up AMI's on Amazon for blog post for Common Voice landing in Amazon's Public Data sets

Agenda (06/08/2018)

  • Review of on-going work

    • Eren (OoO)

      • Setup TTS server
      • Computer Vision Conference
    • Alex

      • Common Voice Kiosk mode
      • Upgrading VMware Fusion
      • Moving hardware to a new home
      • Discourse/github support
      • TFLite port of Deep Speech on to Pixel 2
    • Tilman

      • httpfs + snakepit integration
      • Working on freesound.org samples (down/up sampling, collecting metadata tags in csv....)
    • Reuben

      • Clear Captions Time Estimate
      • Common Voice Data test training
      • 0.2.0 Optimizations (Pre-processing)
      • 0.2.0 Small bug fixes from alpha testers
      • Snapdragon 835 port of Deep Speech using Qualcomm's SDK
    • Kelly

      • All-Hands mic, amp, mixer... setup
      • Administrative tasks for Horizon 2020 Grant
      • Getting LXC containers to work on the cluster
      • Forced alignment of NPR data using Gentle
      • NSF + Visa + IP + Other HR Hell
      • Obtaining Mandarin data sets (THCHS-30, MAT-2000Com, MAT-2500ExtV-Com, TCC-300Com...)
      • Purchasing Mandarin data sets
      • Writing blog post for Common Voice landing in Amazon's Public Data sets

Agenda (06/08/2018)

  • Review of on-going work

    • Alex (PTO)

    • Eren (PTO)

    • Tilman

      • Added 4 GPU machine to staging env
      • Working on freesound.org samples (down/up sampling, collecting metadata tags in csv....)
    • Reuben

      • Common Voice Data
        • Data cleaning (HTML instead of text....)
        • Gotten through one epoch
        • Starting training w/F,SW,L+CV
      • 0.2.0 Optimizations (Pre-processing)
      • 0.2.0 Small bug fixes from alpha testers
      • Streaming architecture blog post
      • Snapdragon 835 port of Deep Speech starting Qualcomm vs Tensorflow
    • Kelly

      • Setting up STT server for MozCast
      • All-Hands setup starting
      • Administrative tasks for Horizon 2020 Grant
      • Getting LXC containers to work on the cluster
      • Forced alignment of NPR data using Gentle
      • NSF + Visa + IP + Other HR Hell
      • Obtaining Mandarin data sets (THCHS-30, MAT-2000Com, MAT-2500ExtV-Com, TCC-300Com...)

Agenda (06/08/2018)

  • Review of on-going work

    • Alex (PTO)

    • Tilman (PTO)

    • Reuben

      • Collating 0.3.0 changes for merge
      • 0.3.0 Optimizations (Pre-processing)
      • Reading ClariNet (TTS)
    • Eren

      • Profiled TTS code with cProfile
      • Solved TTS server quality
      • Working on new checkpoint release for TTS
      • Audio enhancement to improve Griffin-Lim quality (EnhanceNet, arXiv:1703.09452)
      • Working on blog post for TTS
    • Kelly

      • NSF + Visa + IP + Other HR Hell
      • Voice Talent Contract
      • Uploading Rishi's data
      • Pre-processing Rishi's data
      • Obtaining Mandarin data sets (THCHS-30, MAT-2000Com, MAT-2500ExtV-Com, TCC-300Com...)
      • Setting up demo TTS server for internal use
      • Setting up demo STT server for internal use
      • Gathering STT requirements for Mozcast
      • Gathering STT requirements for MR
    • Rishi

      • Summarization
        • Code complete for new ideas
        • Training on Newsroom with various hyperparameters
        • Getting really good results on Newsroom (Adjusting learning rate, number of layers, embedding dimension)
        • Starting on Transformer (Something a bit buggy)

Agenda (23/07/2018)

  • Announcement

    • Tilman talking at Journal Club
  • Review of on-going work

    • Reuben (PTO)

      • Collating 0.3.0 changes for merge
      • 0.3.0 Optimizations (Pre-processing)
      • Reading ClariNet (TTS)
    • Eren

      • Trained TTS on LG Speech data set and released model
      • Starting to use LVS vocoder library
      • Started training FFTNet again
      • Starting to train on Mycroft data (Multiple Speech Samples)
      • Running into load speed problems for LG Speech
    • Tilman

      • Working on fixing Eren's efficiency problem (Seems to be firejail problem)
      • Debugging w/Eren
      • FUSE Filesystem to get around firejail NFS problems
    • Kelly

      • NSF + Visa + IP + Other HR Hell
      • Voice Talent Contract
    • Rishi*

      • Summarization
        • Code complete for new ideas
        • Issues w/beam search + ROUGE solved
        • Training on Newsroom with various hyperparameters
        • Getting really good results on Newsroom (Adjusting learning rate, number of layers, embedding dimension)
        • Starting on Transformer (Something a bit buggy)

Agenda (23/07/2018)

  • Announcement

    • Taipei
  • Review of on-going work

    • Reuben

      • Collating 0.3.0 changes for merge
      • 0.3.0 Optimizations (Pre-processing)
      • Reading ClariNet (TTS)
    • Eren (Sick)

    • Tilman

      • Working on fixing Eren's efficiency problem
      • Server install
      • Debugging w/Eren
      • FUSE Filesystem to get around firejail NFS problems
    • Kelly

      • NSF + Visa + IP + Other HR Hell
      • Voice Talent Contract
    • Rishi*

      • Summarization
        • Code complete for new ideas
        • Issues w/beam search + ROUGE solved
        • Data needs to be copied to server (preprocessing bad + data slightly corrupt)

Agenda (16/07/2018)

  • Announcement

    • Kelly in Taipei this week
    • Alex on parental leave Aug 1 - Sep 4
    • Switching ordering of 0.3.0 and 0.2.0 releases, streaming comes first!
  • Review of on-going work

    • Reuben

      • Collating 0.3.0 changes for merge
    • Eren

      • FFTNET experiments
      • Working on pre-processing
      • Starting working with new data sets
      • Starting trying NVIDIA WaveNet on cluster
    • Tilman

      • Working on fixing Eren's efficiency problem
    • Kelly

      • Meeting with Taiwanese government on CV+DS
      • Meeting with Taiwanese press on CV+DS
      • Meeting with Taiwanese community on CV+DS
      • Meeting with Taiwanese linguists on CV text corpus collection
    • Rishi*

      • Summarization
        • OpenNMT now supports multi-GPUs!
        • Starting implementing improvements to OpenNMT in LaTeX doc

Agenda (09/07/2018)

  • Announcement

    • Kelly in Taipei next week
    • Alex on parental leave Aug 1 - Sep 4
  • Review of on-going work

    • Reuben

      • Evaluated LM like that from 0.1.1
      • Evaluating decoder with suggestions from community
    • Eren

      • FFTNET experiments
      • Working on pre-processing
      • Starting working with new data sets
      • Starting trying NVIDIA WaveNet on cluster
    • Tilman

      • Firejail does not work on NFS!
      • Working on fix, possibly LXC
      • Working on temporary fix for cluster until final solution happens
    • Kelly

      • Email!
      • LM re-creation (Tomorrow)
      • DS + CV lectures in Taipei
      • Meeting with Taiwanese government on CV
    • Rishi*

      • Journal Club!
      • Summarization
        • OpenNMT now supports multi-GPUs!
        • Starting implementing improvements to OpenNMT in LaTeX doc

Agenda (02/07/2018)

  • Review of on-going work
    • Reuben

      • AOT compilation for streaming
      • Figuring out snakepit
      • Training streaming model (Waiting on Cluster)
      • Experimenting with realigning and splitting pt-BR corpus
    • Eren

    • Tilman

      • Fixed several bugs
        • Early stopping and ../tmp file deleting (!)
        • Wrong job startup work directory for jobs on Ubuntu 18.04
      • Put node mlc2 onto cluster and put Eren into group 'test' - so his jobs will always run on mlc2
      • Working on serious issue about insufficient job protection
      • Helping cluster users

Agenda (25/06/2018)

  • Update

    • All-Hands!
  • Review of on-going work

    • Reuben

      • Back from PTO!
      • AOT compilation for streaming
      • Training streaming model (Waiting on Cluster)
    • Tilman

      • Optimizing Snakepit
      • Demo Snakepit install
      • Installing servers
      • Starting Freesound downloads
    • Eren

      • Writing FFTNET + testing it (Working on memoization)
      • Waiting on cluster!
    • Kelly

      • Setting up server
      • Working with Legal + HR for NSF grant
    • Rishi

      • AS initial arch
      • Initial training of AS archs

Agenda (04/06/2018)

  • Update

    • All-Hands
      • Team Dinner Wednesday
      • Demos
        • STT - Update previous demo + showcase streaming + faster (Memory in Alex's graphs)
        • TTS - Original plan to demo on notebook "MVD" (mbx now wants server) so working on AWS server wrapper
        • Job Scheduler
          • Setting up new cluster on Wednesday
          • Shooting for Snakepit install on Wednesday (Fallback AWS install)
      • Presentations
        • Lightning Talks
          • STT - Completed[1]
          • TTS - Shooting for finishing on WED
        • STT - Intro done[2], streaming to go
  • Review of on-going work

    • Reuben

      • Training 0.3.0 model (Experimenting on various hyperparameters, exchanging LSTM w/Block Fused LSTM...)
      • Looking at curriculum learning variations
    • Tilman

      • Hardening Snakepit (Replacing node addition process)
      • Securing Snakepit configuration steps
    • Eren

      • NVIDIA install
      • Writing FFTNET + testing it (Working on memoization)
      • Training new checkpoint with location sensitive attention + ablation w/Tacotron
      • Setting up server
    • Kelly

      • Created PR for switch of language model for 0.2.0
      • Created STT Lightning Talk for All-Hands
      • Created 1st Half of longer STT Talk for All-Hands

Agenda (28/05/2018)

  • Update

    • Servers in the office + power cords + 10Gb cables, trying for full install Wednesday
    • RE*WORK
  • Review of on-going work

    • Reuben

      • Training 0.3.0 model (Experimenting on various hyperparameters)
      • Dropout now not used for validation
      • Maintaining TTS training on Brazilian corpora with TBTT (Samples too long)
      • Drudging through issue 1156 debugging (Rebased on streaming code)
    • Tilman

      • Working on integrating pip cache to prevent multiple downloads
      • Integrating firejail into job scheduler to prevent interference
    • Eren

      • Released new checkpoint
      • Training new checkpoint with location sensitive attention + stop token prediction
      • Writing FFTNET + testing it
    • Kelly

      • RE*WORK
      • Benchmarking on old server done, compiling results for 0.2.0

Agenda (07/05/2018)

  • Update

    • Servers in the office, going to install everything this week
    • Manager's off-site
      • TTS quality sounding really good
      • STT: awesome work; now try to get 0.2.0 and 0.3.0 out the door with noise-robust models
  • Review of on-going work

    • Reuben

      • Maintaining TTS training on Brazilian corpora with TBTT (Samples too long)
      • Drudging through issue 1156 debugging (Rebased on streaming code)
    • Tilman

      • Added job group support
      • Added auto share flag for groups (By default shared with particular group)
      • Moving data backend to SQLite to ease archiving jobs
    • Eren

      • Stop token prediction done
      • But quality drops, attention misaligned
      • Integrated Pytorch Griffin-Lim much faster than real time
    • Alexandre

      • Imported 1.2M strings from French Parliament!
      • Starting working on Gutenberg import (1k books)
      • Tried OpenCL for Deep Speech on TF 1.8 (Works!)
    • Kelly

      • Managers Off-Site
      • Lots of benchmarking finished: LSTM RNN, GRU RNN, Vanilla RNN, LSTM BRNN w/varying width...
      • Trying to get LM benchmarking off the old server to compile results for 0.2.0

Agenda (07/05/2018)

  • Review of on-going work
    • Reuben

      • Drudging through issue 1156 debugging
      • Cleaning up pending streaming API work to prepare for streaming model training
      • Maintaining TTS training on Brazilian corpora
    • Tilman

      • Optimizations and testing SnakePit with DeepSpeech
      • Demo training with PyTorch to make sure things work
      • Capturing error codes of failed jobs
      • Making SSH sessions last longer to avoid repeated setups for short polls
    • Eren

      • Updated to PyTorch 0.4.0 and retrained, but quality is worse
      • Could be PyTorch bug or problem in upgrade, investigating
      • Tested much faster GPU Griffin-Lim implementation, slightly lower quality
      • Stop token prediction is done and works well
      • Looking into NVIDIA's Tacotron 2 implementation for optimization tips and to use as a Tacotron 2 reference
    • Alexandre

      • Working on French parliament corpus to make it suitable for training
      • Working on packaging code to improve documentation of packages on PyPI/npm
      • Playing with newer versions of OpenCL
      • Debugging TensorFlow/KenLM linking error due to double-conversion library clash
    • Kelly (Managers Off-Site)

Agenda (30/04/2018)

  • TODO

    • Talk about eager distribution of alpha/beta for testing
  • Review of on-going work

    • Reuben(PTO)

    • Tilman(PTO)

    • Eren

      • Released a new checkpoint (170k steps)
      • Working on WORLD vocoder impl to replace Griffin-Lim
      • Transfer learning experiments from RNN to CNN
      • TBTT to deal with long samples
    • Alexandre

      • RPi + LePotato taskcluster cluster
      • Getting physical location for cluster
      • French text data for CV
      • Starting work on GPU versions for ARM boards
      • Worked on a clean version + distribution mechanism
    • Kelly

      • Emails
      • Starting to summarize results of benchmarks

Agenda (23/04/2018)

  • Update

    • Servers in the office, electrician done
  • Review of on-going work

    • Reuben
      • Working on TTS deployment - how hard it'd be to stand up an internal test server
      • Deep Speech streaming C API
    • Tilman
      • Job scheduler
        • Implemented resource reservation/permission scheme based on groups
        • Users and devices have groups assigned and need to match for user to access device
        • Per-group folders are mounted on run directory when user has appropriate permissions
        • Implementing resource allocation for ports for inter-node communication
    • Eren
      • Released a new checkpoint (170k steps)
      • Working on WORLD vocoder impl to replace Griffin-Lim
    • Alexandre (could not attend)
    • Kelly (out for the week)
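
The group-based reservation/permission scheme described under Tilman's job scheduler item can be illustrated with a minimal sketch. Every name here (`User`, `Device`, `can_access`, `mountable_group_dirs`) is hypothetical and not taken from the scheduler's code, and the rule that a device without groups is open to everyone is an assumption, not something the notes state.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class User:
    name: str
    groups: FrozenSet[str] = frozenset()


@dataclass(frozen=True)
class Device:
    name: str
    groups: FrozenSet[str] = frozenset()


def can_access(user: User, device: Device) -> bool:
    """A user may use a device when they share at least one group."""
    if not device.groups:
        # Assumption: a device with no groups assigned is unrestricted.
        return True
    return bool(user.groups & device.groups)


def mountable_group_dirs(user: User, group_dirs: Dict[str, str]) -> List[str]:
    """Per-group folders that would be mounted into the user's run directory."""
    return sorted(path for group, path in group_dirs.items() if group in user.groups)


alice = User("alice", frozenset({"speech"}))
bob = User("bob", frozenset({"guests"}))
gpu0 = Device("gpu0", frozenset({"speech", "tts"}))

print(can_access(alice, gpu0))  # True: both are in "speech"
print(can_access(bob, gpu0))    # False: no shared group
print(mountable_group_dirs(alice, {"speech": "/data/groups/speech",
                                   "tts": "/data/groups/tts"}))
```

Using set intersection keeps the check symmetric between users and devices, so adding a group to either side is enough to grant access.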

Agenda (16/04/2018)

  • Update

    • Server delivery was scheduled for today, but Kelly is out
    • Check Journal Club presentation
  • Review of on-going work

    • Reuben
      • Looking into TTS end-of-sequence prediction
      • Deep Speech streaming C API
      • Reviewing PRs
    • Tilman
      • Job scheduler
        • Error handling UI polish
        • Reporting timing of state changes for jobs
        • Implementing user groups + group permissions for accessing restricted resources
    • Alexandre
      • Merged RPi3 testing on TaskCluster
      • Two RPi3's running at home linked to TaskCluster
      • Missing 2.7 support in Raspbian for NumPy and SciPy packages, so dropped 2.7 builds for RPi3
      • Debugging mysterious Node v8/v9 crashes in RPi3 with Valgrind/ASAN (workaround by going back to GCC 4.9)
    • Eren (had to go before the meeting)
    • Kelly (out for the week)

Agenda (09/04/2018)

  • Update

    • Servers in Berlin, but working on delivery + movers as we no longer have a freight elevator
    • Electricians back from vacation may come in this week
  • Review of on-going work

    • Eren
      • TTS variations/improvements
        • Vocoder variation
      • STT CNN exchange for BRNN
    • Reuben
      • pt-BR TTS
      • Deep Speech streaming C API
      • Streaming blog post
    • Kelly
      • Talking with partners
      • Cluster creation: Contracting Electrician
      • Benchmarking: Issues 1241, 1242, 1243, 1246 (All done except vanilla RNN)
      • Language model corpus creation: Issues 1244 and 955 (Dealing with server crashes)
      • Starting Sprachspiel implementation (Value Head, Attention, Decoder)
      • Learning PyTorch
    • Tilman
      • Job scheduler
        • Monitoring
        • Status report
    • Alexandre (PTO)

Agenda (26/03/2018)

  • Discussion

    • Release notes (Include links to release README)
      • Link to proper README in PyPI, npm...
  • Review of on-going work

    • Eren
      • TTS variations/improvements
      • STT CNN exchange for BRNN
    • Reuben
      • WER from inference graph
      • Deep Speech streaming
      • Streaming blog post
    • Kelly
      • Cluster creation: Contracting Electrician
      • Benchmarking: Issues 1241, 1242, 1243, 1246
      • Language model corpus creation: Issues 1244 and 955
      • Starting Sprachspiel implementation (Value Head, Attention, Decoder)
      • Learning PyTorch
    • Tilman (PTO)
    • Alexandre (PTO)

Agenda (12/03/2018)

  • Discussion

    • Use of GitHub Projects
    • Release notes (Include links to release README)
      • Link to proper README in PyPI, npm...
  • Review of on-going work

    • Eren (On-Boarding)
      • TTS variations/improvements
      • STT CNN exchange for BRNN
    • Tilman
      • Job scheduler
      • Exploring other schedulers (Servers are coming real soon™)
    • Reuben
      • WER from inference graph
      • Deep Speech streaming
    • Kelly
      • Cluster creation: Contracting Electrician
      • Benchmarking: Issues 1241, 1242, 1243, 1246
      • Language model corpus creation: Issues 1244 and 955
      • Learning PyTorch
      • Starting Sprachspiel implementation (Value Head, Attention, Decoder)
    • Alexandre
      • French Common Voice
      • TF OpenCL support

Agenda (05/03/2018)

  • Discussion

    • Release of RCs
      • Confirm that RCs don't install automatically. (Confirmed!)
    • Release notes (Include links to release README)
      • Link to proper README in PyPI, npm...
  • Review of on-going work

    • Eren
      • TTS variations/improvements
      • STT CNN exchange for BRNN
    • Tilman
      • Job scheduler
      • Exploring other schedulers (Servers are coming real soon™)
    • Reuben
      • Deep Speech streaming
      • Making training and inference calculate MFCCs the same way
    • Kelly
      • Cluster creation: Contracting Electrician
      • Benchmarking: Issues 1241, 1242, 1243, 1246
      • Language model corpus creation: Issues 1244 and 955
      • Learning PyTorch
      • Starting Sprachspiel implementation (Value Head, Attention, Decoder)
    • Alexandre
      • French Common Voice
      • TF OpenCL support

Agenda (26/02/2018)

  • Discussion

    • Quick meeting today (Monthly Internal Meeting starts in 30min)
  • Review of on-going work

    • Eren
      • TTS variations/improvements
      • STT CNN exchange for BRNN
    • Tilman
      • Job scheduler
      • Exploring other schedulers (Servers are coming real soon™)
    • Reuben
      • Deep Speech streaming
    • Kelly
      • Cluster creation: Contracting Electrician
      • Benchmarking: Issues 1241, 1242, 1243, 1246
      • Language model corpus creation: Issues 1244 and 955
      • Learning PyTorch
      • Starting Sprachspiel implementation
    • Alexandre
      • French Common Voice
      • TF OpenCL support

Agenda (19/02/2018)

  • Discussion?

  • Review of on-going work

    • Eren (Out sick)
    • Tilman
      • Job scheduler
    • Reuben
      • Deep Speech streaming
    • Kelly
      • Cluster creation: Contracting Electrician
      • Mozilla's letter of support for CDT in NLP
      • Benchmarking: Issues 1240, 1241, 1242, 1243, 1246, 1254
      • Language model corpus creation: Issues 1244 and 955
      • Learning PyTorch
      • Starting Sprachspiel implementation
    • Alexandre
      • TF OpenCL support

Agenda (05/02/2018)

  • Release 0.1.1!

    • Congrats!
  • Discussion

    • FOSDEM!
    • Work week initial agenda[1]
    • Work week presentation template[2].
    • Release process for future reference[3]
  • Review of on-going work

    • Eren
      • Work Week Presentations!
      • TTS Engine
        • Initial code repo not completely working, debugging
        • Following work on Vocoder
    • Tilman
      • Work Week Presentations!
    • Reuben
      • Work Week Presentations!
      • Issue 1156 (Language model incorrectly drops spaces)
      • TTS Engine
        • Creating TTS data
        • Creation of MAD+VAD to handle music
        • Creation of data sets to train MAD+VAD
    • Kelly
      • Work Week Presentations!
      • Working on getting UPS+PDU (Sent off PO!)
      • Conversational agent research
      • Preparations for Work Week
      • Interviews (AI4All, Newsy, Contagious Magazine, Let's Get Mental)
      • Prioritizing Deep Speech partnerships
    • Alexandre
      • Work Week Presentations!
      • New OS X workers
      • Spec'ing out new OS hardware
      • Monolithic TF [done]

Agenda (29/01/2018)

  • Release 0.1.1

    • Release testing [Reuben] (done)
    • Release management [Reuben + Alex] (done?)
    • Document hyperparameters for release notes [Kelly]
    • Release now!
  • Discussion

    • Work week initial agenda[1]
    • Work week presentation template[2].
  • Review of on-going work

    • Eren
      • TTS Engine
        • Initial code repo not completely working, debugging
        • Following work on Vocoder
    • Tilman
      • FOSDEM Presentation!
    • Reuben
      • Issue 1156 (Language model incorrectly drops spaces)
      • TTS Engine
        • Creating TTS data
        • Creation of MAD+VAD to handle music
        • Creation of data sets to train MAD+VAD
    • Kelly
      • Working on getting UPS+PDU (Waiting on Finance's PO)
      • Conversational agent research
      • Preparations for Release 0.1.1
      • Preparations for Work Week
      • Common Voice Work Week
    • Alexandre
      • New OS X workers
      • Spec'ing out new OS hardware
      • Monolithic TF

Agenda (22/01/2018)

  • Release 0.1.1

    • Document training from pb [Reuben] (done)
    • Create release notes [All] (done)
      • Add leading section of Changes [All] (done)
      • Add list of contributors since last release [Kelly] (done)
    • Release testing [Reuben]
    • Release management [Reuben + Alex]
  • Discussion

    • Work week initial agenda[1]
      • Monday
        • Pipsqueak - Deep Speech on RPi3 (¼ Day) [Alex]
          • Presentation on Pipsqueak (1 hour)
          • Feedback/Suggestions on Pipsqueak (1 hour)
        • Deep Speech streaming support (⅜ Day) [Tilman]
          • Presentation on Deep Speech streaming support (1 hour)
          • Feedback/Suggestions on Deep Speech streaming support (1 hour)
        • Lunch
          • Feedback/Suggestions on Deep Speech streaming support (1 hour)
        • Pipsqueak - Architectural variations (¼ Day) [TBD]
          • Presentation on Pipsqueak (1 hour)
          • Feedback/Suggestions on Pipsqueak - Tie in with Deep Speech streaming support (1 hour)
      • Tuesday
        • Job Scheduler (¼ day) [Tilman]
          • Presentation on Job Scheduler (1 hour)
          • Feedback/Suggestions on Job Scheduler (1 hour)
        • Deep Speech additional languages (¼ Day) [Kelly+Michael]
          • Common Voice additional languages (1 hour) [Michael]
          • Deep Speech additional languages (1 hour) [Kelly]
        • Lunch
        • Virtual Assistant Work (⅛ Day) [Kelly]
          • Introduction to Scout - Mozilla's Virtual Assistant
          • Survey of Possible Asks - Intent parser, keyword spotter...
        • Automatic Summarization (⅛ Day) [Kelly]
          • Introduction to Mozilla's Automatic Summarization Work
          • Mozilla's Automatic Summarization Corpora
        • Dinner
      • Wednesday
        • TTS engine initial architecture (½ Day) [Reuben+Eren]
        • Lunch
        • Automatic Summarization Firefox integration (¼ Day) [Kelly+Martin]
          • 3rd Party Automatic Summarization integration (Unknown) [Martin]
          • Mozilla's Automatic Summarization integration (Unknown) [Martin]
        • TTS + STT Firefox integration [Kelly + All]
      • Thursday
        • Heads Down Working
        • Lunch
        • Heads Down Working
      • Friday
        • Heads Down Working
        • Lunch
        • Heads Down Working
    • Work week presentation template[2].
  • Review of on-going work

    • Eren
      • Orientation
      • TTS Engine
    • Tilman
      • FOSDEM Presentation
    • Reuben
      • Issue 1156 (Language model incorrectly drops spaces)
      • TTS Engine
        • Creating TTS data
        • Creation of MAD+VAD to handle music
        • Creation of data sets to train MAD+VAD
    • Kelly
      • Working on getting UPS+PDU (Waiting on Finance's PO)
      • Conversational agent research
      • Preparations for Release 0.1.1
      • Preparations for Work Week
      • Common Voice Work Week
      • AlphaGo Zero Presentation
    • Alexandre
      • New OS X workers
      • Spec'ing out new OS hardware
      • Monolithic TF

Agenda (15/01/2018)

  • Release 0.1.1

    • Integrating frozen model testing into our main code [Reuben] (done)
    • Create pb file [Kelly] (done)
    • TODO: Document training from pb [Reuben]
    • TODO: Create release notes [All]
      • Add leading section of Changes [All]
      • Add list of contributors since last release [Kelly]
  • Discussion

    • Adversarial Attacks (1801.01944, 1801.00554...)
    • Work week initial agenda
      • Job Scheduler
      • Deep Speech streaming support [Point Person:Tilman]
      • Deep Speech additional languages [Point Person:Kelly]
      • Pipsqueak (Deep Speech on RPi3) (Should also explore arch variations) [Point Person:Alex]
      • TTS engine initial architecture [Point People:Reuben+Eren]
      • Virtual Assistant Work (Initial possibilities: Intent parser, keyword spotter, conversational agent...)
      • Automatic summarization
  • Review of on-going work

    • Eren
      • Orientation
      • TTS Engine
    • Tilman
      • Corpus creation
      • Audio augmentation
      • Job scheduler
      • Addressing PR comments
    • Reuben
      • Issue 1156 (Language model incorrectly drops spaces)
      • TTS Engine
    • Kelly
      • Working on getting new hardware (Everything signed, waiting on payment + delivery) [done]
      • Working on getting UPS+PDU (Quote took too long to process, new quote active and in flight) [doing]
      • Conversational agent research
      • Preparations for Release 0.1.1
    • Alexandre
      • New OS X workers

Agenda (08/01/2018)

  • Announcements

    • Eren!
    • Work week: Week of February 12
  • Release 0.1.1

    • Train model with same hyperparameters as in 0.1.0 model [Kelly] (done)
    • Harmonize WER with Kaldi's WER (done) [WER 5.6% on librivox clean test]
    • Integrating frozen model testing into our main code [Reuben] (done not merged)
    • TODO: Create pb file [Kelly]
    • TODO: Link deepspeech properly on macOS (@executable_path) [Alex] Fixed in #1051
    • TODO: Document training from pb [Reuben]
  • Discussion

    • Broad 2018 Goals
      • Ship 3 ML based technologies and/or services in Firefox
      • Release Speech-to-Text engine + models w/<10% WER on 2 other data sets
      • Release Speech-to-Text engine and non-English model
      • Release RPi3 Speech-to-Text engine and English model(s)
      • Release Text-to-Speech engine + English model(s)
      • Common Voice as Largest Open English Corpus + another language
      • Support Assistants group with required ML algorithms and models
      • Explore ML + Data business models
    • Work week initial agenda
      • Deep Speech streaming support
      • Deep Speech additional languages
      • Pipsqueak (Deep Speech on RPi3)
      • TTS engine initial architecture
      • Virtual Assistant Work (Details dependent upon Jofish's team's needs)
  • Review of on-going work

    • Eren
      • Orientation
      • TTS Engine
    • Tilman
      • Corpus creation
      • Audio augmentation
      • Job scheduler
      • Addressing PR comments
    • Reuben
      • Issue 1156 (Language model incorrectly drops spaces)
      • TTS Engine
    • Kelly
      • Training production model on cluster [done]
      • Spec'ing out more servers [done]
      • Working on getting new hardware (Everything signed, waiting on payment + delivery) [done]
      • Working on getting UPS+PDU (Waiting on Bebenita) [doing]
    • Alexandre (Sick)
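
As background for the "Harmonize WER with Kaldi's WER" item above: one common source of disagreement between WER figures is whether errors are pooled over the whole test set or averaged per utterance. Kaldi's `compute-wer` reports the pooled figure (total word errors over total reference words). The notes do not say which difference was actually harmonized, so the sketch below only illustrates the two conventions; the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            curr[j] = min(prev[j] + 1, curr[j - 1] + 1, substitution)
        prev = curr
    return prev[-1]


def corpus_wer(pairs: List[Tuple[str, str]]) -> float:
    """Pooled WER: total word errors divided by total reference words."""
    errors = sum(edit_distance(r.split(), h.split()) for r, h in pairs)
    ref_words = sum(len(r.split()) for r, _ in pairs)
    return errors / ref_words


def mean_utterance_wer(pairs: List[Tuple[str, str]]) -> float:
    """Average of per-utterance WERs; short utterances weigh as much as long ones."""
    rates = [edit_distance(r.split(), h.split()) / len(r.split()) for r, h in pairs]
    return sum(rates) / len(rates)


# (reference, hypothesis) pairs; the second utterance is a single wrong word.
pairs = [("a b c d", "a b c d"), ("x", "y")]
print(corpus_wer(pairs))          # 0.2 (1 error over 5 reference words)
print(mean_utterance_wer(pairs))  # 0.5 (mean of 0.0 and 1.0)
```

The same transcripts give 20% or 50% depending on the convention, which is why comparing against another toolkit's number requires matching its definition first.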

Agenda (20/12/2017)

  • Release 0.1.1

    • Harmonize WER with Kaldi's WER [Kelly or Tilman]
    • Train model with same hyperparameters as in 0.1.0 model [Kelly]
    • TODO: Link deepspeech properly on macOS (@executable_path) [Alex]
    • TODO: Integrating frozen model testing into our main code [Reuben]
  • Discussion

    • Date for 2018 kick-off work week

Agenda (04/12/2017)

Review of on-going work

  • TF 1.4 support landed [Alex]
  • Python 2.7, 3.4, 3.5 and 3.6 builds of our TensorFlow fork [Alex]
  • General fixes for problems encountered by users [Alex]
  • Removed dependency on older version of SciPy [Alex]
  • TODO: Link deepspeech properly on macOS (@executable_path) [Alex]
  • Ran benchmarks comparing impact of AVX/AVX2 instructions. Almost no difference in inference time. [Alex]
  • Almost done collecting dataset for summarization [Anurag]
  • Ran benchmarks on GRU vs LSTM. LSTM converging faster [Anurag]
  • TODO: tune hyperparameters over current architecture [Anurag]
  • TODO: test orthogonal RNNs [Anurag]
  • Got test epochs to work with frozen models [Reuben]
  • Fixed data race in feeding code that caused non-deterministic Word Error Rates when running on a single machine [Reuben]
  • TODO: Integrating frozen model testing into our main code [Reuben]
  • Working on voice agent demo [Tilman]
  • Looking into making inference streamable [Tilman]
  • TODO: benchmark runs with different architectures (unidirectional vs. bidirectional) [Tilman]

Discussion

  • Should we disable AVX2 in our TensorFlow packages?
  • Should we do tests on macOS? Alex checking if we can get mac minis as workers.

Agenda (19/11/2017)

  • Release TODO

    • Go live with the Hacks blog post (Wait until Wednesday 8am PST) [Reuben]
  • All-Hands TODO

    • Demo using pb? [Reuben]
  • Discussion

    • TF 1.4 support finish at all-hands
    • How to support JS in future with no SWIG support

Agenda (12/11/2017)

  • Release TODO
    • Test CV data set [Tilman]
    • Update documents to suggest virtual env [Alex]
    • Add GPU whl's to PyPI + update docs [Reuben]
    • Change contact point for PyPI packages [Reuben]
    • Add node packages by hand + update docs [Reuben + Alex]
    • Tag release [Reuben + Alex]
    • Upload model + LibriVox clean audio samples to github releases [Kelly]
    • Remake gif using github releases model + LibriVox clean audio samples [Kelly]
    • Update docs with new gif [Kelly]
    • Go live with the Hacks blog post [Reuben]
    • Inform partners of release [Kelly]
    • Add documentation on GPU vs CPU speed (Talk about numbers from Reuben's computer) [Kelly]
    • Add documentation that model size is not optimized [Kelly]
    • Apply roundings of graph (Nice to have) [Reuben]
    • Write release notes (Send out for review) [Kelly]
    • Email testers to create virtual env (Done) [Kelly]
    • Link to discourse in release notes/readme [Kelly]

Agenda (04/11/2017)

  • Announcements

    • Work week on week of Nov 13 in Berlin
      • Talk about corpus targets (Street, office....)
    • Comms wants us to release on the 21st because of 57
  • Discussion

    • Packaging progress?
      • Training (Looking good but with LM decoder a bit slow) [Kelly]
      • Documentation [Kelly+Reuben]
      • Communications (Golem coming to video/interview us on Nov 15) [Kelly]
      • Discourse forum on Deep Speech (Done) [Kelly]
      • Model testing on Task Cluster [Kelly+ Alex]
        • Give large model to Alex (Done) [Kelly]
      • Finding beta testers [Michael]
  • Review of on-going work

    • Tilman
      • Corpus creation
      • Audio augmentation
      • Addressing PR comments
    • Reuben
      • Reviewing Tilman's PR
      • Trying TensorFlow 1.4 MFCC's
      • Reach out to hacks
      • NPR Importer
    • Kelly
      • Training production model on cluster
      • Specing out more servers
      • Working on getting new hardware
    • Alexandre (PTO)
    • Anurag (In NY,NY)
      • Creating ORNN presentation
      • Formalize corpus creation

Agenda (30/10/2017)

  • Announcements

    • Work week on week of Nov 13 in Berlin
  • Discussion

    • Packaging progress?
      • Training (Trained w/out Fisher. Fisher export corrupted, re-exporting) [Kelly]
      • Documentation [Kelly+Reuben]
        • native_client README.md re-written [Reuben]
      • Communications (Golem coming to video/interview us on the release) [Kelly]
      • Discourse forum on Deep Speech [Kelly]
      • Model testing on Task Cluster [Kelly+ Alex]
        • Give large model to Alex [Kelly]
      • Finding beta testers [Michael]
  • Review of on-going work

    • Tilman (PTO)
    • Reuben
      • Documentation native client README.md
      • Fixing native client little problems, e.g. error messages, what happens when a param is not there
      • Reach out to hacks
    • Kelly
      • Setting up current master to run on cluster
        • Completed run without Fisher
        • Re-exporting Fisher as it seems to be corrupted
      • Specing out more servers
      • Journal Club A Neural Conversation Model plus other related work?
    • Alexandre
      • Python & Node packages cross-compilation locally
      • Progress on the use of tfcompile
        • Generating the tfcompile configuration file at build time
        • Audio length now variable
        • Simplifying AOT use
    • Anurag (In NY,NY)
  • TODO

    • One-pagers motivation on github

Agenda (23/10/2017)

  • Announcements

    • Work week on week of Nov 13 in Berlin
  • Discussion

    • Packaging progress?
      • Train model [Kelly]
      • Documentation [Kelly+Reuben]
      • Communications [Kelly]
      • Discourse forum on Deep Speech [Kelly]
      • Model testing on Task Cluster [Kelly+ Alex]
  • Review of on-going work

    • Tilman
      • Cocktail Party Noise importer
      • Re-Review Germany blog post
    • Reuben
      • Documentation native client README.md
      • Fixing native client little problems, e.g. error messages, what happens when a param is not there
      • Reach out to hacks
    • Kelly
      • Setting up current master to run on cluster
      • Specing out more servers
      • Journal Club Get To The Point: Summarization with Pointer-Generator Networks
    • Alexandre
      • Reviewing Tilman's PR
      • Progress on the use of tfcompile
        • Generating the tfcompile configuration file at build time
        • Audio length now variable
        • Simplifying AOT use
    • Anurag
      • Working on Wikipedia based data sets
  • TODO

    • One-pagers motivation on github

Agenda (16/10/2017)

  • Announcements

    • Update on Berlin office opening, C-Level demos, press coverage
    • Work week on week of Nov 13 in Berlin
  • Discussion

    • Packaging progress?
      • Packing script done
      • Train model [Kelly]
      • Documentation [Kelly]
      • Marketing, comic, Hacks [Kelly]
      • Discourse forum on Deep Speech [Kelly]
      • Model testing on Task Cluster [Kelly+ Alex]
      • Custom CTC decoder in native clients [Reuben]
      • Tool to load checkpoint (Refactor Deep Speech) [Reuben]
      • Update export/loading to new API [?]
  • Review of on-going work

    • Tilman
      • Rebasing code
      • Testing rebased code
      • Cocktail Party Noise importer
    • Reuben
      • Language Model Blog Post (The Deep Speech Journey)
      • Custom CTC in all native clients
      • Reach out to hacks
    • Kelly
      • Setting up current master to run on cluster
      • Specing out more servers
    • Alexandre
      • Progress on the use of tfcompile
        • Generating the tfcompile configuration file at build time
        • Audio length now variable
        • Simplifying AOT use
    • Anurag
      • Automatic summarization literature review
      • Working on Wikipedia based data sets
  • TODO

    • One-pagers motivation on github

Agenda (09/10/2017)

  • Announcements

    • Report on managers meeting
  • Discussion

    • Packaging progress?
      • Packing script done
      • Train model [Kelly]
      • Documentation [Kelly]
      • Marketing, comic, Hacks [Kelly]
      • Discourse forum on Deep Speech [Kelly]
      • Model testing on Task Cluster [Kelly+ Alex]
      • Custom CTC decoder in native clients [Reuben]
      • Tool to load checkpoint (Refactor Deep Speech) [Reuben]
      • Update export/loading to new API [?]
  • Review of on-going work

    • Tilman
      • Rebasing code
      • Testing rebased code
      • Cocktail Party Noise importer
    • Reuben
      • Language Model Blog Post (The Deep Speech Journey)
      • Custom CTC in all native clients
      • Reach out to hacks
    • Kelly
      • Setting up demo on my laptop
      • Setting up current master to run on cluster
      • Setting up C-Level Common Voice demo
      • Got funding for more servers
    • Alexandre
      • Progress on the use of tfcompile
        • Generating the tfcompile configuration file at build time
        • Audio length now variable
        • Simplifying AOT use
    • Anurag
      • Automatic summarization literature review
      • Working on Wikipedia based data sets

Agenda (25/09/2017)

  • Announcements

    • Talk from BerlinNLP about Deep Speech + Yandex's STT
      • Kelly's slides[1]
      • Ilya's slides[2]
      • Video[3]
  • Discussion

    • Packaging progress?
      • Packing script done
      • Train model [Kelly]
      • Documentation [Kelly]
      • Marketing, comic, Hacks [Kelly]
      • Discourse forum on Deep Speech [Kelly]
      • Host model/release on github releases
      • Model testing on Task Cluster [Kelly+ Alex]
      • Client allows batch processing [No]
      • Custom CTC decoder in native clients [Reuben]
      • Tool to load checkpoint (Refactor Deep Speech) [Reuben]
  • Review of on-going work

    • Tilman
      • Added module based logging
      • In graph replication single mode working
      • In graph replication cluster mode working
      • Checkpoint logic re-write to do early stopping
    • Reuben
      • Language Model Blog Post (The Deep Speech Journey)
      • Custom CTC in all native clients
      • Demo tool
      • Reach out to hacks
    • Kelly
      • Gave talk in Taipei on Common Voice for Mozilla Developer Conference
      • Writing with IAS[4] Mozilla Research Grant Proposal
      • Writing initial 2018 plan for the Machine Learning group
    • Alexandre
      • OS X taskcluster integration with new MacBook Pro!
      • Optimized task cluster build!
      • Hosting two meetups!
    • Anurag
      • Journal club presentation
      • Reading Google's NMT paper
      • Automatic summarization literature review

Agenda (18/09/2017)

  • Announcements

    • Tomorrow Kelly will talk at BerlinNLP about Deep Speech, will be recorded.
  • Discussion

    • Packaging for alpha release, what needs to be done? (Unblocking community)
      • Setup testing of model on TaskCluster ("Should be easy" -Alex) [Alex]
      • Write script that loads checkpoint + does inference [Reuben]
      • Write documentation [Kelly]
      • Decide where to store models (Language Models + DeepSpeech Models) [gitlfs (LM) + release (GitHub Release)]
      • Native client binaries how they're obtained + installed? [gitlfs, s3...]
      • Release should contain everything: language models + DeepSpeech model + code...
      • Decide how to package: models + frozen model
      • Include a demo? (native client, Reuben’s GUI merged after release [Reuben])
  • Review of on-going work

    • Tilman Debugging PR on dynamic batch size & In graph replication & Cocktail Party
    • Reuben
      • Language Model Blog Post (The Deep Speech Journey)
    • Kelly
      • Traveling to Taipei for Mozilla Developer Conference
      • Creating Common Voice presentation for Mozilla Developer Conference
      • Creating Deep Speech presentation for Berlin NLP Meetup
      • Automatic Summarization massive param study
        • Preparing yaml for baseline model variations
        • Getting seq2seq framework to work on the cluster using our enqueue/SLURM framework
      • Issue 625 (Create NewsML Importer) (Trying new Opus VAD)
      • Interviewing job candidates
      • Creating Pocket Corpus
    • Alexandre
      • PR 834 (TaskCluster Decision Task)
    • Anurag
      • Benchmark checkpoints upload
      • Automatic summarization literature review

Agenda (11/09/2017)

  • Announcements

    • Working with legal on:
      • Possible integration of Google open speech corpus into Common Voice corpus
      • Possible integration of Mythic's open speech corpus into Common Voice corpus
  • Review of on-going work

    • Tilman Debugging PR on dynamic batch size & Cocktail Party
    • Reuben
      • PR 805 (Score CTC prefix beams with KenLM)
      • Language Model Blog Post
      • Review PR 810 (Local/Remote benchmarking tool)
      • WER Report debugging
    • Kelly
      • Creating Common Voice presentation for Mozilla Developer Conference
      • Creating Deep Speech presentation for Berlin NLP Meetup
      • Automatic Summarization massive param study
        • Preparing yaml for baseline model variations
        • Getting seq2seq framework to work on the cluster using our enqueue/SLURM framework
      • Issue 625 (Create NewsML Importer) (VAD is getting tripped up in music)
      • Interview with Vozpopuli
      • Interviewing job candidates
      • Reviewing PR 805 (Score CTC prefix beams with KenLM)
    • Alexandre
      • PR 820 (Issue818+819)
      • PR 810 (Local/Remote benchmarking tool)
      • Reviewing PR 805 (Score CTC prefix beams with KenLM)
      • Journal Club Pete Warden's Book "Building Mobile Applications with TensorFlow"

Agenda (04/09/2017)

  • Announcements

    • Reuben's PR[2] integrating the language model deeper into the CTC decoder gives WER of 6.48% on Librivox clean
    • With WER of 6.48% on Librivox clean it appears[1] as if we're the best FOSS STT engine
    • Talked with Pete Warden, tech lead of embedded TF + lead of the Google open speech corpus, from Google
      • Possibility of collaboration on Common Voice
      • Wants to work with us on fixing TF bugs preventing Alex from progressing on quantization
    • Tomorrow TV interview with NTN24 on Common Voice + Speech at Mozilla
  • Review of on-going work

    • Tilman Debugging PR on dynamic batch size & Cocktail Party
    • Reuben Journal Club & DS2 Tests & WER Report debugging
    • Kelly
      • Automatic Summarization massive param study
        • Preparing yaml for baseline model variations
        • Getting seq2seq framework to work on the cluster using our enqueue/SLURM framework
      • Issue 625 (Create NewsML Importer) (VAD is getting tripped up in music)
      • Interview with NTN24 and Vozpopuli
      • Interviewing job candidates
      • Reviewing PR 805 (Score CTC prefix beams with KenLM)
      • Review Bug 1396158 (Removal of Pocketsphinx from Firefox)
    • Alexandre (Sick)

Agenda (28/08/2017)

  • Announcements

    • Running TED+Librivox+Fisher+Switchboard training on both cluster nodes
      • Librivox clean test as the test data set
      • Librivox 4-gram language model
      • Librivox clean validation as the validation data set
  • Review of on-going work

    • Tilman(PTO)
    • Reuben CTC Decoder + Language model & DS2 Tests & WER Report debugging
    • Kelly
      • Automatic Summarization massive param study
        • Preparing yaml for baseline model
        • Preparing yaml for baseline model variations
        • Getting seq2seq framework to work on the cluster using our enqueue/SLURM framework
      • Training TED+LibriVox+Fisher+Switchboard for 13 epochs
      • Issue 625 (Create NewsML Importer) [doing]
      • Issue 791 (Switch language model to OpenSLR's standard LibriSpeech 4gram model)
    • Alexandre C++, Python, and nodejs bindings + AOT (Dealing with compiled in wav file length problem)

Agenda (21/8/2017)

  • Announcements

    • Running Issue 759 (More BiRNN layers no LM) on the server
    • Presentation to Sean+Azita+Katharina on Common Voice "Six Month Plan" went well
    • Finished TED+Librivox+Fisher 13 epoch run
      • WER 27.19% on TED test set
      • Google's WER for TED test is 27.32%[1].
      • TED test set is hard!
  • Discussion

    • Early stopping architecture
      • Checkpoint every epoch in dir1
      • Checkpoint every 10 min in dir2
      • Flag to switch which checkpoint is used
    • Packaging model for "alpha soft release"
      • Baby step TED+Librivox+Fisher+SWD training [Kelly]
      • Baby step verify deployment infra works [Alex]
      • Baby step NPR importer [Kelly]
      • Baby step TED+Librivox+Fisher+SWD+NPR training [Kelly]
    • Adapting engine to any custom Language
      • Reuben integrating this with current LM work
      • Should create an alphabet file with the alphabet in use
  • Review of on-going work

    • Reuben CTC Decoder + Language model + DS2 Tests
    • Tilman(PTO)
    • Anurag Deep Compression[2] training
    • Kelly
      • Automatic Summarization data set creation
      • Automatic Summarization parameter study
      • Automatic Summarization data set pre-processing for seq2seq framework
      • Issue 759+760 (Add more BRNN layers subtract language model)
      • Issue 625 (Create NewsML Importer) [doing]
      • Issue 692 (Adapting engine to any Custom Language) [doing]
      • Preparation for Weekly Journal Club Meeting
    • Alexandre C++, Python, and nodejs bindings
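
The early stopping architecture from the discussion above (one checkpoint per epoch in dir1, one every 10 minutes in dir2, and a flag selecting which to use) could be sketched roughly as follows. This is an illustrative sketch only, not the project's implementation: the class `DualCheckpointer`, its method names, the `ckpt-*.bin` file naming, and the representation of model state as raw bytes are all assumptions made for the example.

```python
import time
from pathlib import Path
from typing import Callable, Optional


class DualCheckpointer:
    """Writes per-epoch checkpoints to one directory and timed ones to another."""

    def __init__(self, epoch_dir, timed_dir, interval_s: float = 600.0,
                 clock: Callable[[], float] = time.monotonic):
        self.epoch_dir = Path(epoch_dir)   # "dir1" in the notes
        self.timed_dir = Path(timed_dir)   # "dir2" in the notes
        self.interval_s = interval_s       # 10 minutes by default
        self.clock = clock
        self._last_timed = clock()
        self._counter = 0
        self.epoch_dir.mkdir(parents=True, exist_ok=True)
        self.timed_dir.mkdir(parents=True, exist_ok=True)

    def _write(self, directory: Path, state: bytes) -> Path:
        # A zero-padded global counter keeps file names sortable by age.
        self._counter += 1
        path = directory / f"ckpt-{self._counter:08d}.bin"
        path.write_bytes(state)
        return path

    def on_epoch_end(self, state: bytes) -> Path:
        """Always checkpoint at the end of an epoch."""
        return self._write(self.epoch_dir, state)

    def on_step(self, state: bytes) -> Optional[Path]:
        """Checkpoint only if the interval has elapsed since the last timed one."""
        now = self.clock()
        if now - self._last_timed < self.interval_s:
            return None
        self._last_timed = now
        return self._write(self.timed_dir, state)

    def latest(self, use_epoch: bool) -> Optional[Path]:
        """The flag: pick the newest checkpoint from dir1 or from dir2."""
        directory = self.epoch_dir if use_epoch else self.timed_dir
        candidates = sorted(directory.glob("ckpt-*.bin"))
        return candidates[-1] if candidates else None
```

In a training loop, `on_step` would be called after each batch and `on_epoch_end` after each epoch. On restart, `latest(use_epoch=True)` gives a clean epoch boundary (useful for early stopping decisions), while `latest(use_epoch=False)` gives the most recent crash-recovery point.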

Agenda (14/8/2017)

  • Announcements

    • Currently doing a TED+Librivox+Fisher (Dealing with continuation strangeness)
    • Old server back online at the new offices (Internally accessible. VPN?)
  • Discussion

    • Do we want to revise a site that reports on WER?
  • Review of on-going work

    • Reuben CTC Decoder + Language model + DS2 Tests
    • Tilman Dynamic Batch Sizing
    • Anurag Deep Compression[2] + Benchmarking + Early Stopping
    • Kelly
      • Mycroft PR partnership
      • Automatic Summarization data set creation
      • Automatic Summarization parameter study
      • Six Month Plan for Common Voice
      • Presentation to Sean on Common Voice Six Month Plan
      • Talk with Voice Fill
    • Alexandre(PTO)

Agenda (7/8/2017)

  • Announcements

    • Currently doing a TED+Librivox+Fisher run
    • New servers back online at the new offices
    • Old server still not back online at the new offices
  • Discussion

    • Distributing models, how do we want to do this?
    • Discussion results:
      • Distribute protobuf + checkpoint and tf_compile result
      • Need to choose distribution of training data, i.e. which training sets to train on
      • Fine-tune the model with a custom data set to target a particular use case (TBD)
  • Review of on-going work

    • Reuben CTC Decoder + Language model + Journal Club
    • Tilman Dynamic Batch Sizing
    • Anurag Deep Compression[2] + Benchmarking (On hold until old server installed in new Berlin office) + Early Stopping
    • Kelly
      • NPR Importer
      • Mycroft PR partnership
      • New Berlin Office Servers
      • Automatic Summarization Roadmap
      • Six Month Plan for Common Voice
      • Presentation to Sean on Common Voice Six Month Plan
    • Alexandre OS X TC builds + nodejs builds
  • TODO

    • Look into getting Nanshu access to the Berlin cluster [kelly]

Agenda (31/7/2017)

  • Announcements

    • Segfault looks to be solved by transparent huge pages being switched off!!!
    • Currently doing a TED+Librivox+Fisher run
    • Mycroft is making PR's in Common Voice and Deep Speech to follow
    • Deep Speech + Common Voice blog post is live on the Mozilla Blog[1]
    • SoftAtHome interested in helping with putting DeepSpeech on RPi3
    • Meeting with i2x to discuss possible collaborations
    • New servers will be offline sometime this week when moved from the colocation facility to the new offices
    • Old server will be offline this week until new hardware is delivered to "rack it"
  • Review of on-going work

    • Reuben CTC Decoder + Language model
    • Tilman Dynamic Batch Sizing
    • Anurag Deep Compression[2] + Benchmarking (On hold until old server installed in new Berlin office) + Early Stopping
    • Kelly New Berlin Office Servers + Automatic Summarization + NPR Importer[7]
    • Alexandre OS X TC builds + nodejs builds
  • TODO

    • Look into getting Nanshu access to the Berlin cluster [kelly]

Agenda (24/7/2017)

  • Announcements

    • Working on Mycroft partnership (1-2 more devs + contributions to Common Voice) meeting tomorrow
    • Common Voice getting 20k-40k contributions per day
    • RiseML created blog post on distributed Deep Speech but hasn't made it public yet
    • European Language Resource Association interested in partnering on Common Voice
    • Tons of Common Voice press
  • Review of on-going work

    • Reuben CTC Decoder (Integrated into TensorFlow, but how to expose to external devs?)
    • Tilman Segfault Distributed Tensorflow + Dynamic Batch Sizing
    • Anurag Deep Compression[0] + Benchmarking[1][2][3][4][5]... + Early Stopping
    • Kelly Journal Club Kronecker Recurrent Units + Common Voice Press + Automatic Summarization + NPR Importer[7]
    • Alexandre Getting RNN and tfcompile happy together + OS X TC builds + nodejs builds
  • TODO

    • Look into getting Nanshu access to the Berlin cluster (On hold until segfault is solved)
    • Contact Urdu, Macedonian... developers to see if they want to open source models and we host them on S3 say [done]
    • Open issue to allow our code to work on various languages easily [done]

Agenda (17/7/2017)

  • Welcome - Nanshu Wang, Rob Smith, Nicholas Lane

  • Announcements

    • Working on Mycroft partnership (1-2 more devs + contributions to Common Voice)
    • PC Mag covered Common Voice "Mozilla Asks Everyone to Donate Their Voice"
    • Common Voice getting 14k contributions per day
    • RiseML porting code to Google Cloud
    • RiseML creating blog post on distributed Deep Speech
  • Future work

