@cisnlp

Deep NLP @ CIS - LMU

Deep Natural Language Processing Group at Center for Language and Information Processing, University of Munich (LMU)

Popular repositories

  1. simalign Public

    Obtain Word Alignments using Pretrained Language Models (e.g., mBERT)

    Python 376 50

  2. GlotLID Public

    💬 Language Identification with Support for More Than 2000 Labels -- EMNLP 2023

    Python 162 10

  3. Glot500 Public

    Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages -- ACL 2023

    Python 104 4

  4. GlotCC Public

    🕸 GlotCC Dataset and Pipeline -- NeurIPS 2024

    Jupyter Notebook 20

  5. GlotScript Public

    🖋 Resource and Tool for Writing System Identification -- LREC 2024

    Python 19 2

  6. ofa Public

    A framework that aims to wisely initialize unseen subword embeddings in PLMs for efficient large-scale continued pretraining

    Python 19 2

Repositories

Showing 10 of 39 repositories
  • manchu-in-context-mt Public

    Understanding In-Context Machine Translation for Low-Resource Languages: A Case Study on Manchu

    Python 7 1 0 0 Updated Sep 27, 2025
  • Language-Mixing Public

    [EMNLP 2025] Language Mixing in Reasoning Language Models: Patterns, Impact, and Internal Causes

    Python 1 0 0 0 Updated Sep 9, 2025
  • KLAR-CLC Public

    [ACL 2025] Lost in Multilinguality: Dissecting Cross-lingual Factual Inconsistency in Transformer Language Models

    Python 3 0 1 0 Updated Sep 9, 2025
  • cisnlp.github.io Public

    Homepage of cisnlp

    SCSS 3 MIT 2 0 0 Updated Sep 2, 2025
  • GlotWeb Public

    🕸 GlotWeb: Web Indexing for Low-Resource Languages -- under construction.

    Python 15 CC0-1.0 0 4 0 Updated Aug 13, 2025
  • MIB-circuit-track
    Python 0 Apache-2.0 0 0 0 Updated Aug 11, 2025
  • GlotLID Public

    💬 Language Identification with Support for More Than 2000 Labels -- EMNLP 2023

    Python 162 Apache-2.0 10 1 0 Updated Jun 5, 2025
  • spatial_intuitions
    Jupyter Notebook 1 0 0 0 Updated May 27, 2025
  • multilingual-fact-tracing Public

    Tracing Multilingual Factual Knowledge Acquisition in Pretraining

    Python 6 2 0 0 Updated May 22, 2025
  • MEXA Public

    🔍 Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment

    Python 11 Apache-2.0 1 0 0 Updated Apr 6, 2025