SALSA 💃: Success and Failure Linguistic Simplification Annotation

Our code and data for Dancing Between Success and Failure: Edit-level Simplification Evaluation using SALSA 💃, published at EMNLP 2023.

SALSA Annotation Interface

Our interface is built with thresh.tools and is available at salsa-eval.com/interface, with an interactive tutorial at salsa-eval.com/tutorial. The interface configuration is defined in interface/salsa.yml, and the source code for the interactive tutorial is in interface/tutorial.

SALSA Dataset

Our dataset of 12K edit annotations is available in data; you can use the Thresh library to easily load SALSA data for your project! We also include our train/test/validation splits for training LENS-SALSA in data/lens-salsa-training and our data before adjudication (for calculating annotator agreement) in data/non-adjudicated.

pip install thresh
git clone https://github.com/davidheineman/salsa.git

from thresh import load_interface

# Load the SALSA interface
SALSA = load_interface("interface/salsa.yml")

# We can load our JSON data by using our interface object and calling load_annotations()
salsa_data = SALSA.load_annotations("data/salsa.json")

print(salsa_data[0])

You may also use the SALSA data directly in its JSON format; for a full tutorial on the Thresh data tools, please see load_data.ipynb.
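
If you prefer to skip the Thresh wrapper entirely, here is a minimal sketch that reads the raw file with the standard library (it assumes data/salsa.json is a JSON array of annotation objects; see load_data.ipynb for the exact schema):

import json

# Read the raw SALSA annotations without the Thresh library
with open("data/salsa.json") as f:
    raw_data = json.load(f)

# Inspect the size of the dataset and the fields of the first entry
print(len(raw_data))
print(sorted(raw_data[0].keys()))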

LENS-SALSA Automatic Evaluation

You can plug and play our LENS-SALSA metric (LENS 🔎 fine-tuned on SALSA 💃 edit-level annotations) in your simplification project! LENS-SALSA supports both reference-based and reference-free evaluation.

We re-implemented the original multi-reference LENS using the new COMET implementation, then modeled our design on the unified-objective COMET-22 WMT submission. See lens-salsa for the source code and instructions for replicating the training setup.

Setup & Usage

pip install lens-metric

from lens import download_model, LENS_SALSA

lens_salsa_path = download_model("davidheineman/lens-salsa") # see https://huggingface.co/davidheineman/lens-salsa
lens_salsa = LENS_SALSA(lens_salsa_path)

complex = [
    "They are culturally akin to the coastal peoples of Papua New Guinea."
]
simple = [
    "They are culturally similar to the people of Papua New Guinea."
]

scores, word_level_scores = lens_salsa.score(complex, simple, batch_size=8, devices=[0])
print(scores) # [72.40909337997437]

# LENS-SALSA also returns error-identification tags; recover_output() returns the tagged output
tagged_output = lens_salsa.recover_output(word_level_scores, threshold=0.5)
print(tagged_output)
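
The score() call takes aligned lists, so an entire set of system outputs can be scored in one pass. Below is a minimal sketch reusing the lens_salsa object from the setup above; the sentence pairs are placeholders:

import statistics

complex_sents = [
    "The committee postponed the ratification of the amendment.",
    "He exhibited a remarkable aptitude for mathematics.",
]
simple_sents = [
    "The committee delayed approving the change.",
    "He was very good at math.",
]

# Same score() call as above, just with longer aligned lists
scores, word_level_scores = lens_salsa.score(complex_sents, simple_sents, batch_size=8, devices=[0])

# A simple corpus-level summary
print(f"Mean LENS-SALSA score: {statistics.mean(scores):.2f}")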

Analysis & Figures

To replicate the analysis tables and figures from our paper, please refer to analysis.

Cite SALSA

If you find our paper, code, or data helpful, please consider citing our work:

@inproceedings{heineman-etal-2023-dancing,
    title = "Dancing Between Success and Failure: Edit-level Simplification Evaluation using {SALSA}",
    author = "Heineman, David and Dou, Yao and Maddela, Mounica and Xu, Wei",
    editor = "Bouamor, Houda and Pino, Juan and Bali, Kalika",
    booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.emnlp-main.211",
    pages = "3466--3495"
}