LUNA: a Framework for Language Understanding and Naturalness Assessment

The framework provides a set of well-known automated evaluation metrics for text generation tasks.

The library includes the following metrics:

BLANC: paper
MoverScore: paper
BLEU: paper
METEOR: paper
ROUGE: paper
chrF: paper
BERTScore: paper
BARTScore: paper
Data statistics metrics: paper
- Compression
- Coverage
- Length
- Novelty
- Density
- Repetition
ROUGE-WE: paper
S3: paper
BaryScore: paper
DepthScore: paper
InfoLM: paper

Installation

Installation from the source

Clone the repository and install the library from the root:

git clone https://github.com/Moonlight-Syntax/LUNA.git
cd LUNA
pip install .

Alternatively, if you use poetry, run poetry install from the repository root.

Quick start

You can either use the Calculator to evaluate several metrics at once or call individual metrics directly from your own code.

Calculator

The easiest way to evaluate NLG models is to execute the following snippet:

from luna.calculate import Calculator

# depth_score and s3_metrics are LUNA metric instances created beforehand;
# candidates and references are lists of strings.

# Choose to compute in a sequential or a parallel setting
calculator = Calculator(execute_parallel=True)
metrics_dict = calculator.calculate(
    metrics=[depth_score, s3_metrics],
    candidates=candidates,
    references=references,
)

print(metrics_dict)
# {"DepthScore": ..., "S3": ...}
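
To make the sequential versus parallel distinction concrete, here is a minimal stand-alone sketch of how several metrics might be run over the same data and gathered into a dictionary keyed by metric name. It is not LUNA's implementation: run_metrics, ExactMatch and LengthRatio are hypothetical names, and the only thing assumed from this README is that each metric exposes evaluate_batch.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


class ExactMatch:
    """Hypothetical toy metric: 1.0 if the candidate equals the reference, else 0.0."""

    def evaluate_batch(self, hypotheses: List[str], references: List[str]) -> List[float]:
        return [float(h == r) for h, r in zip(hypotheses, references)]


class LengthRatio:
    """Hypothetical toy metric: candidate length over reference length, in words."""

    def evaluate_batch(self, hypotheses: List[str], references: List[str]) -> List[float]:
        return [len(h.split()) / max(len(r.split()), 1) for h, r in zip(hypotheses, references)]


def run_metrics(metrics, candidates, references, parallel=False) -> Dict[str, List[float]]:
    """Evaluate every metric on the same batch; results are keyed by class name."""

    def score(metric):
        return type(metric).__name__, metric.evaluate_batch(candidates, references)

    if parallel:
        # Threads keep the sketch simple; a real tool might use processes instead.
        with ThreadPoolExecutor() as pool:
            return dict(pool.map(score, metrics))
    return dict(score(m) for m in metrics)


candidates = ["the cat sat", "a dog barked loudly"]
references = ["the cat sat", "a dog barked"]
scores = run_metrics([ExactMatch(), LengthRatio()], candidates, references, parallel=True)
print(scores["ExactMatch"])  # [1.0, 0.0]
```

Both modes return the same dictionary here; running in parallel only pays off when individual metrics are slow, as model-based metrics tend to be.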

Integrate the evaluations

All the metrics in the library follow the same interface:

from typing import List, Optional

class Metrics:
    def evaluate_batch(self, hypothesyses: List[str], references: Optional[List[str]]) -> List[float]:
        ...  # metric-specific implementation

    def evaluate_example(self, hypothesys: str, reference: Optional[str]) -> float:
        ...  # metric-specific implementation

Thus, to evaluate your examples, run the following code:

from luna import MetricName  # MetricName stands in for any metric class from the library

metric = MetricName()
result = metric.evaluate_example("Example generated by a bad model", "Gold example")
results = metric.evaluate_batch(
    ["Example 1 generated by a bad model", "Example 2 generated by a bad model"],
    ["Gold example 1", "Gold example 2"],
)
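
Because every metric shares this two-method interface, a custom metric can follow the same shape. The sketch below is a self-contained illustration and not part of LUNA: UnigramPrecision is a hypothetical class, its parameter names are the sketch's own, and it does not inherit from any library base class.

```python
from typing import List, Optional


class UnigramPrecision:
    """Hypothetical metric: share of hypothesis tokens that also occur in the reference."""

    def evaluate_example(self, hypothesis: str, reference: Optional[str]) -> float:
        if reference is None:
            raise ValueError("UnigramPrecision needs a reference")
        hyp_tokens = hypothesis.lower().split()
        ref_tokens = set(reference.lower().split())
        if not hyp_tokens:
            return 0.0
        return sum(tok in ref_tokens for tok in hyp_tokens) / len(hyp_tokens)

    def evaluate_batch(self, hypotheses: List[str], references: Optional[List[str]]) -> List[float]:
        if references is None:
            raise ValueError("UnigramPrecision needs references")
        if len(hypotheses) != len(references):
            raise ValueError("hypotheses and references must have the same length")
        # The batch score is simply the per-example score applied pairwise.
        return [self.evaluate_example(h, r) for h, r in zip(hypotheses, references)]


metric = UnigramPrecision()
print(metric.evaluate_example("the cat sat down", "the cat sat"))  # 0.75
```

Implementing evaluate_batch on top of evaluate_example keeps the two methods consistent; a model-based metric would more likely batch its forward passes instead.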

Development

Contribute to the library

We welcome issues and pull requests. We hope that LUNA's functionality is broad enough, but it can always be extended and improved.

Pre-commit hooks

We use pre-commit hooks to check the code before committing.

To install the hooks, run the following:

pip install pre-commit
pre-commit install

After that, every commit triggers standard code-style checks such as black and isort.

Tests

Tests for luna are located in the tests directory. To run them, execute:

pytest tests
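
When contributing a new metric, one check that may be worth adding is that batch evaluation agrees with per-example evaluation. The sketch below is an assumption about what such a test could look like, not a test taken from the repository; WordOverlap is a hypothetical stand-in for a real metric.

```python
from typing import List


class WordOverlap:
    """Hypothetical stand-in metric: Jaccard overlap of the two word sets."""

    def evaluate_example(self, hypothesis: str, reference: str) -> float:
        hyp, ref = set(hypothesis.split()), set(reference.split())
        union = hyp | ref
        return len(hyp & ref) / len(union) if union else 0.0

    def evaluate_batch(self, hypotheses: List[str], references: List[str]) -> List[float]:
        return [self.evaluate_example(h, r) for h, r in zip(hypotheses, references)]


def test_batch_matches_single_examples():
    # The batch result should equal the per-example results, in order.
    metric = WordOverlap()
    hypotheses = ["a b c", "x y"]
    references = ["a b d", "x y"]
    expected = [metric.evaluate_example(h, r) for h, r in zip(hypotheses, references)]
    assert metric.evaluate_batch(hypotheses, references) == expected


def test_identical_texts_score_one():
    assert WordOverlap().evaluate_example("same text", "same text") == 1.0
```

pytest collects functions whose names start with test_, so a file like this placed in the tests directory would be picked up by the command above.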
