Tswings/AVeriTeC-DCE

Document-level Claim Extraction and Decontextualisation for Fact-Checking

Code implementation of the paper "Document-level Claim Extraction and Decontextualisation for Fact-Checking" (ACL 2024, main conference).

Data

Download the raw data from AVeriTeC.

Components

All the models we rely on are either pre-trained (e.g., BertSum) or approaches that require no training (e.g., BM25).
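To illustrate why a method such as BM25 needs no training, the following is a minimal sketch of Okapi BM25 scoring in plain Python. It is not the repository's code: the README does not say which BM25 implementation is used or at which step, and the function name `bm25_scores`, the parameter defaults, and the example sentences are all assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score each tokenised document against the query with Okapi BM25.

    Hypothetical helper for illustration; k1 and b are common defaults,
    not values taken from the repository.
    """
    n_docs = len(docs_tokens)
    if n_docs == 0:
        return []
    avg_len = sum(len(d) for d in docs_tokens) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs_tokens for term in set(d))
    scores = []
    for doc in docs_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            # Smoothed inverse document frequency (always non-negative).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalisation.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Made-up example sentences, tokenised by whitespace.
sentences = [
    "the minister announced a new tax on sugar",
    "the weather was sunny in london",
    "sugar tax revenue will fund school sports",
]
docs = [s.split() for s in sentences]
query = "sugar tax".split()

scores = bm25_scores(query, docs)
# Indices of the sentences, best match first.
ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranked)
```

Everything the scorer needs (term frequencies, document frequencies, average length) is computed from the input texts themselves, so there are no learned parameters to fit.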

  • Step 1. extracts the URLs available for claim extraction (each URL links to the original web article of the claim) and the corresponding text data from AVeriTeC.
    python 1_extract_texts_from_url.py
  • Step 2. generates high-quality context for decontextualisation.
    • Sentence Ranking: download from here.
    • Text Entailment: download from here.
    • Candidate Answer Extraction: download the spaCy model (python -m spacy download en_core_web_lg) to extract ambiguous information units.
    • Question Generation: download from here.
    • Question Answering: download from here.
    • QA-to-Context: follow this repository to download the QA2D model from here.
    • High-quality Context Generation: download from here.
      python 2_context_generation.py 
  • Step 3. decontextualises candidate central sentences with the generated QA pairs (download the decontextualisation model from here).
    python 3_decontextualisation.py 
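The three steps above can be chained with a small driver script. The sketch below is an assumption about usage rather than part of the repository: it takes the script names from the steps above, assumes they run with no arguments from the repository root (the README lists none), and assumes the required models are already downloaded. The names `STEPS`, `build_commands`, and `run_pipeline` are hypothetical.

```python
import subprocess
import sys

# Script names as listed in Steps 1 to 3; running them without
# arguments is an assumption, since the README shows none.
STEPS = [
    "1_extract_texts_from_url.py",
    "2_context_generation.py",
    "3_decontextualisation.py",
]


def build_commands(python=sys.executable):
    """Return one command (as an argument list) per pipeline step, in order."""
    return [[python, script] for script in STEPS]


def run_pipeline(dry_run=True):
    """Print each command and, unless dry_run is set, execute it.

    check=True stops the pipeline at the first failing step, so a later
    step never runs on incomplete output from an earlier one.
    """
    for cmd in build_commands():
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run by default: only shows what would be executed.
    run_pipeline(dry_run=True)
```

Calling `run_pipeline(dry_run=False)` would execute the scripts in sequence; the dry-run default is a precaution because Step 2 depends on several separately downloaded models.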

Citation

If you find this code useful, please star our repo or consider citing:

@misc{deng2024documentlevel,
      title={Document-level Claim Extraction and Decontextualisation for Fact-Checking}, 
      author={Zhenyun Deng and Michael Schlichtkrull and Andreas Vlachos},
      year={2024},
      eprint={2406.03239},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
