GitHub - LuminosoInsight/semeval-discriminatt: Code for SemEval-2018 Task 10: Capturing Discriminative Attributes

This is Luminoso's entry to SemEval-2018 task 10, "Capturing Discriminative Attributes".

It uses information from ConceptNet, WordNet, Wikipedia, and Google Ngrams as inputs to a simple linear classifier.

This code corresponds to run 3, a late entry to fix a show-stopping bug in producing the test results. Run 3 achieved a test F-score of 73.68%, and can be found as our entry on the post-evaluation leaderboard on CodaLab. The confidence interval of this score overlaps with the high score of 75%.

Input data

The input data is available on Zenodo. Download the Zip file and extract it into discriminatt/more-data.

Reproducing results

To reproduce this result:

Activate a Python 3 environment where you can install packages
Install ConceptNet 5.5. Be warned that this comes with a number of setup steps of its own. You won't need strictly need the database, but you will at least need its data/db/wiktionary.db file, for lemmatizing words.
Run python setup.py develop
Make sure you have the input data in discriminatt/more-data, as described above
Run python discriminatt/classifier.py

The output results come from the full classifier, followed by "ablated" versions of the classifier with features disabled, followed by a simple one-feature heuristic described in our paper.

Name		Name	Last commit message	Last commit date
Latest commit History 90 Commits
discriminatt		discriminatt
scripts		scripts
.gitignore		.gitignore
README.md		README.md
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Input data

Reproducing results

About

Releases

Packages

Contributors 2

Languages

LuminosoInsight/semeval-discriminatt

Folders and files

Latest commit

History

Repository files navigation

Input data

Reproducing results

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages