WebIsALOD: Providing Hypernymy Relations extracted from the Web as Linked Open Data

This repository contains all the code used for the WebIsALOD paper.

Abstract

Hypernymy relations are an important asset in many applications, and a central ingredient to Semantic Web ontologies. The IsA database is a large collection of such hypernymy relations extracted from the Common Crawl. In this paper, we introduce WebIsALOD, a Linked Open Data version of the IsA database, containing 11.7M hypernymy relations, each provided with rich provenance information. As the original dataset contained more than 80% wrong, noisy extractions, we run a machine learning algorithm to assign confidence scores to the individual statements.

Structure of the files

All files starting with a number are scripts that generate the CSV files, the mappings, and the N-Quads output. The files starting with mTurk are HTML surveys used to generate the ground truth. Files named "webisa_{threshold}_sample_results" contain the samples for the corresponding threshold, together with the majority vote and the answer of each worker. webisa_1_sentence_results.csv contains the results from the mapping to Wikipedia pages and categories.
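The naming scheme for the sample files can be used to pick them out programmatically. The following is a minimal sketch, assuming the threshold in the file name is always a non-negative integer (as in the files shipped here); `threshold_of` and `sample_files_by_threshold` are hypothetical helper names, not part of the repository scripts.

```python
import re

# Pattern derived from the naming scheme "webisa_{threshold}_sample_results".
# Assumption: the threshold is a non-negative integer and the extension is .csv.
SAMPLE_FILE = re.compile(r"^webisa_(\d+)_sample_results\.csv$")


def threshold_of(filename):
    """Return the integer threshold encoded in a sample file name, or None."""
    match = SAMPLE_FILE.match(filename)
    return int(match.group(1)) if match else None


def sample_files_by_threshold(filenames):
    """Map threshold -> file name for all sample result files, ordered by threshold."""
    found = {}
    for name in filenames:
        threshold = threshold_of(name)
        if threshold is not None:
            found[threshold] = name
    return dict(sorted(found.items()))
```

Files that do not follow the pattern, such as webisa_1_sentence_results.csv, are skipped rather than raising an error.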

Most of the CSV files are structured as follows:

id
instance
class
frequency
pidspread
pldspread
ipremod
ilemma
ipostmod
cpremod
clemma
cpostmod
pids
plds
provids
majority voting
yes (counts)
uncertain (counts)
no (counts)
mapping instance to dbpedia page (json array)
mapping instance to dbpedia category (json array)
mapping class to dbpedia page (json array)
mapping class to dbpedia category (json array)
mapping instance to yago (string)
mapping class to yago (string)
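As an illustration of the column layout above, the sketch below maps one CSV row onto named fields and decodes the four DBpedia mapping columns from JSON. It rests on several assumptions that should be checked against the actual files: that rows are comma-separated, that they contain exactly the 25 columns in the order listed, and that no header row is present. The Python-friendly column identifiers (for example `yes_count`) are made up for this example, and all other values are left as strings because their exact encoding is not documented here.

```python
import csv
import io
import json

# Column order as listed above; identifiers with underscores are illustrative names.
COLUMNS = [
    "id", "instance", "class", "frequency", "pidspread", "pldspread",
    "ipremod", "ilemma", "ipostmod", "cpremod", "clemma", "cpostmod",
    "pids", "plds", "provids",
    "majority_voting", "yes_count", "uncertain_count", "no_count",
    "instance_dbpedia_page", "instance_dbpedia_category",
    "class_dbpedia_page", "class_dbpedia_category",
    "instance_yago", "class_yago",
]

# The four DBpedia mapping columns are described as JSON arrays.
JSON_COLUMNS = COLUMNS[19:23]


def parse_row(fields):
    """Turn one list of CSV fields into a dict keyed by column name."""
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    row = dict(zip(COLUMNS, fields))
    for name in JSON_COLUMNS:
        # Assumption: an empty cell means "no mapping" and becomes an empty list.
        row[name] = json.loads(row[name]) if row[name] else []
    return row


def read_rows(text):
    """Parse CSV text (assumed comma-separated, no header) into a list of dicts."""
    return [parse_row(fields) for fields in csv.reader(io.StringIO(text))]
```

Since only most of the CSV files follow this layout, the length check in `parse_row` is there to fail early on files with a different structure.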

vad_package
10_split_csv_with_header.sh
11_create_final_dataset.py
12_map_dbpedia.py
13_analyse_results.py
14_TransformCsvToRDF.py
1_download_and_extract.sh
2_a_sentences_to_one_file.py
2_b_sentences_sort.sh
2_c_sentences_make_skip_file.py
3_transform_into_csv_with_threshold.py
4_randomSampling.py
5_create_mturk_files.py
6_a_append_mturk_relation_results_to_samples.py
6_b_append_mturk_sentence_results_to_samples.py
7_calculate_cycles.py
8_append_sentences.py
9_prepare_for_analysis.py
README.md
TypeAnalyse.xlsx
confidenceScores.xlsx
countAndJugementOfRelations.xlsx
mTurk_Relation_20.html
mTurk_Sentence.html
ontology.pptx
pattern_details.csv
pattern_regex.csv
utilwebisadb.py
webisa_0_sample_results.csv
webisa_10_sample_results.csv
webisa_1_sample_results.csv
webisa_1_sentence_results.csv
webisa_20_sample_results.csv
webisa_2_sample_results.csv
webisa_3_sample_results.csv
webisa_5_sample_results.csv

sven-h/webisalod

Folders and files

Latest commit

History

Repository files navigation

WebIsALOD: Providing Hypernymy Relations extracted from the Web as Linked Open Data

Abstract

Structure of the files

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages