Extraction-relations-and-patterns-from-Macedonian-corpus

Project for the Natural Language Processing course: extracting relations and patterns from a Macedonian corpus

Filtering the corpus: filter_sentences.py
- The filtered corpus is written to sentences_clear
Building 22 DAWG trie structures over the sentences, which store the position of each word in sentences_clear
- The output is written to word-positions/dtrie_i, where i = 1 to 22
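The idea of saving each word's position can be sketched as follows. This is only an illustration: it uses a plain Python dictionary as a stand-in for the DAWG-backed structures, and the function name `build_position_index` and the transliterated toy sentences are assumptions, not taken from the project's scripts.

```python
from collections import defaultdict


def build_position_index(sentences):
    """Map each word to a list of (sentence_index, word_index) positions.

    A plain dict is used here as a stand-in for a DAWG; the lookup idea
    (word -> where it occurs) is the same.
    """
    index = defaultdict(list)
    for s_idx, sentence in enumerate(sentences):
        for w_idx, word in enumerate(sentence.split()):
            index[word].append((s_idx, w_idx))
    return dict(index)


# Hypothetical toy sentences (transliterated), not from the real corpus.
toy_sentences = [
    "skopje e glaven grad na makedonija",
    "ohrid e grad vo makedonija",
]
position_index = build_position_index(toy_sentences)
grad_positions = position_index["grad"]
```

With an index like this, finding every sentence that contains a given word is a single lookup instead of a scan over the whole corpus.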
Processing the pairs.txt file, which the user has created beforehand (the pairs are split on ' ')
- relations_dawg.py contains the formula for pattern length
- relations_dawg2.py contains the formula for pattern frequency (the 10 most frequent patterns)
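A minimal sketch of how pair processing could work is given here. The actual formulas in relations_dawg.py and relations_dawg2.py are not shown in this README, so everything in it is an assumption: a pattern is taken to be the words between the two members of a pair, the length limit `max_len` is a hypothetical parameter, and the names `read_pairs` and `extract_patterns` are invented for illustration.

```python
from collections import Counter


def read_pairs(lines):
    """Parse lines of the form 'word1 word2' into (word1, word2) tuples."""
    pairs = []
    for line in lines:
        parts = line.split()
        if len(parts) >= 2:
            pairs.append((parts[0], parts[1]))
    return pairs


def extract_patterns(sentences, pairs, max_len=4):
    """Count the word sequences found between the two members of a pair.

    Assumption: only patterns of 1..max_len words are kept, and only the
    first occurrence of each pair member in a sentence is considered.
    """
    patterns = Counter()
    for sentence in sentences:
        words = sentence.split()
        for first, second in pairs:
            if first in words and second in words:
                i, j = words.index(first), words.index(second)
                if i < j and 0 < j - i - 1 <= max_len:
                    patterns[" ".join(words[i + 1:j])] += 1
    return patterns


# Hypothetical toy data (transliterated), not from the real corpus.
toy_sentences = [
    "skopje e glaven grad na makedonija",
    "ohrid e grad vo makedonija",
    "bitola e grad vo makedonija",
]
toy_pairs = read_pairs([
    "skopje makedonija",
    "ohrid makedonija",
    "bitola makedonija",
])
pattern_counts = extract_patterns(toy_sentences, toy_pairs)
top_patterns = pattern_counts.most_common(10)  # the 10 most frequent patterns
```

`Counter.most_common(10)` returns the patterns sorted by descending count, which matches the "10 most frequent patterns" idea; the real scripts may rank or filter patterns differently.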

Repository files:
- README.md
- filter_sentences.py
- pairs.txt
- read-corpus.py
- read-corpus_dawg_digram.py
- relations.py
- relations_dawg.py
- relations_dawg2.py
