Commit

renamed gdtools repo to gd_tools
colinbatchelor committed Feb 7, 2024
1 parent 798221f commit 3724616
Showing 7 changed files with 11 additions and 10 deletions.
4 changes: 2 additions & 2 deletions brown_gd_to_conll.py
@@ -7,8 +7,8 @@
import re
import sys
from collections import namedtuple
-from gdtools.acainn import Lemmatizer
-from gdtools.acainn import Features
+from gd_tools.acainn import Lemmatizer
+from gd_tools.acainn import Features
from pyconll.unit import Conll

Split = namedtuple("split", "form1 upos1 xpos1 form2 upos2 xpos2")
2 changes: 1 addition & 1 deletion brown_gd_to_dot_ccg.py
@@ -1,7 +1,7 @@
# -*- coding: utf-8 -*-
import pickle
import sys
-from gdtools.acainn import Lemmatizer, Retagger, Subcat, Typer
+from gd_tools.acainn import Lemmatizer, Retagger, Subcat, Typer

def tidy_word(string):
    """outputs string suitable for XMLification further down the pipeline"""
2 changes: 1 addition & 1 deletion checker.py
@@ -1,6 +1,6 @@
import numpy as np
import pandas as pd
-from gdtools.acainn import Morphology
+from gd_tools.acainn import Morphology

class Checker():
    # for simple matches
2 changes: 1 addition & 1 deletion fix_feats.py
@@ -1,7 +1,7 @@
import re
import sys
import pyconll
-from gdtools.acainn import Features
+from gd_tools.acainn import Features

f = Features()

2 changes: 1 addition & 1 deletion lemmatise.py
@@ -1,7 +1,7 @@
"""Overwrites the lemmata in a CoNLL-U file based on the form and XPOS."""
import sys
import pyconll
-from gdtools.acainn import Lemmatizer
+from gd_tools.acainn import Lemmatizer

corpus = pyconll.load_from_file(sys.argv[1])
l = Lemmatizer()
7 changes: 4 additions & 3 deletions readme.md
@@ -19,6 +19,9 @@ In practice I have postprocessed the results with the following Python 3 scripts
There is one small test tree bank in `ud`:
* `gd_iomasgladh-ud-test.conllu` is a hand-built corpus from 2014 which has been converted to UD.

+The lemmatiser, code to convert ARCOSG parts of speech to UD features and categorial grammar code are now in the https://github.com/colinbatchelor/gd_tools repository.
+
+
Earlier work
--
### gramaran
@@ -37,8 +40,6 @@ Each sentence has three lines beginning with hashes preceding it. These are an I

The guidelines used for the construction of the corpus in LaTeX format. Currently no special packages are used for it.

-
-
* `brown_gd_to_dot_ccg.py` takes a Brown-format corpus assuming ARCOSG tags and outputs a .ccg file
* `mend_xml.py` fixes the output of OpenCCG's ccg2xml.
* `prepareARCOSG.py` takes a local installation of the Annotated Reference Corpus of Scottish Gaelic (ARCOSG), replaces spaces within tokens with underscores and puts the results in `arcosg.pkl`.
@@ -61,5 +62,5 @@ The citation for the material in `ccg` and `gramaran` is:

Colin Batchelor

-2024-02-06
+2024-02-07

2 changes: 1 addition & 1 deletion test_checker.py
@@ -1,5 +1,5 @@
import unittest
-from gdtools.acainn import Morphology
+from gd_tools.acainn import Morphology
from checker import Checker
import numpy as np
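
Scripts outside this commit that still have to run against an older checkout could try both package names. The following is a minimal sketch and not part of the commit: `import_first_available` is a hypothetical helper, and it assumes only that the old `gdtools` and the new `gd_tools` packages expose the same `acainn` module, as the import changes in this commit suggest.

```python
import importlib


def import_first_available(module_names):
    """Return the first module in module_names that can be imported.

    Hypothetical helper for illustration; it is not part of gd_tools.
    """
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Covers ModuleNotFoundError too; move on to the next candidate.
            continue
    raise ImportError(f"none of these modules could be imported: {module_names}")


# Possible usage, preferring the new package name (assumes one is installed):
#     acainn = import_first_available(["gd_tools.acainn", "gdtools.acainn"])
#     Lemmatizer = acainn.Lemmatizer
```

Listing the new name first means the fallback is only exercised where the renamed package is absent, so the shim can be deleted once every environment has been updated.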

