
SimpleNLG fork by GJdV

This fork is based upon SimpleNLG. The following changes have been made:

Additions

  • Added the feature "IS_CAPITALIZED", which stores whether the first letter of a word is capitalized. The word is decapitalized during processing (realisation) and capitalized again during orthographic realisation. This is useful for capitalized names that still need correct pluralization.
  • Port of SimpleNLG to C#.
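
The capitalization round trip described in the first item can be sketched in plain Java. This is only an illustration under stated assumptions, not the fork's code: the class `CapitalizationSketch`, its `Word` type, and the toy lower-case plural lexicon are all hypothetical, and the flag merely plays the role of the IS_CAPITALIZED feature.

```java
import java.util.Map;

/** Hypothetical illustration of the IS_CAPITALIZED idea; not part of SimpleNLG or this fork. */
final class CapitalizationSketch {

    /** A word in lower-cased base form plus a flag standing in for IS_CAPITALIZED. */
    static final class Word {
        final String base;
        final boolean isCapitalized;

        Word(String base, boolean isCapitalized) {
            this.base = base;
            this.isCapitalized = isCapitalized;
        }
    }

    /** Assumption: a toy lexicon whose keys are lower-case base forms. */
    private static final Map<String, String> IRREGULAR_PLURALS =
            Map.of("child", "children", "mouse", "mice");

    /** Step 1: remember the first-letter capitalization and store the decapitalized form. */
    static Word analyse(String surface) {
        boolean cap = !surface.isEmpty() && Character.isUpperCase(surface.charAt(0));
        String base = cap
                ? Character.toLowerCase(surface.charAt(0)) + surface.substring(1)
                : surface;
        return new Word(base, cap);
    }

    /** Step 2: pluralize the lower-case base form (irregular lookup, then a toy rule). */
    static String pluralize(String base) {
        String irregular = IRREGULAR_PLURALS.get(base);
        if (irregular != null) {
            return irregular;
        }
        return base.endsWith("s") ? base + "es" : base + "s";
    }

    /** Step 3: restore the capital letter, as orthographic realisation would. */
    static String realisePlural(String surface) {
        Word word = analyse(surface);
        String plural = pluralize(word.base);
        return word.isCapitalized
                ? Character.toUpperCase(plural.charAt(0)) + plural.substring(1)
                : plural;
    }

    public static void main(String[] args) {
        System.out.println(realisePlural("Mouse")); // Mice
        System.out.println(realisePlural("Jones")); // Joneses
        System.out.println(realisePlural("child")); // children
    }
}
```

In this sketch, looking up "Mouse" directly in the lower-case lexicon would miss the irregular plural; decapitalizing first and re-capitalizing afterwards avoids that.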

Port to C#

This port to C# was made for better integration into C# projects. The C# code could be optimized further, but it deliberately stays close to the original to ease maintenance when the original SimpleNLG code is updated. The ported unit tests were used to verify the correctness of the port. Currently only the server test fails (see "Known issues").

To remove all dependencies on Java, the HSQLdb database containing the NIHDB lexicon was converted to SQLite. If there is a way to access HSQLdb directly from C# (without IKVM or other wrappers), please let me know. The code contains some commented-out placeholders to accommodate this functionality. To select which mode (HSQL or SQLite) to run, the lexicon-type (used, e.g., in lexicon.properties) has been extended with the options NIH_SQLITE and NIH_HSQL; the original value NIH defaults to NIH_HSQL. The SQLite databases are located in \SimpleNLG\srcCsharp\Resources\NIHLexicon as zip files and should be extracted before running the code.
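
The lexicon-type options can be pictured with a small Java sketch. It is a hedged illustration only: the class `LexiconTypeSketch`, the `Backend` enum, and the property key `LexiconType` are assumptions made for this example and may not match the names used in the actual code or in lexicon.properties. Only the three values NIH, NIH_HSQL and NIH_SQLITE, and the NIH-to-NIH_HSQL default, come from the description above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Locale;
import java.util.Properties;

/** Hypothetical sketch of choosing a database backend from the lexicon type. */
final class LexiconTypeSketch {

    enum Backend { HSQL, SQLITE }

    /** Maps a lexicon-type value to a backend; the legacy value NIH falls back to HSQL. */
    static Backend resolveBackend(String lexiconType) {
        switch (lexiconType.trim().toUpperCase(Locale.ROOT)) {
            case "NIH":        // original value, treated as NIH_HSQL
            case "NIH_HSQL":
                return Backend.HSQL;
            case "NIH_SQLITE":
                return Backend.SQLITE;
            default:
                throw new IllegalArgumentException("Not an NIH lexicon type: " + lexiconType);
        }
    }

    /** Reads the (assumed) key "LexiconType" from properties-formatted text. */
    static Backend fromProperties(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String type = props.getProperty("LexiconType");
        if (type == null) {
            throw new IllegalArgumentException("Missing LexiconType entry");
        }
        return resolveBackend(type);
    }

    public static void main(String[] args) {
        System.out.println(fromProperties("LexiconType=NIH_SQLITE\n")); // SQLITE
        System.out.println(fromProperties("LexiconType=NIH\n"));        // HSQL
    }
}
```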

This port also includes the sources of LexAccess (v2016) and some dependencies from Lexical Tools (v2018). Various versions of the NIH Lexicon were converted from HSQLdb to SQLite. Please note their Terms and conditions, which also apply to the C# port of the respective code.

Known issues

  • The client/server setup of SimpleNLG does not work and fails its unit test; this unit test is disabled.
  • NIHDB lexicon functionality is only available through the converted SQLite databases; no proper way has been identified (yet) to use HSQLdb directly from C#.
  • SQLite.Interop.dll is not available for Mono, and building it using Travis has not yet worked out. Hence the 18 unit tests that depend on SQLite fail on Mono (but pass in Visual Studio), and the C# build status has been moved to AppVeyor.

Below you can find the original README from SimpleNLG:

SimpleNLG

SimpleNLG is a simple Java API designed to facilitate the generation of Natural Language. It was originally developed by Ehud Reiter, Professor at the University of Aberdeen's Department of Computing Science and co-founder of Arria NLG. The discussion list for SimpleNLG is on Google Groups.

SimpleNLG is intended to function as a "realisation engine" for Natural Language Generation architectures, and has been used successfully in a number of projects, both academic and commercial. It handles the following:

  • Lexicon/morphology system: The default lexicon computes inflected forms (morphological realisation). We believe this has fair coverage. Better coverage can be obtained by using the NIH Specialist Lexicon (which is supported by SimpleNLG).
  • Realiser: Generates texts from a syntactic form. Grammatical coverage is limited compared to tools such as KPML and FUF/SURGE, but we believe it is adequate for many NLG tasks.
  • Microplanning: Currently limited to simple aggregation; we hope this will grow over time.
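
As a rough illustration of what "simple aggregation" in the microplanning item can mean, the following plain-Java sketch merges consecutive clauses that share a subject into one sentence. It is a toy under stated assumptions and does not use or reproduce SimpleNLG's own aggregation code; the class `AggregationSketch` and its `Clause` type are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy subject-sharing aggregation; not SimpleNLG's implementation. */
final class AggregationSketch {

    /** A clause reduced to a subject string and a predicate string. */
    static final class Clause {
        final String subject;
        final String predicate;

        Clause(String subject, String predicate) {
            this.subject = subject;
            this.predicate = predicate;
        }
    }

    /** Joins runs of consecutive clauses with the same subject using "and". */
    static List<String> aggregate(List<Clause> clauses) {
        List<String> sentences = new ArrayList<>();
        int i = 0;
        while (i < clauses.size()) {
            Clause first = clauses.get(i);
            StringBuilder sentence = new StringBuilder(first.subject)
                    .append(' ')
                    .append(first.predicate);
            int j = i + 1;
            // Absorb following clauses as long as the subject stays the same.
            while (j < clauses.size() && clauses.get(j).subject.equals(first.subject)) {
                sentence.append(" and ").append(clauses.get(j).predicate);
                j++;
            }
            sentences.add(sentence.append('.').toString());
            i = j;
        }
        return sentences;
    }

    public static void main(String[] args) {
        List<Clause> input = List.of(
                new Clause("Mary", "chases the monkey"),
                new Clause("Mary", "eats a banana"),
                new Clause("John", "sleeps"));
        // Prints "Mary chases the monkey and eats a banana." and then "John sleeps."
        aggregate(input).forEach(System.out::println);
    }
}
```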

Current release (English)

The current release of SimpleNLG is V4.4.8 (API). The "official" version of SimpleNLG only produces texts in English. However, versions for other languages are under development; see the Papers and Publications page and the SimpleNLG discussion list for details.

Please note that earlier versions of SimpleNLG have different licensing; in particular, versions before V4.0 cannot be used commercially.

Getting started

For information on how to use SimpleNLG, please see the tutorial and API.

If you have a technical question about using SimpleNLG, please check the SimpleNLG discussion list.

If you wish to be informed about SimpleNLG updates and events, please subscribe to the SimpleNLG announcement list.

If you wish to cite SimpleNLG in an academic publication, please cite the following paper:

If you have other questions about SimpleNLG, please contact Professor Ehud Reiter via email: [email protected].

SimpleNLG for other languages

French: A version of SimpleNLG for French is available from this page.

Italian: The Italian version of SimpleNLG 4 is available from this page.

Spanish: The Spanish version of SimpleNLG 4 is available from this page.

German: Marcel Bollman has been working on an adaptation of SimpleNLG version 3 to German. This is available from this page. Please remember that SimpleNLG version 3 is not licensed for commercial use.

C# implementations of SimpleNLG are also available: one by Gert-Jan de Vries here and a second by Nick Hodge here.

SimpleNLG License

SimpleNLG is licensed under the terms and conditions of the Mozilla Public Licence (MPL).