Daniel Braun edited this page May 7, 2020 · 31 revisions

Introduction

SimpleNLG is a Java API designed to facilitate the generation of natural language. It was originally developed at the University of Aberdeen's Department of Computing Science. This is an adaptation of the English SimpleNLG library to the German language.

The tutorial for the original English library describes SimpleNLG as follows (taken from the SimpleNLG English tutorial and slightly adapted for this German version):

"SimpleNLG can be used to help you write a program which generates grammatically correct German sentences. It’s a library (not an application), written in Java, which performs simple and useful tasks that are necessary for natural language generation (NLG).

SimpleNLG is distributed as a stand-alone jar file, available under the Downloads tab. The jar file contains all the classes you'll need, as well as a lexicon (discussed later).

Because it’s a library, you will need to write your own Java program which makes use of SimpleNLG classes. These classes will allow you to specify the subject of a sentence (‘mein Hund’), the exact verb you want to appear in the sentence (‘jagen’), the object (‘George’), and additional complements (‘im Park’). You can also use SimpleNLG methods to indicate, for example, that you would like the verb to be in the past tense (‘jagte’). If this is already confusing, don't worry -- this tutorial will help you with all of that.

Once you have stipulated what the content of your sentence will be and expressed this information in SimpleNLG terms, SimpleNLG can assemble the parts of your sentence into a grammatical form and output the result. In our example, the resulting output would be "Mein Hund jagte George im Park." Here, SimpleNLG has:

  1. Organized all the different elements into the correct word order for German.
  2. Capitalized the first letter of the sentence.
  3. Made the verb agree with the subject, i.e. inflected it according to the grammatical person and number of the subject, and put it into the past tense.
  4. Put all the words together in a grammatical form.
  5. Inserted the appropriate whitespace between the words of the sentence.
  6. Put a period at the end of the sentence.
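The purely orthographic steps in this list (2, 5 and 6) can be illustrated with a few lines of plain Java. This is only a toy sketch and not SimpleNLG's implementation: the class `SurfaceSketch` and the method `toSentence` are hypothetical names, and the words are passed in already ordered and inflected by hand, which is exactly the work (steps 1, 3 and 4) that the library is meant to do for you.

```java
import java.util.List;

// Hypothetical helper, not part of SimpleNLG: applies only the surface
// steps (capitalisation, whitespace, final period) to words that are
// already in the right order and already inflected.
final class SurfaceSketch {

    static String toSentence(List<String> words) {
        // Step 5: insert a single space between the parts.
        String joined = String.join(" ", words);
        if (joined.isEmpty()) {
            return "";
        }
        // Step 2: capitalise the first letter of the sentence.
        String capitalised =
                Character.toUpperCase(joined.charAt(0)) + joined.substring(1);
        // Step 6: put a period at the end, unless one is already there.
        return capitalised.endsWith(".") ? capitalised : capitalised + ".";
    }
}
```

Calling `SurfaceSketch.toSentence(List.of("mein Hund", "jagte", "George", "im Park"))` returns `Mein Hund jagte George im Park.`; note that the caller had to supply "jagte" rather than "jagen".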

As you can see, SimpleNLG will not choose particular words for you: you will need to specify the words you want to appear in the output and their parts of speech. What SimpleNLG’s library of classes will do for you is create a grammatically correct sentence from the parts of speech you have provided it with. SimpleNLG automates some of the more mundane tasks that all natural language generation (NLG) systems need to perform:

Orthography:

  • Inserting appropriate whitespace in sentences and paragraphs.
  • Pouring – that is, inserting line breaks between words (rather than in the middle of a word) in order to fit text into rows of, for example, 80 characters (or whatever length you choose).
  • Formatting lists such as: "apples, pears and oranges."
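To make the last two orthography tasks concrete, here is a minimal sketch in plain Java. It does not use or reproduce SimpleNLG's own code: the class `OrthographySketch` and the methods `joinList` and `pour` are hypothetical names chosen for illustration, and the greedy wrapping strategy is an assumption about one simple way to pour text.

```java
import java.util.List;

// Hypothetical helpers, not part of SimpleNLG.
final class OrthographySketch {

    // Formats a list as "a, b and c": commas between all items except
    // the last two, which are joined by the given conjunction.
    static String joinList(List<String> items, String conjunction) {
        int n = items.size();
        if (n == 0) {
            return "";
        }
        if (n == 1) {
            return items.get(0);
        }
        String head = String.join(", ", items.subList(0, n - 1));
        return head + " " + conjunction + " " + items.get(n - 1);
    }

    // Pours text into rows of at most `width` characters by breaking
    // only between words. A single word longer than `width` is kept
    // whole on its own row.
    static String pour(String text, int width) {
        StringBuilder out = new StringBuilder();
        int lineLength = 0;
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // happens only for blank input
            }
            if (lineLength > 0 && lineLength + 1 + word.length() > width) {
                out.append('\n'); // word does not fit: start a new row
                lineLength = 0;
            } else if (lineLength > 0) {
                out.append(' ');
                lineLength++;
            }
            out.append(word);
            lineLength += word.length();
        }
        return out.toString();
    }
}
```

For example, `joinList(List.of("apples", "pears", "oranges"), "and")` gives `apples, pears and oranges`, and `pour("Mein Hund jagte George im Park.", 15)` breaks the sentence into the two rows `Mein Hund jagte` and `George im Park.`.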

Morphology:

  • Handling inflected forms – that is, modifying/marking a word/lexeme to reflect grammatical information such as gender, tense, number or person.

Simple Grammar:

  • Ensuring grammatical correctness by, among other things, enforcing noun-verb agreement [1].
  • Creating well-formed verb groups (i.e., verb plus auxiliaries) such as "hat gemacht".
  • Allowing the user to define parts of a sentence or phrase and having SimpleNLG gather those parts together into an appropriate syntactic structure.

For those familiar with the terminology of natural language generation (NLG), SimpleNLG is a realiser for a simple grammar.

[1] Agreement describes how a word’s form sometimes depends on other words that appear with it in a sentence. For example, you don’t say "I is" in English, because "is" cannot be used when the subject is "I". The word "is" is said not to agree with the word "I". The correct form is "I am", even though the verb still has the same function and basic meaning."
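The idea of agreement in the footnote can be sketched as a small lookup in plain Java. This is a toy illustration under stated assumptions, not how SimpleNLG handles morphology: the class `AgreementSketch`, its enums, and the method `presentOfBe` are hypothetical, and only the present tense of the English verb "be" is covered.

```java
// Hypothetical illustration, not part of SimpleNLG: picks the present-tense
// form of English "be" that agrees with the subject's person and number.
final class AgreementSketch {

    enum Person { FIRST, SECOND, THIRD }

    enum GrammaticalNumber { SINGULAR, PLURAL }

    static String presentOfBe(Person person, GrammaticalNumber number) {
        if (number == GrammaticalNumber.PLURAL) {
            return "are"; // we are, you are, they are
        }
        switch (person) {
            case FIRST:
                return "am";  // I am
            case SECOND:
                return "are"; // you are
            default:
                return "is";  // he/she/it is
        }
    }
}
```

With a first-person singular subject such as "I", `presentOfBe` returns `am`, never `is`: the verb's meaning stays the same, and only its form changes with the subject.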

Tutorial table of contents