Section 14 Beyond sentences
This page is based on a page of the wiki for the original SimpleNLG.
We've seen a lot of ways to create individual sentences. But what do you do if you want to put those sentences together, for example, to create a larger paragraph? SimpleNLG can do this using the DocumentElement class. The DocumentElement class is used to define elements that form part of a larger textual structure (documents, sections, paragraphs, sentences, lists).
To create a paragraph, you would combine some DocumentElement instances using createParagraph. To create a section, you would combine DocumentElement instances using createSection. Similarly, to create a list, you could combine these elements using createList, and to create a document, you would use createDocument. Below, we discuss the use of createParagraph and createSection.
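To make the nesting of these elements concrete, here is a toy model in plain Java. It is only a sketch of the idea and not SimpleNLG's implementation: the names DocumentNestingSketch, Kind, Element, add and render are hypothetical, and the separators (a space between sentences, a blank line between blocks) are arbitrary choices that do not reproduce SimpleNLG's own formatting.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these names are SimpleNLG classes or methods.
class DocumentNestingSketch {

    enum Kind { SECTION, PARAGRAPH, SENTENCE }

    static final class Element {
        final Kind kind;
        final String text; // title for a SECTION, sentence text for a SENTENCE
        final List<Element> children = new ArrayList<>();

        Element(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }

        // Adds a child and returns this element so calls can be chained.
        Element add(Element child) {
            children.add(child);
            return this;
        }

        // Renders this element by recursively rendering its children.
        String render() {
            List<String> parts = new ArrayList<>();
            for (Element child : children) {
                parts.add(child.render());
            }
            switch (kind) {
                case SENTENCE:
                    return text;
                case PARAGRAPH:
                    return String.join(" ", parts);
                default: // SECTION: title line, then its blocks (assumed layout)
                    return text + "\n" + String.join("\n\n", parts);
            }
        }
    }

    // Builds a section holding one paragraph of three sentences.
    static String demo() {
        Element paragraph = new Element(Kind.PARAGRAPH, "")
                .add(new Element(Kind.SENTENCE, "Marie gooit de bal."))
                .add(new Element(Kind.SENTENCE, "De bal valt."))
                .add(new Element(Kind.SENTENCE, "Marie raapt hem op."));
        Element section = new Element(Kind.SECTION, "Het verhaal van Marie en de bal")
                .add(paragraph);
        return section.render();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point of the sketch is only the containment: sentences sit inside a paragraph, and paragraphs sit inside a section. In SimpleNLG itself, all of these levels are represented by DocumentElement instances created through the factory methods named above.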
The createParagraph method takes a list of sentences. Alternatively, you can create an empty paragraph and add individual sentences to it via addComponent. Either way, the sentences are placed together as a paragraph.
As a first step, let's add the following to the import statements at the beginning of the file:
import java.util.Arrays;
This will let us pass a list to the createParagraph method later. Okay, now we're ready to format a paragraph. First, we define some sentences:
SPhraseSpec p1 = nlgFactory.createClause("Marie", "gooien", "de bal");
SPhraseSpec p2 = nlgFactory.createClause("de bal", "vallen");
SPhraseSpec p3 = nlgFactory.createClause("Marie", "oprapen", "hem");
Next, we turn these clauses into sentences, which are DocumentElement instances:
DocumentElement s1 = nlgFactory.createSentence(p1);
DocumentElement s2 = nlgFactory.createSentence(p2);
DocumentElement s3 = nlgFactory.createSentence(p3);
We can then pass these elements as a list to the createParagraph method:
DocumentElement par1 = nlgFactory.createParagraph(Arrays.asList(s1, s2, s3)); //[1]
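A side note on Arrays.asList, independent of SimpleNLG: it returns a fixed-size List rather than an array, so no further items can be appended to that list object itself. The sketch below checks this with plain strings standing in for the sentence elements; the class name AsListSketch and its helper methods are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration; strings stand in for the sentence elements.
class AsListSketch {

    // Wraps three items in a fixed-size list, as Arrays.asList does.
    static List<String> sentences() {
        return Arrays.asList("s1", "s2", "s3");
    }

    // Returns true if appending to the list is rejected.
    static boolean isFixedSize(List<String> list) {
        try {
            list.add("s4");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> list = sentences();
        System.out.println(list);              // prints [s1, s2, s3]
        System.out.println(isFixedSize(list)); // prints true
    }
}
```

If you need to add more sentences after the paragraph has been created, add them to the paragraph with addComponent, as described in note [1], rather than to the list.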
And then realise the paragraph:
String output = realiser.realise(par1).getRealisation();
System.out.println(output);
The resulting output is:
Marie gooit de bal. De bal valt. Marie raapt hem op.
Note that in the last couple of steps, we're using the realiser differently than in all of the previous examples: instead of using realiser.realiseSentence(), as we did for individual sentences, we're now using the more general realiser.realise().getRealisation().
Let's say you want to have a bunch of paragraphs, organized together under one section header. To do this, you would use createSection().
With our earlier code in place, you can create a section with a header like this:
DocumentElement section = nlgFactory.createSection("Het verhaal van Marie en de bal");
A paragraph can then be added to this section:
section.addComponent(par1);
You can then realise this section as in the previous example:
// A new variable name is used because output was already declared above.
String sectionOutput = realiser.realise(section).getRealisation();
System.out.println(sectionOutput);
→ For more examples of DocumentElements, look at testsrc/DocumentElementTest.java.
By default, SimpleNLG produces plain text output. SimpleNLG-EnFr, on which SimpleNLG-NL was based, does not have a formatter that outputs HTML marked-up text. If that is required, please take a look at the SimpleNLG HTMLFormatter.
[1] You may also add individual sentences thus:
DocumentElement par1 = nlgFactory.createParagraph();
par1.addComponent(s1); // ...etc.