TextGen

Hannes Matuschek edited this page Jul 13, 2018 · 7 revisions

Rule based text generation

Although KochMorse is primarily a program for learning Morse code with the Koch method, it also features several "tutors" for practicing the learned code. Initially, KochMorse simply contained a list of several hundred QSOs that were sampled randomly. This provided a basic means of practicing reading real QSOs. The method had a serious shortcoming, however: the QSOs always had the same form and lacked the common CW abbreviations used on air. To overcome these issues, I implemented a relatively powerful rule-based text generator that produces random QSOs. Because it is a general tool for generating random text, the XML rule description is documented here so that users can define their own rules to practice particular situations (e.g., generating random call signs).

Basic file layout

The text generator uses XML to define the rules. All rules (and other definitions or statements) are enclosed inside <rules> tags.

<?xml version="1.0" ?>

<rules>
  <!-- Define your text generation rules here. -->
</rules>

The <rules> tag may contain <rule>, <load>, <if>, <var>, <opt>, <one-of>, <rep>, <any-letter>, <any-number>, <bt>, <bk>, <ar>, <sk>, <t>, <p>, <stop>, <ref>, <apply>.

Load rules

Because text generation rules quickly become complex, rules can be stored in a separate file and loaded with the <load> tag. The <load> tag can only be used directly within a <rules> tag. It takes a file attribute specifying a relative or absolute path to the file to load.

<?xml version="1.0" ?>

<rules>
 <load file="dxcc.xml"/>
</rules>

This example loads all rules defined in the dxcc.xml file within the same directory.

KochMorse also comes with a library of rules that are used to generate QSOs. You can use them by loading the corresponding rule file. For example, <load file=":/qso/dxcc-dl.xml"/> loads rules to sample call-sign prefixes (rule call-prefix-dl), typical names (rule name-dl) and the largest German cities (rule city-dl). Have a look at the files under /qso/ in the source code directory.
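
As a minimal sketch of how the bundled rules might be used, the following loads the German rule file and applies two of the rules named above. It assumes that loaded rules are applied by name with the <apply> tag (described under Rule definition) in the same way as locally defined rules. The surrounding words "name" and "qth" are only illustrative.

```xml
<?xml version="1.0" ?>

<rules>
  <!-- Load the bundled German rule file mentioned in the text above. -->
  <load file=":/qso/dxcc-dl.xml"/>

  <!-- Assumption: rules from a loaded file can be applied by their id. -->
  <t>name <apply rule="name-dl"/> qth <apply rule="city-dl"/></t>
</rules>
```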

Basic text

The simplest element is the text element <t>. It just contains some text to send.

In addition to plain text, a <t> element may contain the tags <any-letter>, <any-number>, <apply>, <bt>, <bk>, <ar>, <sk>, <if>, <one-of>, <opt>, <p>, <stop>, <ref>.

Example

The rules

<?xml version="1.0" ?>

<rules>
 <t>ABC</t>
</rules>

will just generate the text ABC and nothing else. It is also possible to apply some inline rules within a text element (see below). For example, the rules

<?xml version="1.0" ?>

<rules>
 <t>ABC<opt p="0.5">D</opt></t>
</rules>

will generate ABC or ABCD with 50% chance each.

Random letters

The <any-letter> tag generates a single random letter a-z. To randomly select a letter from a specific list of letters, consider using the <one-of> tag.

Example

The rules

<?xml version="1.0" ?>

<rules>
 <rep min="5" max="5"><any-letter/></rep>
</rules>

will generate a "word" of 5 random letters.

Random numbers

The <any-number> tag generates a single random number 0-9 (i.e., one digit). To randomly select a number from a specific list of numbers, consider using the <one-of> tag.

Example

The rules

<?xml version="1.0" ?>

<rules>
 <rep min="1" max="5"><any-number/></rep>
</rules>

will generate a 1-5 digit random number.

Prosigns

The <bt/>, <bk/>, <ar/> and <sk/> tags can be used to generate the associated prosigns.
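
As a brief illustration, prosign tags can be placed inline within a text element, since <t> may contain them. The message text is only an example, and the spaces around the prosign tags are an assumption about how the prosigns should be separated from the neighboring words.

```xml
<?xml version="1.0" ?>

<rules>
  <!-- Separate the parts of the message with BT and end it with AR. -->
  <t>ur rst is 599 <bt/> my name is hannes <ar/></t>
</rules>
```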

Pause

The <p/> tag inserts a 5-dit pause (equivalent to a single-space text, <t> </t>). The <stop/> tag is similar but also causes a line break in the output.
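
A small sketch of both tags follows. The call sign dl1abc is hypothetical, and the example assumes that whitespace between top-level tags is ignored, as the other examples on this page suggest.

```xml
<?xml version="1.0" ?>

<rules>
  <!-- A pause separates the call from the final "k". -->
  <t>cq cq cq de dm3mat</t><p/><t>k</t>

  <!-- Pause and start a new line in the output. -->
  <stop/>

  <!-- dl1abc is a made-up call sign used for illustration only. -->
  <t>dm3mat de dl1abc</t>
</rules>
```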

Optional text

The <opt> tag is like the <t> tag, except that it only produces its content with a chosen probability. The probability can be specified using the optional p attribute. By default, this probability is 50% (i.e., p="0.5").

Example

The rules

<?xml version="1.0" ?>

<rules>
<t>cq cq cq de dm3mat <opt>pse </opt>k</t>
</rules>

will generate either "cq cq cq de dm3mat k" or "cq cq cq de dm3mat pse k" each with 50% probability.

Rule repetition

Sometimes it is necessary to apply rules several times to generate the desired output. The <rep> tag does that. It requires the min attribute, which specifies the minimum number of repetitions. The optional max attribute specifies the maximum number of repetitions. If the max attribute is omitted, the rules are repeated exactly min times.

Example

The rules

<?xml version="1.0" ?>

<rules>
<rep min="1" max="3"><any-number/></rep>
</rules>

will generate a number with 1-3 digits. To repeat a specific text, a <t> element (see Basic text) must be used. The rules

<?xml version="1.0" ?>

<rules>
<rep min="3"><t>cq </t></rep>
</rules>

will generate "cq cq cq ".

List of texts

The <one-of> tag randomly applies exactly one rule from a list of rules to generate text. This makes it possible to implement several alternatives for generating a text, or simply to select a "word" from a list of words. The <one-of> tag may contain several <i> or <t> tags. Each of these tags may take an optional w attribute specifying the relative weight (probability) of selecting that particular item. The difference between a <t> and an <i> item is that any text within the <i> item is ignored (including whitespace). That is, <i> is just a list of rules to be applied, while <t> is a text with some optional inline rules.

Example

The rules

<?xml version="1.0" ?>

<rules>
 <one-of>
  <i w="10"><any-number/> <any-number/> <any-number/></i>
  <t w="20"><any-number/> <any-number/> <any-number/></t>
 </one-of>
</rules>

will send three digits between 0 and 9. The first rule has a relative weight of 10 while the second has a weight of 20, so the first rule is applied with a probability of about 33% and the second with about 67%. Although both rules look identical, the first generates 3-digit numbers of the form "000" while the second generates "0 0 0". As mentioned above, the <i> tag is only a list of rules and ignores any text, so it ignores the whitespace between the rules. In contrast, the <t> tag preserves all text, including the whitespace.
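
The other use mentioned above, selecting a "word" from a list of words, can be sketched as follows. The greetings are only illustrative. All weights are given explicitly here because the default weight of an item without a w attribute is not documented on this page.

```xml
<?xml version="1.0" ?>

<rules>
  <one-of>
    <!-- Relative weights 2:1:1, i.e. 50%, 25% and 25%. -->
    <t w="2">gm</t>
    <t w="1">ga</t>
    <t w="1">ge</t>
  </one-of>
</rules>
```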

Variable definition

From time to time, it is convenient to store some generated text into a variable and reference it later. This can be done with the <var> tag. The required id attribute specifies the variable name. Variables and rules do not share a name space. That is, it is possible to define a rule and a variable with the same name. The content of a variable can be referenced using the <ref> tag, where the variable name is then specified using the var attribute.

Example

The rules

<?xml version="1.0" ?>

<rules>
 <var id="call">dl<any-number/><any-letter/><any-letter/><opt><any-letter/></opt></var>
 
 <ref var="call"/>
</rules>

will generate a German call sign, store it in the variable call, and then reference it.

Conditional rule

It is also possible to apply certain rules only if a variable is defined or has a specific value. This can be done using the <if> tag. It takes the var attribute, which specifies the variable, and an optional matches attribute. If the matches attribute is missing, the enclosed rules are applied whenever the specified variable is defined (irrespective of its content). If the matches attribute is set, the enclosed rules are only applied if the variable content matches the specified value exactly.

Example

The rules

<?xml version="1.0" ?>

<rules>
<var id="name">hannes</var>

<if var="name"><t>fb dr <ref var="name"/></t></if>
</rules>

generates the text "fb dr hannes" if the variable name is defined.
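
The matches attribute can be sketched in the same way. In this variant, which reuses the variable from the example above, only the first <if> contributes its text, because the content of name is exactly "hannes". The name "peter" is a made-up value for illustration.

```xml
<?xml version="1.0" ?>

<rules>
  <var id="name">hannes</var>

  <!-- Applied: the content of "name" matches "hannes" exactly. -->
  <if var="name" matches="hannes"><t>fb dr hannes</t></if>

  <!-- Skipped: "name" is defined, but its content is not "peter". -->
  <if var="name" matches="peter"><t>fb dr peter</t></if>
</rules>
```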

Rule definition

Although any text generation can be implemented with the built-in rules listed above, it is often much more convenient to define your own specialized rules and apply them when needed. This can be done with the <rule> and <apply> tags. The <rule> tag defines a new rule and the <apply> tag applies it. The <rule> tag takes a single attribute, id, specifying the name of the rule. The body of the rule just contains a list of other rules to apply. The <apply> tag takes a single attribute, rule, specifying the name of the rule to apply.

Example

The rules

<?xml version="1.0" ?>

<rules>
  <rule id="cq">
   <rep min="3"><t>cq </t></rep>
  </rule>
  
  <apply rule="cq"/>
</rules>

will first define the rule cq, which is then applied to generate the text "cq cq cq ".
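
Rules and variables can also be combined. The following sketch stores a random call sign in a variable and references it twice from within a rule, so that the same call sign is repeated. It assumes that a variable defined at the top level is visible inside a rule body; the rule name cq-call is hypothetical. The intended result is a call of the form "cq cq cq de dl1ab dl1ab k".

```xml
<?xml version="1.0" ?>

<rules>
  <!-- Generate one random call sign and keep it for later reuse. -->
  <var id="call">dl<any-number/><any-letter/><any-letter/></var>

  <!-- Assumption: the variable "call" can be referenced inside a rule. -->
  <rule id="cq-call">
    <rep min="3"><t>cq </t></rep>
    <t>de <ref var="call"/> <ref var="call"/> k</t>
  </rule>

  <apply rule="cq-call"/>
</rules>
```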