This section is about words (or strings) over a set (alphabet) defined
as finite sequences of symbols (or characters) being elements of the
alphabet. Although it is often required that the underlying set/alphabet be
nonempty, finite and not a proper class, these restrictions are not made in
the current definition df-word, see, for example, s1cli:
. Note that the empty word (i.e., the empty
set) is the only word over an empty alphabet, see 0wrd0.
The set of words over is the free monoid over , where
the monoid law is concatenation and the monoid unit is the empty word.
Besides the definition of words themselves, several operations on words are
defined in this section:
<table border="1" id="word-operations">
<tr>
<th>Name</th><th>Reference</th><th>Explanation</th><th>Example</th>
<th>Remarks</th></tr>
<tr>
<td>Length (or size) of a word </td><td> df-hash: </td>
<td>Operation which determines the number of the symbols
within the word.</td>
<td> </td>
<td> This is not a special definition for words,
but for arbitrary sets.</td></tr>
<tr>
<td>First symbol of a word </td><td> df-fv: </td>
<td>Operation which extracts the first symbol of a word. The result is the
symbol at the first place in the sequence representing the word.</td>
<td> </td>
<td> This is not a special definition for words,
but for arbitrary functions with a half-open range of nonnegative
integers as domain.</td></tr>
<tr>
<td>Last symbol of a word </td><td> df-lsw: </td>
<td>Operation which extracts the last symbol of a word. The result is the
symbol at the last place in the sequence representing the word.</td>
<td> </td>
<td> Note that the index of the last symbol is less by 1 than the length of
the word.</td></tr>
<tr>
<td>Subword (or substring) of a word </td>
<td> df-substr: </td>
<td>Operation which extracts a portion of a word. The result is a
consecutive, reindexed part of the sequence representing the word.</td>
<td> </td>
<td> Note that the last index of the range defining the subword is greater
by 1 than the index of the last symbol of the subword in the sequence
of the original word.</td></tr>
<tr>
<td>Concatenation of two words </td>
<td> df-concat: </td>
<td>Operation combining two words to one new word. The result is a
combined, reindexed sequence build from the sequences representing
the two words.</td>
<td> </td>
<td> Note that the index of the first symbol of the second concatenated
word is the length of the first word in the concatenation.</td></tr>
<tr>
<td>Reversal of a word </td>
<td> df-reverse: </td>
<td>Operation which reverses the order of symbols in a word.</td>
<td> </td>
<td> </td></tr>
<tr>
<td>Cyclical shift of a word </td>
<td> df-csh: </td>
<td>Operation cyclically shifting the symbols by a number
of positions.</td>
<td> </td>
<td> </td></tr>
<tr>
<td>Splicing of a word </td>
<td> df-splice: </td>
<td>Operation which replaces portions of words.</td>
<td> </td>
<td> </td></tr>
<tr>
<td>Singleton word </td>
<td> df-s1: </td>
<td>Constructor building a word of length 1 from a symbol.</td>
<td> </td>
<td> </td></tr>
</table>
Conventions:
<ul>
<li>Words are usually represented by class variable , or if two words
are involved, by and or by and .</li>
<li>The alphabets are usually represented by class variable (because
any arbitrary set can be an alphabet), sometimes also by (especially
as codomain of the function representing a word, because the alphabet is the
set of symbols).</li>
<li>The symbols are usually represented by class variables or ,
or if two symbols are involved, by and or by and .
</li>
<li>The indices of the sequence representing a word are usually represented
by class variable , if two indices are involved (especially for
subwords) by and , or by and .</li>
<li>The length of a word is usually represented by class variables
or .</li>
<li>The number of positions by which to cyclically shift a word is usually
represented by class variables or .</li></ul>