word
Purpose
The function searches for tokens by criteria such as capitalization, morphological category, token type, and alphabet. It duplicates the functionality of the functions char(), lemma(), stem(), form(), and case(), which allows a query to be written more concisely.
Arguments
The function accepts a single argument. When used without arguments, the function matches any token.
The first argument, word_parameters, specifies a part of speech, a modifier, capitalization, alphabet, etc.
The function also accepts the following optional named parameters:
Parameter | Comments |
sentpart | Sets a particular syntactic role. |
length | Finds tokens of a certain length. |
ocr | Finds tokens that were recognized by the PolyAnalyst OCR module. |
modality | Switches the search for tokens expressing modality on or off. |
negate:=yes/no/any | Switches the search for negative contexts on or off. |
junk | Finds junk tokens, such as those containing non-alphabetic characters or an unusually high percentage of consonants. |
nojunk | Excludes junk tokens, such as those containing non-alphabetic characters or an unusually high percentage of consonants. |
case | Makes the search case-sensitive. |
alphabet | Sets a specific alphabet. |
Several parameter values can be combined using the symbols "_" (AND) and "|" (OR), for example, word(noun_upper|adjective) or word(noun_upper_adjective).
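The difference between the two example queries can be illustrated with a small Python model. This is only a sketch, not the actual query engine: the helper names parse_word_parameters and matches are hypothetical, tokens are represented as plain sets of feature labels, and the sketch assumes that "_" (AND) binds more tightly than "|" (OR), which the text above does not state explicitly.

```python
from typing import List, Set


def parse_word_parameters(expression: str) -> List[Set[str]]:
    """Split an expression into OR-alternatives, each a set of AND-ed values.

    Assumption: "_" (AND) binds more tightly than "|" (OR).
    """
    alternatives = []
    for alternative in expression.split("|"):
        values = {value for value in alternative.split("_") if value}
        if values:
            alternatives.append(values)
    return alternatives


def matches(token_features: Set[str], expression: str) -> bool:
    """Return True if the token satisfies every value of at least one alternative."""
    alternatives = parse_word_parameters(expression)
    if not alternatives:
        # Mirrors the documented behavior: no arguments matches any token.
        return True
    return any(required <= token_features for required in alternatives)


# Hypothetical feature sets for two tokens.
capitalized_noun = {"noun", "upper"}
lowercase_adjective = {"adjective", "lower"}

print(matches(capitalized_noun, "noun_upper|adjective"))     # True
print(matches(lowercase_adjective, "noun_upper|adjective"))  # True
print(matches(capitalized_noun, "noun_upper_adjective"))     # False
```

Under these assumptions, word(noun_upper|adjective) accepts either a capitalized noun or any adjective, while word(noun_upper_adjective) requires all three values at once.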
Arguments can also be set using the named parameters lemma/stem/form, for example, word(lemma:=start).
If there is a conflict between the first argument and the named parameter, the named parameter takes precedence.
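The precedence rule can be pictured as a merge in which named parameters overwrite whatever the first argument set. The following Python sketch only models that rule. The function resolve_settings, the dictionary keys, and the values "latin" and "cyrillic" are hypothetical and are not taken from the product.

```python
def resolve_settings(first_argument: dict, named_parameters: dict) -> dict:
    """Merge settings so that named parameters override the first argument."""
    resolved = dict(first_argument)    # start from the first-argument settings
    resolved.update(named_parameters)  # named parameters win on conflict
    return resolved


# Hypothetical conflict: both sources specify an alphabet.
first_argument = {"part_of_speech": "noun", "alphabet": "latin"}
named_parameters = {"alphabet": "cyrillic"}

print(resolve_settings(first_argument, named_parameters))
# {'part_of_speech': 'noun', 'alphabet': 'cyrillic'}
```

Settings that appear in only one of the two sources are kept unchanged. Only the conflicting key takes its value from the named parameter.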