For some purposes, it may be helpful to retrieve all occurrences of a word in a given grammatical form (for example, to find the word "park" as a noun but not as a verb).

Another frequent task is to find all words that match specific morphological restrictions (for example, to find all singular nouns or all past tense verbs).

To perform these tasks, add a morphological tag to the search query. A morphological tag contains information about a word's part of speech and, optionally, additional morphological categories such as number, gender, or tense. For a full list of supported morphological categories, see Morphological Categories.

Search for a Word in a Given Grammatical Form

To search for a word in a given grammatical form only, use the functions lemma()/form()/stem()/partofspeech() with a morphological tag as the first parameter.

This feature may be especially helpful when dealing with homonyms, i.e. two or more different lexemes which have the same form but are unrelated in meaning (book (noun) — book (verb), can (noun) — can (verb), rear (adjective) — rear (verb), etc.).

Task example: find information about fines

In order to find documents that contain information about fines and penalties, the following query may be used.

[Query screenshot: pdl partofspeech 2]

However, this query also captures occurrences of "fine" as an adjective, which is unwanted. To narrow the search and exclude irrelevant results, add a morphological tag.

[Query screenshot: pdl partofspeech 3]

If a morphological tag consists of several morphological categories (for example, part of speech + gender, number + case, etc.), their values must be joined with an underscore ("_").

Example

lemma(verb_present, park) matches forms of the word "park" as a present tense verb ("difficult to park"), but NOT as a past tense verb ("parked easily") or a noun ("national parks").

The order of the values does not matter: verb_present = present_verb.

Users can specify several alternative morphological tags separated by a vertical bar ("|") character. In this case, the query returns words that match at least one of the listed tags.

Example

lemma(noun|verb, fine) matches forms of the word "fine" as a noun ("a huge fine") or a verb ("bank was fined"), but NOT as an adjective ("works fine").

Morphological tags are case-insensitive: verb_past = Verb_Past = VERB_PAST.
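The tag rules above (underscore-joined values, order independence, "|" alternatives, case insensitivity) can be illustrated with a small Python sketch. This is not the search engine's implementation: parse_tag and same_tag are hypothetical helper names introduced here only to show how two tags written differently can denote the same restriction.

```python
from typing import FrozenSet, Set


def parse_tag(tag: str) -> Set[FrozenSet[str]]:
    """Parse a morphological tag into a set of alternatives.

    Hypothetical helper: each "|"-separated alternative becomes a frozenset
    of lower-cased values, so value order and letter case are ignored.
    """
    alternatives = set()
    for alternative in tag.split("|"):
        values = frozenset(
            value.strip().lower()
            for value in alternative.split("_")
            if value.strip()
        )
        if values:
            alternatives.add(values)
    return alternatives


def same_tag(first: str, second: str) -> bool:
    """Return True if both tags describe the same set of alternatives."""
    return parse_tag(first) == parse_tag(second)


print(same_tag("verb_present", "present_verb"))  # True
print(same_tag("verb_past", "VERB_PAST"))        # True
print(same_tag("noun|verb", "verb|noun"))        # True
print(same_tag("verb_past", "verb_present"))     # False
```

Representing each alternative as a set is a design choice of this sketch; it makes the order and case rules fall out of ordinary set equality.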

Search for Words with Specific Morphological Values

To find words with specific morphological values, use the functions lemma()/form()/stem()/partofspeech() with a morphological tag as the only argument.

Example

partofspeech(noun) matches all nouns;

partofspeech(verb_past) matches all verbs in past tense ("announced", "been", "launched", etc.);

partofspeech(noun_plural|pronoun_singular) matches plural nouns ("women", "solutions", etc.) or singular pronouns ("I", "it", "she", "my", etc.);

partofspeech(adjective_superlative) matches superlative adjectives ("largest", "best", "worst", etc.).

When called with a morphological tag as the only argument, the functions lemma()/form()/stem()/partofspeech() behave identically and can be used interchangeably.

Example

partofspeech(noun) = form(noun) = stem(noun) = lemma(noun) matches all nouns;

partofspeech(verb_past) = form(verb_past) = stem(verb_past) = lemma(verb_past) matches all past tense verbs.
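The matching semantics described in this section can be sketched in Python over a toy list of annotated tokens. Everything here is an assumption made for illustration: the Token type, its features field, the sample annotations, and the tag_matches and search helpers are hypothetical and do not reflect how the engine stores or evaluates morphological data.

```python
from typing import FrozenSet, List, NamedTuple, Optional


class Token(NamedTuple):
    text: str
    lemma: str
    features: FrozenSet[str]  # hypothetical annotation, e.g. {"verb", "past"}


def tag_matches(tag: str, features: FrozenSet[str]) -> bool:
    """True if at least one "|" alternative has all of its values in features."""
    for alternative in tag.lower().split("|"):
        values = {value for value in alternative.split("_") if value}
        if values and values <= features:
            return True
    return False


def search(tokens: List[Token], tag: str, lemma: Optional[str] = None) -> List[str]:
    """Return the texts of tokens that satisfy the tag and, if given, the lemma."""
    return [
        token.text
        for token in tokens
        if tag_matches(tag, token.features)
        and (lemma is None or token.lemma == lemma)
    ]


# Hand-annotated sample data (assumed, not produced by a real tagger).
TOKENS = [
    Token("parks", "park", frozenset({"noun", "plural"})),
    Token("parked", "park", frozenset({"verb", "past"})),
    Token("fined", "fine", frozenset({"verb", "past"})),
    Token("fine", "fine", frozenset({"adjective"})),
    Token("largest", "large", frozenset({"adjective", "superlative"})),
]

print(search(TOKENS, "verb_past"))              # ['parked', 'fined']
print(search(TOKENS, "noun|verb", "fine"))      # ['fined']
print(search(TOKENS, "Adjective_Superlative"))  # ['largest']
```

With a tag alone, the sketch filters on morphology only, which mirrors why the four functions are interchangeable in that case; passing a lemma as well mirrors a call such as lemma(noun|verb, fine).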

Task example: basic analysis of customer feedback

For a surface-level analysis of customer feedback on a product or service, it may be helpful to search for all superlative adjectives.

[Query screenshot: pdl partofspeech 4]

Note

A function with several arguments is equivalent to several single-argument functions joined with the or operator:

partofspeech(conference, exhibition, show) = partofspeech(conference) or partofspeech(exhibition) or partofspeech(show).
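The equivalence in this note can be shown as a plain string rewrite. The Python sketch below only builds the expanded query text; expand_to_or is a hypothetical helper name, and the sketch does not call or validate anything against the search engine.

```python
from typing import Iterable


def expand_to_or(function: str, arguments: Iterable[str]) -> str:
    """Rewrite a multi-argument call as single-argument calls joined with "or"."""
    return " or ".join(f"{function}({argument})" for argument in arguments)


print(expand_to_or("partofspeech", ["conference", "exhibition", "show"]))
# partofspeech(conference) or partofspeech(exhibition) or partofspeech(show)
```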