knownword

Purpose

Finds documents that contain words from a specified dictionary. This function is an alias for the dictword() function.

Syntax

knownword(dict_category [, dict_name][, filter_expression1..]) = dictword(dict_category [, dict_name][, filter_expression1..])

Arguments

The function accepts one required argument, an optional dictionary name, optional filter expressions, and optional named parameters.

dict_category is a required argument that specifies the dictionary category. The argument is case-insensitive and accepts the following values:

  • Companies

  • GeoAdministrative

  • HumanNames

  • Morphology

  • Organizations

  • Phrases

  • Statistics

  • Synonyms

  • StopLists

  • WordClasses

  • UserDictionary

dict_name is an optional argument that specifies the dictionary to search. If the argument is omitted, all dictionaries of the specified category are searched. The named parameter dict can be used instead of the positional dict_name argument.

Thus, knownword(Morphology, Default) = knownword(Morphology, dict:=Default).

In addition, knownword(WordClasses) supports the named parameter class, which specifies an entry in a dictionary of the WordClasses category. Several entries can be listed, separated by a vertical bar, for example, knownword(WordClasses, class:=positive|negative).

The function may also take optional filter expressions. Each one applies a filter to a dictionary column so that only entries satisfying the filter criteria are kept. A filter expression has the following syntax: name operator expression.

Name refers to the dictionary column to which the filter is applied. The dictionary columns that support filtering are listed in the table below:

Dictionary category      Column      Possible values                   Examples
-----------------------  ----------  --------------------------------  --------
Companies/Organizations  Type        see dictionary entry form         dictword(companies, "type=LLC|LLP")
Companies/Organizations  Country     see dictionary entry form         dictword(companies, "country=Germany|Italy")
Companies/Organizations  Industry    see dictionary entry form         dictword(companies, "industry=insurance")
GeoAdministrative        Category    continent, country, region, city  dictword(geoadministrative, "category=country")
GeoAdministrative        World part  see dictionary entry form         dictword(geoadministrative, "category=country", "world part!=asia")
GeoAdministrative        Country     see dictionary entry form         dictword(geoadministrative, "category=city", "country=Germany|Italy")
GeoAdministrative        Region      see dictionary entry form         dictword(geoadministrative, "category=city", "region=california")
GeoAdministrative        Population  non-negative integer              dictword(geoadministrative, "category=city", "population > 100000")
HumanNames               Type        first name, surname               dictword(humannames, "type=first name")
HumanNames               Gender      male, female                      dictword(humannames, "gender=female")
Statistics               Support     non-negative integer              dictword(statistics, "support<100")
Statistics               Frequency   non-negative integer              dictword(statistics, "frequency>10000")

Operator refers to one of the following operators:

  • =

  • <

  • >

  • !=

  • <=

  • >=

Possible values for a filter are shown in the corresponding drop-down list on the dictionary entry form. Alternative values must be separated by a vertical bar ("|").

Each filter expression must be enclosed in quotes. If multiple filter expressions are specified, only dictionary entries that satisfy all of them are matched.
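The filter rules described here (name operator expression, alternatives separated by "|", all filters must hold) can be modelled with a short Python sketch. This is an illustration only, not the product's implementation. The names parse_filter and entry_matches are hypothetical, and several behaviours are assumptions not stated in this reference: values are compared case-insensitively, "!=" with several alternatives means "none of them", the ordering operators compare integers, and an entry without the named column does not match.

```python
import operator
import re

# Two-character operators are listed first so "<=" is not read as "<" then "=".
_FILTER_RE = re.compile(r"^\s*(.+?)\s*(!=|<=|>=|=|<|>)\s*(.+?)\s*$")

_ORDERING_OPS = {
    "<": operator.lt,
    ">": operator.gt,
    "<=": operator.le,
    ">=": operator.ge,
}


def parse_filter(text):
    """Split 'name operator expression' into (column, op, alternatives)."""
    m = _FILTER_RE.match(text)
    if m is None:
        raise ValueError(f"not a filter expression: {text!r}")
    name, op, expr = m.groups()
    # Alternative values are separated by a vertical bar.
    alternatives = [v.strip().lower() for v in expr.split("|")]
    return name.lower(), op, alternatives


def entry_matches(entry, filters):
    """Return True only if the entry satisfies every filter expression."""
    for text in filters:
        column, op, alternatives = parse_filter(text)
        value = entry.get(column)
        if value is None:
            # Assumption: an entry without the column never matches.
            return False
        if op in _ORDERING_OPS:
            # Assumption: ordering operators compare integers.
            if not _ORDERING_OPS[op](int(value), int(alternatives[0])):
                return False
        else:
            found = str(value).lower() in alternatives
            # "=" requires one alternative to match; "!=" requires none to.
            if found != (op == "="):
                return False
    return True


# Made-up entry for illustration; not taken from any real dictionary.
sample = {"category": "city", "country": "Mexico", "population": 250000}
```

Under these assumptions, entry_matches(sample, ["category=city", "country=united states|mexico"]) holds because both filters are satisfied, while adding "population<=100000" to the list makes the same entry fail.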

The function also supports optional named parameters:

  • allow_punct:=yes/no allows or prohibits punctuation between arguments (set to "yes" by default);

  • allow_space:=yes/no allows or prohibits spaces between arguments (set to "no" by default);

  • match:=range extracts a dictionary entry as a whole fragment of text, punctuation marks included.

Returned Value

Documents matching the query.

Examples

knownword(GeoAdministrative, "category=city", "country=united states|mexico") returns all Mexican and American cities listed in the dictionaries of the GeoAdministrative category, e.g. "Seattle", "Chicago", "Mexico";

knownword(HumanNames, "type=first name") returns first names;

knownword(Organizations, Default, "type!=government agency") returns all organizations except government agencies.
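When queries like these are generated from a script, a small helper can assemble the call text from its parts. The Python sketch below is a hypothetical convenience, not part of the product: build_knownword is an invented name, and placing the named parameters after the filter expressions is an assumption, since this reference does not state the required order.

```python
# Categories listed in the Arguments section; compared case-insensitively.
CATEGORIES = {
    "companies", "geoadministrative", "humannames", "morphology",
    "organizations", "phrases", "statistics", "synonyms",
    "stoplists", "wordclasses", "userdictionary",
}


def build_knownword(category, dict_name=None, filters=(), **named):
    """Hypothetical helper: render a knownword(...) query string."""
    if category.lower() not in CATEGORIES:
        raise ValueError(f"unknown dictionary category: {category!r}")
    parts = [category]
    if dict_name is not None:
        parts.append(dict_name)
    # Each filter expression must be enclosed in quotes.
    parts.extend(f'"{f}"' for f in filters)
    # Assumption: named parameters follow the filters, written as name:=value.
    parts.extend(f"{name}:={value}" for name, value in named.items())
    return f"knownword({', '.join(parts)})"
```

For instance, build_knownword("Organizations", "Default", filters=["type!=government agency"]) produces the text of the last example above, and build_knownword("Morphology", dict="Default") produces the named-parameter form knownword(Morphology, dict:=Default).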