Excluding results

The not operator specifies an argument that must not occur at a given position within a sequence.

It checks that the argument is absent within the scope of the enclosing function.

Examples

sentence(not a, b): the matched sentence must contain "b" but must not contain "a".

phrase(not a, b): "a" must not occur directly before "b".

phrase(3, not a, b): "a" must not occur within 3 tokens before "b".
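
The positional check described above can be illustrated with a small Python sketch. It is a toy model and not the query engine's implementation. The function name not_before and the parameter window (the number of preceding tokens inspected) are hypothetical, and how the numeric argument of phrase() maps onto such a window is an assumption here.

```python
def not_before(tokens, excluded, target, window=1):
    """Return indices of `target` tokens that have no excluded token
    among the `window` tokens directly before them.

    Hypothetical helper: `window` is an assumption of this sketch,
    not a parameter of the query language.
    """
    hits = []
    for i, token in enumerate(tokens):
        if token != target:
            continue
        # Tokens immediately preceding the candidate match.
        preceding = tokens[max(0, i - window):i]
        if not any(t in excluded for t in preceding):
            hits.append(i)
    return hits


tokens = "the placebo arm and the development arm".split()
# Only the first "arm" (index 2) is kept; the second follows "development".
print(not_before(tokens, {"development"}, "arm"))  # [2]

# With a wider window, "development" three tokens earlier also blocks the match.
print(not_before("development of the arm".split(), {"development"}, "arm", window=3))  # []
```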

Task example: Find the word in specific context

The search for the word arm in news articles about clinical trials returns the following results:

[Screenshot: search results for the word "arm"]

To find the word arm in the sense "group of patients" rather than "part of a company or organization", we can exclude certain words directly before it (for example, "business", "development", "sales" or "commercial"):

phrase(not orn(business, development, sales, commercial), arm)

This query finds the word "arm" in contexts such as "placebo arm", "trial arm" and "arm of the study", but does not find it in "the firm’s global biologics research and development arm".

Search for any arguments except specified ones

The except() function and the / operator specify that any argument other than the listed ones may occur at the given position in the sequence. In most cases they can be used interchangeably.

Examples

case(title)/company = case(title, except(company)) matches all words that start with a capital letter, except for "Company".

lemma(adjective)/fast = lemma(adjective, except(fast)) matches all adjectives, except for "fast".

term(positive)/orn(exceptional, genius) matches all words from the wordclass "positive", except for "exceptional" and "genius".

However, except() cannot be used within functions that do not support nested arguments (for example, term(), regex(), dictword(), knownword(), unknownword() and number()). In such cases, use the / operator instead.

Example

regex("auto\w+")/automobile matches all words starting with "auto", except for "automobile".
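
As a rough illustration of what such an exclusion does, the following Python sketch filters a word list the same way: keep every word that matches the pattern, then drop the listed exceptions. The helper name except_filter is hypothetical, and the exact, case-sensitive comparison is an assumption of this sketch. The query engine's own matching rules may differ.

```python
import re


def except_filter(tokens, pattern, excluded):
    """Keep tokens that fully match `pattern` and are not in `excluded`.

    Hypothetical helper modelling `regex(...)/word` as a simple filter.
    """
    compiled = re.compile(pattern)
    return [t for t in tokens if compiled.fullmatch(t) and t not in excluded]


words = ["automatic", "automobile", "autopilot", "car"]
print(except_filter(words, r"auto\w+", {"automobile"}))  # ['automatic', 'autopilot']
```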

Task example: Find the objects with positive evaluation

In order to find the nouns following the words from the list of positive terms, we can write a query clause:

phrase(0, term(positive), lemma(noun))

However, this query also finds phrases in which a term from the list may have a different meaning in context (for example, "delicate wash", "concrete floor", "sweet fruit").

To exclude them, use the / operator:

phrase(0, term(positive), lemma(noun))/orn("delicate wash", "concrete floor", "sweet fruit")

Difference between not and except()

not checks that the specified argument is absent.

except() finds any word except the specified argument.

Examples

phrase(not orn(business, development, sales, commercial), arm) finds the word "arm", which is not preceded by the words "business", "development", "sales" or "commercial".

phrase(except(orn(business, development, sales, commercial)), arm) finds the sequence of words: any word except "business", "development", "sales" or "commercial", followed by the word "arm".

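
The difference can also be sketched in Python. This is a toy model under stated assumptions, not the engine's behavior. The function names match_not and match_except are hypothetical, tokens are compared as exact strings, and the treatment of a sentence-initial "arm" under not is an assumption. The point it illustrates is that not only tests the preceding position, while except() requires a word there and includes it in the match.

```python
EXCLUDED = {"business", "development", "sales", "commercial"}


def match_not(tokens, excluded, target):
    """Model of phrase(not X, target): the match is the target word alone."""
    spans = []
    for i, token in enumerate(tokens):
        # Assumption: a target with no preceding word also passes the check.
        if token == target and (i == 0 or tokens[i - 1] not in excluded):
            spans.append([token])
    return spans


def match_except(tokens, excluded, target):
    """Model of phrase(except(X), target): the match is a two-word sequence."""
    spans = []
    for i in range(1, len(tokens)):
        # A preceding word must exist and must not be one of the exceptions.
        if tokens[i] == target and tokens[i - 1] not in excluded:
            spans.append([tokens[i - 1], tokens[i]])
    return spans


tokens = "the placebo arm".split()
print(match_not(tokens, EXCLUDED, "arm"))     # [['arm']]
print(match_except(tokens, EXCLUDED, "arm"))  # [['placebo', 'arm']]
```

In this model both functions return nothing for "the development arm", since "development" is on the exclusion list.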