APPENDIX
  1. Examples of Simple XPDL Rules

  2. Examples of Hierarchical XPDL Rules

Example 1

The rule extracts heart rate values. The query matches the words "heart rate", "pulse rate", "pulse" or an upper-case "HR", followed by a form of the verb "be", a colon or the word "of", followed by a two- or three-digit number.

The heart rate value forms a named group "heart_rate" and goes into the output as "HeartRate".

XPDL rule

Rule fragment
rule: heart_rate
{
    query: phrase(0, phrase(0, heart or pulse, rate) or case(upper, [HR]) or pulse,
                  be or ":" or of,
                  {length(2, 3, char(num))}:heart_rate)

    result: HeartRate = $heart_rate
}

Text

Vitals - T:97.6 BP:116/64 HR:74 RR:18 02 sat:94

Result

[Image: result of the heart_rate rule applied to the text above]
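
For readers more familiar with regular expressions, the Python sketch below approximates what the heart_rate rule matches. It is not XPDL and does not show how the rule is executed: the function name `extract_heart_rates` and the explicit list of "be" forms are assumptions made for illustration, and a regular expression cannot reproduce the linguistic matching that XPDL performs.

```python
import re

# Keyword: "heart rate", "pulse rate" or "pulse" in any case, or an upper-case "HR".
KEYWORD = r"(?:\b(?i:(?:heart|pulse)\s+rate|pulse)\b|\bHR\b)"
# Link: a form of "be" (assumed list), the word "of", or a colon.
LINK = r"(?:(?i:is|are|was|were|be|been|being|of)\b|:)"
# Value: exactly two or three digits, captured as the named group "heart_rate".
VALUE = r"(?P<heart_rate>\d{2,3})(?!\d)"

HEART_RATE = re.compile(KEYWORD + r"\s*" + LINK + r"\s*" + VALUE)


def extract_heart_rates(text):
    """Return every captured heart rate value as a string."""
    return [m.group("heart_rate") for m in HEART_RATE.finditer(text)]


if __name__ == "__main__":
    print(extract_heart_rates("Vitals - T:97.6 BP:116/64 HR:74 RR:18 02 sat:94"))
```

On the sample text, only "HR:74" satisfies all three parts of the pattern, so the other vitals are ignored.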

Example 2

The rule extracts International Standard Serial Numbers (ISSN), which are unique codes used by publishers to identify a serial publication. The rule looks for the word "ISSN" followed by a sequence of four digits, a dash, and then either another four digits or three digits and "x".

The ISSN forms a named group "issn" and goes into the output as "Number", while the attribute "Type" receives a constant value "ISSN".

XPDL rule

Rule fragment
rule: ISSN
{
    query: sfollow(case("ISSN"), not char(rb),
                   {phrase(0, length(4, 4, char(digit)), char("-"),
                           length(4, 4, char(digit)) or phrase(0, length(3, 3, char(digit)), "x"))}:issn)

    result: Number = $issn
    attribute: Type = "ISSN"
}

Text

Health Data Management, v 12, n 9, p 38
September 2004
DOCUMENT TYPE: Journal ISSN: 1069-5699 (United States)
LANGUAGE: English RECORD TYPE: Fulltext
WORD COUNT: 546

Result

[Image: result of the ISSN rule applied to the text above]
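
As a rough analogue of the ISSN rule, the Python sketch below uses a regular expression with the same shape: the keyword, then four digits, a dash, and either four digits or three digits and "x". The function name `extract_issns` is hypothetical, and allowing only punctuation or spaces between the keyword and the number is a simplifying assumption rather than the exact behavior of `sfollow`.

```python
import re

ISSN_PATTERN = re.compile(
    r"\bISSN\b[^\w\n]*"                     # keyword, then optional punctuation or spaces
    r"(?P<issn>\d{4}-(?:\d{4}|\d{3}[xX]))"  # NNNN-NNNN or NNNN-NNNx
    r"(?!\d)"                               # do not stop in the middle of a longer number
)


def extract_issns(text):
    """Return one record per match, mirroring the rule's result and attribute."""
    return [
        {"Number": m.group("issn"), "Type": "ISSN"}
        for m in ISSN_PATTERN.finditer(text)
    ]


if __name__ == "__main__":
    sample = "DOCUMENT TYPE: Journal ISSN: 1069-5699 (United States)"
    print(extract_issns(sample))
```

The sketch checks only the format of the number; like the rule, it does not validate the ISSN itself.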

Example 3

The rule extracts geoadministrative entities using the context. It looks for the phrase "road to" followed by the repeated occurrence (from one to five repetitions) of the following elements:

  • A title-cased word that is unknown to the Morphology dictionary and consists of alphabetic characters

  • One of the prepositions "de", "do", "du" or "da"

The name of a geoadministrative entity forms a named group "geoadm", which goes into the output as "Name", while the attribute "Category" receives a constant value "Geolocation". The rule also sets "confidence" to 0.7.

XPDL rule

Rule fragment
rule: road_to
{
    query: phrase(0, road, to,
                  {repeat(1, 5, char(alpha, case(title, unknownword(Morphology))) or orn(de, do, du, da))}:geoadm)

    confidence: 0.7
    result: Name = $geoadm
    attribute: Category = "Geolocation"
}

Text

Outside of town, on the road to Aldeia da Mata, is one of the best-preserved dolmens in Portugal

Result

[Image: result of the road_to rule applied to the text above]
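
The repetition logic of the road_to rule can be sketched in Python as a token scan. This is an illustration only: the function name `road_to_names` is hypothetical, and because no morphology dictionary is available here, the sketch accepts any title-cased alphabetic word instead of checking that the word is unknown to the dictionary.

```python
import re

PREPOSITIONS = {"de", "do", "du", "da"}


def road_to_names(text, max_tokens=5):
    """Collect up to max_tokens name tokens that directly follow "road to"."""
    # Split into runs of letters, runs of digits, and single punctuation marks.
    tokens = re.findall(r"[^\W\d_]+|\d+|[^\w\s]", text)
    names = []
    for i in range(len(tokens) - 2):
        if tokens[i].lower() == "road" and tokens[i + 1].lower() == "to":
            name = []
            for token in tokens[i + 2 : i + 2 + max_tokens]:
                is_title_word = token.isalpha() and token.istitle()
                if is_title_word or token in PREPOSITIONS:
                    name.append(token)
                else:
                    break  # the first token that does not fit ends the name
            if name:  # at least one repetition is required
                names.append(" ".join(name))
    return names


if __name__ == "__main__":
    sample = ("Outside of town, on the road to Aldeia da Mata, "
              "is one of the best-preserved dolmens in Portugal")
    print(road_to_names(sample))
```

In the sample sentence the scan stops at the comma after "Mata", which is why the name ends there.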

Example 4

The rule extracts positive and negative feedback on customer service. The head rule "service_quality" looks for the words "service" or "customer service" and filters out all the texts that do not contain them. The rule has two child rules: "positive" and "negative".

The rule "positive" looks for a sequence of a positive adjective (e.g. "excellent") and the head rule match. The sequence forms a named group "m" and goes into the output as "Match", which has the attributes "Evaluation" (evaluative adjective) and "Object" (object of evaluation).

The rule "negative" looks for a sequence of a negative adjective (e.g. "terrible") and the head rule match. Just as in its sister rule, the sequence forms a named group "m" and goes into the output as "Match", which has the attributes "Evaluation" (evaluative adjective) and "Object" (object of evaluation).

XPDL rule

Rule fragment
rule: service_quality
{
    // get only texts with keywords
    query: {phrase(0, optional(customer), service)}:obj

    rule: positive
    {
        // get keywords in positive context: good, excellent, great
        // good customer service
        query: {phrase(0, {possible(orn(good, excellent, great))}:eval, $obj)}:m

        result: Match = $m
        attribute: Evaluation = $eval
        attribute: Object = $obj
    }

    rule: negative
    {
        // get keywords in negative context: bad, terrible, horrible
        // bad customer service
        query: {phrase(0, {possible(orn(bad, terrible, horrible))}:eval, $obj)}:m

        result: Match = $m
        attribute: Evaluation = $eval
        attribute: Object = $obj
    }
}

Text

The product was provided fast, excellent service. Very impressed.

Great service. Delivered as expected.

Broken after a week. Horrible customer service as well. Not a good product at all.

BAD CUSTOMER SERVICE! - Please do not waste your time with them.

Result

[Image: result of the service_quality rule applied to the texts above]
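
The parent-child structure of this example can be imitated in Python: a cheap head check discards texts without the keyword, and each child pattern then reuses the head pattern as its object. The sketch is an approximation under stated assumptions: the function name `service_feedback` and the "Rule" key are hypothetical, and the adjective is treated as required, which may differ from what `possible(...)` does in XPDL.

```python
import re

# Head rule: "service", optionally preceded by "customer".
OBJECT = r"(?:customer\s+)?service"

# Child rules: each one lists the adjectives it accepts.
CHILD_RULES = {
    "positive": ("good", "excellent", "great"),
    "negative": ("bad", "terrible", "horrible"),
}


def service_feedback(text):
    """Return one record per adjective + object sequence found in the text."""
    # Head rule acts as a filter: skip texts that never mention the object.
    if not re.search(r"\b" + OBJECT + r"\b", text, re.IGNORECASE):
        return []
    results = []
    for rule_name, adjectives in CHILD_RULES.items():
        alternation = "|".join(adjectives)
        pattern = re.compile(
            r"\b(?P<eval>" + alternation + r")\s+(?P<obj>" + OBJECT + r")\b",
            re.IGNORECASE,
        )
        for m in pattern.finditer(text):
            results.append({
                "Rule": rule_name,
                "Match": m.group(0),
                "Evaluation": m.group("eval"),
                "Object": m.group("obj"),
            })
    return results


if __name__ == "__main__":
    print(service_feedback("Broken after a week. Horrible customer service as well. "
                           "Not a good product at all."))
```

Note that "good product" in the sample is not reported, because the adjective is not followed by the head rule match.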

Example 5

This rule extracts facts about bankruptcy. The query in the head rule "bankruptcy_context" looks for bankruptcy-related context, which can be a noun (assigned to the named group "noun") or an adjective (assigned to the named group "adj"). The rule acts as a filter that excludes texts unrelated to bankruptcy, which increases execution speed. The head rule has two child rules: "noun" and "adjective".

The rule "noun" looks for company or organization names that occur in the noun context found by the head rule. The rule outputs the query match as "Match" and the name of the bankrupt company or organization as attribute "Company".

The rule "adjective" looks for company or organization names that occur in the adjective context found by the head rule. It has a child exception rule "negative_context", which excludes future-tense contexts, where a company or organization has not become bankrupt yet. This rule outputs the query match as "Match" and the name of the bankrupt company or organization as attribute "Company".

Please note that the rule requires a parent Entity Extraction node.

XPDL rule

Rule fragment
rule: bankruptcy_context
{   // gets only texts with key words and phrases: bankruptcy, bankruptcy filing etc.
    query: {bankruptcy or phrase(0, "bankruptcy", "filing"|"auction"|"protection")}:noun
           or
           {bankrupt insolvent}:adj

    rule: noun
    {   // Find companies and organizations in bankruptcy context with noun keywords
        query: {phrase(0,
                    // Solyndra filed for bankruptcy
                    phrase(2, {entity(companies|organizations)}:bankrupt, "file for"|"declare", $noun)
                    or
                    // bankruptcy of Solyndra LLC
                    phrase($noun, [of], {entity(companies|organizations) / lemma(genitive)}:bankrupt)
                    or
                    // the Solyndra's bankruptcy
                    phrase(0, lemma (genitive, {entity(companies)}:bankrupt), $noun)
               )}:m

        result: Match = $m
        attribute: Company = toentity(companies|organizations, $bankrupt, field:=Name)

    }

    rule: adjective
    // Find companies and organizations in bankruptcy context with adjective keywords
    {
        query: // bankrupt Solyndra
               {phrase({$adj}:context,
                       optional(repeat(1, 4, lemma (noun_nominative adjective participle present))),
                       {entity(companies|organizations)}:bankrupt)
               or
               // Solyndra, a bankrupt solar panel manufacturer
               phrase(3, {entity(companies|organizations)}:bankrupt,
                      [a]|[an],
                      $adj,
                      "company"|"firm"|"group"|"maker"|"manufacturer"|"producer",
                      not(entity(companies|organizations)))}:m

        rule_except: negative_context
        {   // exclude future tense to leave out companies that are not bankrupt yet
            query: phrase(0, "soon-to-be"|lemma(verb, "will"), $m)

            result: Match = $m
            attribute: Company = toentity(companies|organizations, $bankrupt, field:=Name)
        }
    }
}

Text

Vermillion Inc filed for bankruptcy protection on March 30, 2009, when it had less than $1 million in cash left.

Following the bankruptcy of Lehman Brothers, we determined that the interest rate swap was no longer an effective hedge under FASB standards.

The company also bought Solibro, a unit of insolvent German solar group Q-Cells.

ReGen’s bankruptcy filing could very well be the final setback in the company’s prolonged effort to market its controversial Menaflex knee implant in the United States.

US Technologies, a virtually bankrupt investment company, was sued by investors last year for alleged fraud

Result

[Image: result of the bankruptcy_context rule applied to the texts above]
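
The idea behind the exception rule "negative_context" can be illustrated with a small Python sketch: a match of the adjective pattern is kept only if it is not immediately preceded by "soon-to-be" or "will". Everything here is a simplification under stated assumptions. The `COMPANIES` list is a hypothetical stand-in for the Entity Extraction node, "Acme Corp" is an invented name, the function name `bankrupt_companies` is hypothetical, and only the first adjective pattern ("bankrupt Solyndra") is covered.

```python
import re

# Hypothetical stand-in for entities supplied by an Entity Extraction node.
COMPANIES = ["Q-Cells", "Acme Corp"]
COMPANY = "|".join(re.escape(name) for name in COMPANIES)

# Adjective context: "bankrupt" or "insolvent", up to four filler words, a company.
ADJECTIVE_CONTEXT = re.compile(
    r"\b(?:bankrupt|insolvent)\s+"
    r"(?:\w+\s+){0,4}?"
    + r"(?P<company>" + COMPANY + r")"
)

# Exception: "soon-to-be" or "will" directly before the match.
EXCEPTION = re.compile(r"(?:\bsoon-to-be|\bwill)\s+$", re.IGNORECASE)


def bankrupt_companies(text):
    """Return companies found in adjective context, minus the exceptions."""
    companies = []
    for m in ADJECTIVE_CONTEXT.finditer(text):
        if EXCEPTION.search(text[: m.start()]):
            continue  # future context: the company is not bankrupt yet
        companies.append(m.group("company"))
    return companies


if __name__ == "__main__":
    print(bankrupt_companies(
        "The company also bought Solibro, a unit of insolvent German solar group Q-Cells."))
    print(bankrupt_companies(
        "Analysts expect soon-to-be bankrupt Acme Corp to close stores."))
```

The first call keeps its match, while the second is discarded by the exception check.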

Example 6

This rule extracts relationships between a company and its founder. The query in the rule "filter_texts" matches all the texts where relationship participants are found - entities of the type People, Company or Organization. The rule has a child rule "key_words".

The rule "key_words" looks for words and phrases that may refer to the company founders in the texts matched by the head rule ("filter_texts"). For shorter notation, the rule calls a global macro "founder" which lists some of those words. The rule has a child filter rule "singular".

The filter rule "singular" filters matches of its parent rule ("key_words") to keep only those that contain singular nouns. The rule has a child rule "founder_of".

The rule "founder_of" describes a pattern where a person name (stored in the named group "person") is followed by optional elements (e.g. the verb "to be" or the adverb "currently"), the word "founder" or its synonym (stored in the named group "founder_np"), the preposition "of" or "at" and a company name (stored in the named group "company").

The rule outputs the concatenated match elements "person", "founder_np" and "company" as "Match", the name of the founder as "Founder" and the name of the founded company as "Company".

Please note that the rule requires a parent Entity Extraction node.

XPDL rule

Rule fragment
// nouns that designate a founder
macro: founder() = orn([founder], [cofounder], "co-founder")

rule: filter_texts
// get only texts with main components: people, companies, organizations
{
    query: {entity(people, field:= name)}:person or {entity(companies) or entity(organizations)}:company

    rule: key_words
    // in the texts with main components find key words and phrases: founder, founding partner etc.
    {
        query: {macro(founder) or
                phrase(0, [founding] or macro(founder), [and], orn ("partner", "member", "president", "chairman", "chair", [CEO], phrase("board", "member")))}:founder_np
               and
               ($person or $company)

        rule_filter: singular
        // leave only texts with keywords for "founder" in singular
        {
            query: $founder_np & lemma(singular)

            rule: founder_of
            // find main components in the text with keywords
            // Patterns:
            // Michael Bloomberg is the founder of Bloomberg LP
            // Michael Bloomberg, founder of Bloomberg LP
            // Michael Bloomberg is also the founder of Bloomberg LP
            // Michael Bloomberg, founder and CEO of Bloomberg LP
            {
                query: phrase({$person}:founder,
                              optional([is] or [was] or [known as] or char("(") or lemma(verb, become)),
                              optional(also or currently),
                              optional([a] or [the]),
                              {$founder_np}:founder,
                              [of] or [at],
                              {$company}:founder)

                result: Match = concat($founder:person, " ", $founder:founder_np, " ", $founder:company)
                attribute: Founder = toentity(people, $founder:person, field:=name)
                attribute: Company = toentity(companies|organizations, $founder:company, field:=name)
            }
        }
    }
}

Text

Chris Reed, founder of Reed’s Inc., a natural soda company, says his company almost went bankrupt in 2001.

Beal is the founder and chairman of Beal Bank as well as affiliated companies.

Mr. Munshi is also a co-founder and Board Member of Nephrian Inc.

Result

[Image: result of the founder_of rule applied to the texts above]
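
The sequence described by the "founder_of" rule (person, optional linking words, founder phrase, "of" or "at", company) can be sketched as one Python regular expression. This is an illustration under stated assumptions, not the XPDL implementation: the `PEOPLE` and `COMPANIES` lists are hypothetical stand-ins for the Entity Extraction node, the function name `find_founders` is hypothetical, and the "singular" filter and most lemma matching are left out.

```python
import re

# Hypothetical stand-ins for entities supplied by an Entity Extraction node.
PEOPLE = ["Chris Reed", "Beal", "Munshi"]
COMPANIES = ["Reed’s Inc.", "Beal Bank", "Nephrian Inc."]


def alternation(names):
    """Build a regex alternation, longest names first."""
    return "|".join(re.escape(n) for n in sorted(names, key=len, reverse=True))


# "founder" or a synonym, optionally followed by "and" plus a second role.
FOUNDER_NP = (
    r"(?:founder|cofounder|co-founder)"
    r"(?:\s+and\s+(?:partner|member|president|chairman|chair|CEO|board\s+member))?"
)

FOUNDER_OF = re.compile(
    r"\b(?P<person>" + alternation(PEOPLE) + r")"
    r"(?:,|\s+is|\s+was|\s+became)?"      # optional linking element
    r"(?:\s+(?:also|currently))?"         # optional adverb
    r"(?:\s+(?:a|the))?"                  # optional article
    r"\s+(?P<founder_np>" + FOUNDER_NP + r")"
    r"\s+(?:of|at)\s+"
    r"(?P<company>" + alternation(COMPANIES) + r")",
    re.IGNORECASE,
)


def find_founders(text):
    """Return Match, Founder and Company for each founder-of pattern."""
    return [
        {
            "Match": " ".join(m.group("person", "founder_np", "company")),
            "Founder": m.group("person"),
            "Company": m.group("company"),
        }
        for m in FOUNDER_OF.finditer(text)
    ]


if __name__ == "__main__":
    print(find_founders("Beal is the founder and chairman of Beal Bank "
                        "as well as affiliated companies."))
```

As in the rule, "Match" is built by concatenating the three captured parts rather than by copying the matched span, so linking words such as "is the" do not appear in it.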