Information Extraction with Oonai

Oonai supports part-of-speech (POS) tagging, named entity recognition (NER) and relation extraction.

The underlying functions are provided either by spaCy or by NLTK + Stanford NLP. The NLTK + Stanford NLP option is the better choice for custom-trained models in different languages. Basic pre-trained models are provided for English, French and Spanish. A custom dataset is also provided for training a French model.

All required files are handled by servetext.py and are located in the following directories:

  • ner/: use ./test.py to check that NER is working properly; use ./train.py to train a new model from ner/stanford-ner/train/training-dataset.bio (our custom French corpus). When training finishes, the model is saved as ner/stanford-ner/ner-model-fr.gz.

  • pos/: use ./test.py to check that POS tagging is working properly.

  • rel/: use ./test.py to check that relation extraction is working properly.
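To inspect a training corpus before running ./train.py, a small reader can help. The sketch below is an illustration rather than part of Oonai: it assumes the .bio file holds one token and its tag per line, separated by whitespace, with a blank line between sentences. The function name read_bio and the sample text are hypothetical.

```python
def read_bio(text):
    """Parse two-column BIO text into sentences of (token, tag) pairs.

    Assumed layout (not confirmed by the Oonai sources): one "token tag"
    pair per line, whitespace-separated, blank line between sentences.
    """
    sentences, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        # The tag is taken to be the last whitespace-separated column.
        parts = line.rsplit(None, 1)
        if len(parts) != 2:
            raise ValueError("expected 'token tag', got: %r" % line)
        current.append((parts[0], parts[1]))
    if current:
        sentences.append(current)
    return sentences


# Hypothetical sample using the tags listed in this document.
sample = "Victor\tI-PER\nHugo\tI-PER\nvisite\tO\nParis\tI-LOC\n\nIl\tO\npart\tO\n"
sentences = read_bio(sample)
```

For a real file, the text would be read first, for example with open(path, encoding="utf-8").read().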

What follows is reference information on the available tags, for both named entities and parts of speech.

/ner: holds the files and functions required for Named Entity Recognition with Python NLTK + Stanford NER Tagger


The available functions convert Stanford NER Tagger output to IOB/BIO format or to an NLTK Tree, and group multi-term entities either from a tagged array or from a BIO-tagged Tree (the latter gives better results).
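The two steps described above (adding BIO prefixes, then grouping multi-term entities) can be sketched as follows. This is a minimal illustration, not the code in ner/: the function names to_bio and group_entities are hypothetical, and the input is assumed to be a list of (token, label) pairs with plain labels such as PERSON and O.

```python
def to_bio(tagged):
    """Add B-/I- prefixes to plain (token, label) pairs; 'O' stays 'O'."""
    bio, previous = [], "O"
    for token, label in tagged:
        if label == "O":
            bio.append((token, "O"))
        elif label == previous:
            # Same label as the previous token: continue the entity.
            bio.append((token, "I-" + label))
        else:
            # A new label starts a new entity.
            bio.append((token, "B-" + label))
        previous = label
    return bio


def group_entities(bio):
    """Merge consecutive B-/I- tokens into (entity text, label) pairs."""
    entities, words, current = [], [], None
    for token, tag in bio:
        prefix, _, label = tag.partition("-")
        if prefix == "I" and label == current:
            words.append(token)
            continue
        # Any other tag closes the entity being built, if there is one.
        if words:
            entities.append((" ".join(words), current))
        if tag == "O":
            words, current = [], None
        else:
            words, current = [token], label
    if words:
        entities.append((" ".join(words), current))
    return entities


tagged = [("Victor", "PERSON"), ("Hugo", "PERSON"), ("was", "O"),
          ("born", "O"), ("in", "O"), ("Besançon", "LOCATION")]
entities = group_entities(to_bio(tagged))
```

Because unprefixed tagger output carries no boundary information, two adjacent entities of the same class are merged into one by to_bio; grouping from data that already carries B-/I- tags avoids this.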

The named entities that can be identified depend on the training set used to build the model. At a minimum, it is customary to tag three kinds of named entity (as in enp_FR.bnf.bio, for instance):

NE tags used in the Europeana corpus of French BnF articles

Tag     Description
I-PER   Person
I-LOC   Location
I-ORG   Organization
O       Other

 

NE classes per corpus for relation extraction with the nltk.sem.relextract module

# Dictionary that associates corpora with NE classes
NE_CLASSES = {
    'ieer': [
        'LOCATION',
        'ORGANIZATION',
        'PERSON',
        'DURATION',
        'DATE',
        'CARDINAL',
        'PERCENT',
        'MONEY',
        'MEASURE',
    ],
    'conll2002': ['LOC', 'PER', 'ORG'],
    'ace': [
        'LOCATION',
        'ORGANIZATION',
        'PERSON',
        'DURATION',
        'DATE',
        'CARDINAL',
        'PERCENT',
        'MONEY',
        'MEASURE',
        'FACILITY',
        'GPE',
    ],
}

(taken from the NLTK source code)
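To show how such NE classes are typically used, here is a simplified pure-Python sketch of pattern-based relation extraction between neighbouring entities. It only imitates the general idea (two entities of given classes, plus a regular expression matched on the text between them); it is not the nltk.sem.relextract implementation. The function extract_relations and the chunk format, a list of (text, label) pairs with label None for ordinary tokens, are assumptions made for the example, and only the conll2002 entry of the dictionary is copied.

```python
import re

# Only the 'conll2002' entry of the dictionary above is reproduced here.
NE_CLASSES = {"conll2002": ["LOC", "PER", "ORG"]}


def extract_relations(chunks, subj_class, obj_class, pattern, corpus="conll2002"):
    """Return (subject, filler, object) triples for neighbouring entities.

    `chunks` is a list of (text, label) pairs; label is None for plain tokens.
    A pair of consecutive entities is kept when their classes match and
    `pattern` is found in the text between them.
    """
    for ne_class in (subj_class, obj_class):
        if ne_class not in NE_CLASSES[corpus]:
            raise ValueError("unknown NE class for %s: %s" % (corpus, ne_class))
    positions = [i for i, (_, label) in enumerate(chunks) if label is not None]
    relations = []
    for left, right in zip(positions, positions[1:]):
        subj_text, subj_label = chunks[left]
        obj_text, obj_label = chunks[right]
        filler = " ".join(text for text, _ in chunks[left + 1:right])
        if (subj_label == subj_class and obj_label == obj_class
                and pattern.search(filler)):
            relations.append((subj_text, filler, obj_text))
    return relations


chunks = [("Victor Hugo", "PER"), ("was", None), ("born", None),
          ("in", None), ("Besançon", "LOC")]
born_in = re.compile(r"\bborn in\b")
relations = extract_relations(chunks, "PER", "LOC", born_in)
```

Validating the requested classes against the corpus entry catches mismatches such as asking for PERSON when the corpus uses PER.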

NE Tags used in our custom corpus

Tag     Description
I-PER   Person
I-LOC   Location
I-GPE   Geopolitical Entity
I-MEA   Measure
I-ORG   Organization
I-PER   Percentage
I-TIM   Time
O       Other

 

/pos: holds the files required for Part-of-Speech Tagging with Python NLTK + Stanford POS Tagger


Several tagsets can be used for POS tagging; the two below are the most common. Where possible, prefer the Universal POS tags.

POS tags used in the Penn Treebank Project

Number Tag Description
1. CC Coordinating conjunction
2. CD Cardinal number
3. DT Determiner
4. EX Existential there
5. FW Foreign word
6. IN Preposition or subordinating conjunction
7. JJ Adjective
8. JJR Adjective, comparative
9. JJS Adjective, superlative
10. LS List item marker
11. MD Modal
12. NN Noun, singular or mass
13. NNS Noun, plural
14. NNP Proper noun, singular
15. NNPS Proper noun, plural
16. PDT Predeterminer
17. POS Possessive ending
18. PRP Personal pronoun
19. PRP$ Possessive pronoun
20. RB Adverb
21. RBR Adverb, comparative
22. RBS Adverb, superlative
23. RP Particle
24. SYM Symbol
25. TO to
26. UH Interjection
27. VB Verb, base form
28. VBD Verb, past tense
29. VBG Verb, gerund or present participle
30. VBN Verb, past participle
31. VBP Verb, non-3rd person singular present
32. VBZ Verb, 3rd person singular present
33. WDT Wh-determiner
34. WP Wh-pronoun
35. WP$ Possessive wh-pronoun
36. WRB Wh-adverb

 

Universal POS tags (used)

These tags mark the core part-of-speech categories. To distinguish additional lexical and grammatical properties of words, use the universal features.

Open class words   Closed class words   Other
ADJ                ADP                  PUNCT
ADV                AUX                  SYM
INTJ               CCONJ                X
NOUN               DET
PROPN              NUM
VERB               PART
                   PRON
                   SCONJ

 

Dictionary

ADJ: adjective

Definition

Adjectives are words that typically modify nouns and specify their properties or attributes:

The oldest French bridge

They may also function as predicates, as in:

The car is green.

The ADJ tag is intended for ordinary adjectives only. See DET for determiners and NUM for (cardinal) numbers. ADJ is used for “proper adjectives” such as European.

Numbers vs. Adjectives: In general, cardinal numbers receive the part of speech NUM, while ordinal numbers (more precisely adjectival ordinal numerals) receive the tag ADJ.

There are words that may traditionally be called numerals in some languages (e.g., Czech) but which are treated as adjectives in our universal tagging scheme. In particular, the adjectival ordinal numerals (note: Czech also has adverbial ones) behave both morphologically and syntactically as adjectives and are tagged ADJ.

Nouns vs. Adjectives: A noun modifying another noun to form a compound noun is given the tag NOUN not ADJ.

Participles: Participles are word forms that may share properties and usage of any of adjectives, nouns, and verbs. Depending on the language and context, they may be classified as any of ADJ, NOUN or VERB.

Adjectival modifiers of adjectives: In general, an ADJ is modified by an ADV (very strong). However, sometimes a word modifying an ADJ is still regarded as an ADJ. These cases include: (i) ordinal numeral modifiers of a superlative adjective (the third oldest bridge) and (ii) when a pair of adjectives form a compound adjectival modifier (an African American mayor).

Examples

  • big

  • old

  • green

  • African

  • incomprehensible

  • first, second, third


ADP: adposition

Definition

Adposition is a cover term for prepositions and postpositions. Adpositions belong to a closed set of items that occur before (preposition) or after (postposition) a complement composed of a noun phrase, noun, pronoun, or clause that functions as a noun phrase, and that form a single structure with the complement to express its grammatical and semantic relation to another unit within a clause.

In many languages, adpositions can take the form of fixed multiword expressions, such as in spite of, because of, thanks to. The component words are then still tagged according to their basic use (in is ADP, spite is NOUN, etc.) and their status as multiword expressions is accounted for in the syntactic annotation.

Note that in Germanic languages, some prepositions may also function as verbal particles, as in give in or hold on. They are still tagged ADP and not PART.

Examples

  • in

  • to

  • during


ADV: adverb

Definition

Adverbs are words that typically modify verbs for such categories as time, place, direction or manner. They may also modify adjectives and other adverbs, as in very briefly or arguably wrong.

There is a closed subclass of pronominal adverbs that refer to circumstances in context, rather than naming them directly; similarly to pronouns, these can be categorized as interrogative, relative, demonstrative etc. Pronominal adverbs also get the ADV part-of-speech tag but they are differentiated by additional features.

Note that in Germanic languages, some adverbs may also function as verbal particles, as in write down or end up. They are still tagged ADV and not PART.

Note that there are words that may be traditionally called numerals in some languages (e.g. Czech) but they are treated as adverbs in our universal tagging scheme. In particular, adverbial ordinal numerals ([cs] poprvé “for the first time”) and multiplicative numerals (e.g. once, twice) behave syntactically as adverbs and are tagged ADV.

Note that there are verb forms such as transgressives or adverbial participles that share properties and usage of adverbs and verbs. Depending on language and context, they may be classified as either VERB or ADV.

Examples

  • very

  • well

  • exactly

  • tomorrow

  • up, down

  • interrogative adverbs: where, when, how, why

  • demonstrative adverbs: here, there, now, then

  • indefinite adverbs: somewhere, sometime, anywhere, anytime

  • totality adverbs: everywhere, always

  • negative adverbs: nowhere, never


AUX: auxiliary

Definition

An auxiliary is a function word that accompanies the lexical verb of a verb phrase and expresses grammatical distinctions not carried by the lexical verb, such as person, number, tense, mood, aspect, voice or evidentiality. It is often a verb (which may have non-auxiliary uses as well) but many languages have nonverbal TAME markers and these should also be tagged AUX. The class AUX also includes copulas (in the narrow sense of pure linking words for nonverbal predication).

Modal verbs may count as auxiliaries in some languages (English). In other languages their behavior is not too different from the main verbs and they are thus tagged VERB.

Note that not all languages have grammaticalized auxiliaries, and even where they exist the dividing line between full verbs and auxiliaries can be expected to vary between languages. Exactly which words are counted as AUX should be part of the language-specific documentation.

Examples

  • Tense auxiliaries: has (done), is (doing), will (do)

  • Passive auxiliaries: was (done), got (done)

  • Modal auxiliaries: should (do), must (do)

  • Verbal copulas: He is a teacher.


CCONJ: coordinating conjunction

Definition

A coordinating conjunction is a word that links words or larger constituents without syntactically subordinating one to the other and expresses a semantic relationship between them.

For subordinating conjunctions, see SCONJ.

Examples

  • and

  • or

  • but


DET: determiner

Definition

Determiners are words that modify nouns or noun phrases and express the reference of the noun phrase in context. That is, a determiner may indicate whether the noun is referring to a definite or indefinite element of a class, to a closer or more distant element, to an element belonging to a specified person or thing, to a particular number or quantity, etc.

Determiners under this definition include both articles and pro-adjectives (pronominal adjectives), which is a slightly broader sense than what is usually regarded as determiners in English. In particular, there is no general requirement that a nominal can be modified by at most one determiner, although some languages may show a strong tendency towards such a constraint. (For example, an English nominal usually allows only one DET modifier, but there are occasional cases of addeterminers, which appear outside the usual determiner, such as [en] all in all the children survived. In such cases, both all and the are given the POS DET.)

Note that the DET tag includes (pronominal) quantifiers (words like many, few, several), which are included among determiners in some languages but may belong to numerals in others. However, cardinal numerals in the narrow sense (one, five, hundred) are not tagged DET even though some authors would include them in quantifiers. Cardinal numbers have their own tag NUM.

Also note that the notion of determiners is unknown in traditional grammar of some languages (e.g. Czech); words equivalent to English determiners may be traditionally classified as pronouns and/or numerals in these languages. In order to annotate the same thing the same way across languages, the words satisfying our definition of determiners should be tagged DET in these languages as well.

It is not always crystal clear where pronouns end and determiners start. Unlike in UD v1, it is no longer required that they be told apart solely on the basis of context. The words can be pre-classified in the dictionary as either PRON or DET, based on their typical syntactic distribution (and morphology, when applicable). Language-specific documentation should list all determiners (it is a closed class) and point out ambiguities, if any.

See also general principles on pronominal words for more tips on how to define determiners. In particular:

  • Articles (the, a, an) are always tagged DET; their PronType is Art.

  • Pronominal numerals (quantifiers) are tagged DET; besides PronType, they also use the NumType feature.

  • Words that behave similarly to adjectives are DET. Similar behavior means:

    • They are more likely to be used attributively (modifying a noun phrase) than substantively (replacing a noun phrase). They may occur alone, though. If they do, it is either because of ellipsis, or because the hypothetical modified noun is something unspecified and general, as in All [visitors] must pay.

    • Their inflection (if applicable) is similar to that of adjectives, and distinct from nouns. They agree with the nouns they modify. Especially the ability to inflect for gender is typical for adjectives and determiners. (Gender of nouns is determined lexically and determiners may be required by the grammar to agree with their nouns in gender; therefore they need to inflect for gender.)

  • Possessives vary across languages. In some languages the above tests put them in the DET category. In others, they are more like a normal personal pronoun in a specific case (often the genitive), or a personal pronoun with an adposition; they are tagged PRON.

Examples

  • articles (a closed class indicating definiteness, specificity or givenness): a, an, the

  • possessive determiners (which modify a nominal): [cs] můj, tvůj, jeho, její, náš, váš, jejich; [en] my, your

  • demonstrative determiners: this as in “I saw this car yesterday.”

  • interrogative determiners: which as in “Which car do you like?”

  • relative determiners: which as in “I wonder which car you like.”

  • quantity determiners (quantifiers): indefinite any, universal all, and negative no as in “We have no cars available.”


INTJ: interjection

Definition

An interjection is a word that is used most often as an exclamation or part of an exclamation. It typically expresses an emotional reaction, is not syntactically related to other accompanying expressions, and may include a combination of sounds not otherwise found in the language.

Note that words primarily belonging to another part of speech retain their original category when used in exclamations. For example, God is a NOUN even in exclamatory uses.

As a special case of interjections, we recognize feedback particles such as yes, no, uhuh, etc.

Examples

  • psst

  • ouch

  • bravo

  • hello


NOUN: noun

Definition

Nouns are a part of speech typically denoting a person, place, thing, animal or idea.

The NOUN tag is intended for common nouns only. See PROPN for proper nouns and PRON for pronouns.

Note that some verb forms such as gerunds and infinitives may share properties and usage of nouns and verbs. Depending on language and context, they may be classified as either VERB or NOUN.

Examples

  • girl

  • cat

  • tree

  • air

  • beauty


NUM: numeral

Definition

A numeral is a word, functioning most typically as a determiner, adjective or pronoun, that expresses a number and a relation to the number, such as quantity, sequence, frequency or fraction.

Note that cardinal numerals are covered by NUM whether they are used as determiners or not (as in Windows Seven) and whether they are expressed as words (four), digits (4) or Roman numerals (IV). Other words functioning as determiners (including quantifiers such as many and few) are tagged DET.

Note that there are words that may be traditionally called numerals in some languages (e.g. Czech) but which are not tagged NUM. Such non-cardinal numerals belong to other parts of speech in our universal tagging scheme, based mainly on syntactic criteria: ordinal numerals are adjectives (first, second, third) or adverbs ([cs] poprvé “for the first time”), multiplicative numerals are adverbs (once, twice) etc.

Examples

  • 0, 1, 2, 3, 4, 5, 2014, 1000000, 3.14159265359

  • one, two, three, seventy-seven

  • I, II, III, IV, V, MMXIV


PART: particle

Definition

Particles are function words that must be associated with another word or phrase to impart meaning and that do not satisfy definitions of other universal parts of speech (e.g. adpositions, coordinating conjunctions, subordinating conjunctions or auxiliary verbs). Particles may encode grammatical categories such as negation, mood, tense etc. Particles are normally not inflected, although exceptions may occur.

Note that the PART tag does not cover so-called verbal particles in Germanic languages, as in give in or end up. These are adpositions or adverbs by origin and are tagged accordingly ADP or ADV. Separable verb prefixes in German are treated analogously.

Note that not all function words that are traditionally called particles in Japanese automatically qualify for the PART tag. Some of them do, e.g. the question particle か / ka. Others (e.g. に / ni, の / no) are parallel to adpositions in other languages and should thus be tagged ADP.

In general, the PART tag should be used restrictively and only when no other tag is possible. The language-specific documentation should list the words classified as PART in the given language.

Examples

  • Possessive marker: [en] ’s

  • Negation particle: [en] not; [de] nicht

  • Question particle: [ja] か / ka (adding this particle to the end of a clause turns the clause into a question); [tr] mu

  • Sentence modality: [cs] ať, kéž, nechť (Let’s do it! If only I could do it over. May you have an enjoyable stay!)


PRON: pronoun

Definition

Pronouns are words that substitute for nouns or noun phrases, whose meaning is recoverable from the linguistic or extralinguistic context.

Pronouns under this definition function like nouns. Note that some languages traditionally extend the term pronoun to words that substitute for adjectives. Such words are not tagged PRON under our universal scheme. They are tagged as determiners in order to annotate the same thing the same way across languages.

It is not always crystal clear where pronouns end and determiners start. Unlike in UD v1, it is no longer required that they be told apart solely on the basis of context. The words can be pre-classified in the dictionary as either PRON or DET, based on their typical syntactic distribution (and morphology, when applicable). Language-specific documentation should list all pronouns (it is a closed class) and point out ambiguities, if any.

See also general principles on pronominal words for more tips on how to define pronouns. In particular:

  • Non-possessive personal, reflexive or reciprocal pronouns are always tagged PRON.

  • Possessives vary across languages. In some languages the above tests put them in the DET category. In others, they are more like a normal personal pronoun in a specific case (often the genitive), or a personal pronoun with an adposition; they are tagged PRON.

Examples

  • personal pronouns: I, you, he, she, it, we, they

  • reflexive pronouns: myself, yourself, himself, herself, itself, ourselves, yourselves, themselves

  • interrogative pronouns: who, what as in What do you think?

  • relative pronouns: who, what as in I wonder what you think. (Note, however, that some relative clause introducing words, such as [en] that are better analyzed as subordinating conjunctions (otherwise known as “complementizers” in the literature), and so are tagged as SCONJ.)

  • indefinite pronouns: somebody, something, anybody, anything

  • total pronouns: everybody, everything

  • negative pronouns: nobody, nothing

  • possessive pronouns (which usually stand alone as a nominal): mine, yours, (his), hers, (its), ours, theirs


PROPN: proper noun

Definition

A proper noun is a noun (or nominal content word) that is the name (or part of the name) of a specific individual, place, or object.

Note that PROPN is only used for the subclass of nouns that are used as names and that often exhibit special syntactic properties (such as occurring without an article in the singular in English). When other phrases or sentences are used as names, the component words retain their original tags. For example, in Cat on a Hot Tin Roof, Cat is NOUN, on is ADP, a is DET, etc.

A fine point is that it is not uncommon to regard words that are etymologically adjectives or participles as proper nouns when they appear as part of a multiword name that overall functions like a proper noun, for example in the Yellow Pages, United Airlines or Thrall Manufacturing Company. This is certainly the practice for the English Penn Treebank tag set.

Acronyms of proper nouns, such as UN and NATO, should be tagged PROPN. Even if they contain numbers (as in various product names), they are tagged PROPN and not SYM: 130XE, DC10, DC-10. However, if the token consists entirely of digits (like 7 in Windows 7), it is tagged NUM.

Examples

  • Mary, John

  • London

  • NATO, HBO


PUNCT: punctuation

Definition

Punctuation marks are non-alphabetical characters and character groups used in many languages to delimit linguistic units in printed text.

Punctuation is not taken to include logograms such as $, %, and §, which are instead tagged as SYM.

Examples

  • Period: .

  • Comma: ,

  • Parentheses: ()


SCONJ: subordinating conjunction

Definition

A subordinating conjunction is a conjunction that links constructions by making one of them a constituent of the other. The subordinating conjunction typically marks the incorporated constituent which has the status of a (subordinate) clause.

We follow Loos et al. 2003 in recognizing these three subclasses as subordinating conjunctions:

  • Complementizers, like [en] that or if

  • Adverbial clause introducers, like [en] when, since, or before (when introducing a clause not a nominal)

  • Relativizers, like [he] še. (Note that these words, which simply introduce a relative clause and normally don’t inflect, need to be distinguished from relative or resumptive pronouns, which have a nominal function within the relative clause and which we analyze as PRON.)

For coordinating conjunctions, see CCONJ.

Examples

  • that as in I believe that he will come.

  • if

  • while


SYM: symbol

Definition

A symbol is a word-like entity that differs from ordinary words by form, function, or both.

Many symbols are or contain special non-alphanumeric characters, similarly to punctuation. What makes them different from punctuation is that they can be substituted by normal words. This involves all currency symbols, e.g. $ 75 is identical to seventy-five dollars.

Mathematical operators form another group of symbols.

Another group of symbols is emoticons and emoji.

Strings that consist entirely of alphanumeric characters are not symbols but they may be proper nouns: 130XE, DC10; others may be tagged PROPN (rather than SYM) even if they contain special characters: DC-10. Similarly, abbreviations for single words are not symbols but are assigned the part of speech of the full form. For example, Mr. (mister), kg (kilogram), km (kilometer), Dr (Doctor) should be tagged as nouns. Acronyms for proper names such as UN and NATO should be tagged as proper nouns.

Characters used as bullets in itemized lists (•, ‣) are not symbols, they are punctuation.

Examples

  • $, %, §, ©

  • +, −, ×, ÷, =, <, >

  • :), ♥‿♥, 😝

  • john.doe@universal.org, http://universaldependencies.org/, 1-800-COMPANY

VERB: verb

Definition

A verb is a member of the syntactic class of words that typically signal events and actions, can constitute a minimal predicate in a clause, and govern the number and types of other constituents which may occur in the clause. Verbs are often associated with grammatical categories like tense, mood, aspect and voice, which can either be expressed inflectionally or using auxiliary verbs or particles.

Note that the VERB tag covers main verbs (content verbs) but it does not cover auxiliary verbs and verbal copulas (in the narrow sense), for which there is the AUX tag. Modal verbs may be considered VERB or AUX, depending on their behavior in the given language. Language-specific documentation should specify which verbs are tagged AUX in which contexts.

Note that participles are word forms that may share properties and usage of adjectives and verbs. Depending on language and context, they may be classified as either VERB or ADJ.

Note that some verb forms such as gerunds and infinitives may share properties and usage of nouns and verbs. Depending on language and context, they may be classified as either VERB or NOUN.

Note that there are verb forms such as converbs (transgressives) or adverbial participles that share properties and usage of adverbs and verbs. Depending on language and context, they may be classified as either VERB or ADV.

Examples

  • run, eat

  • runs, ate

  • running, eating


X: other

Definition

The tag X is used for words that for some reason cannot be assigned a real part-of-speech category. It should be used very restrictively.

A special usage of X is for cases of code-switching where it is not possible (or meaningful) to analyze the intervening language grammatically (and where the dependency relation flat:foreign is typically used in the syntactic analysis). This usage does not extend to ordinary loan words which should be assigned a normal part-of-speech. For example, in he put on a large sombrero, sombrero is an ordinary NOUN.

Examples

  • And then he just xfgh pdl jklw

Extracting Triplets from a tagged Tree

Tagged sentences are commonly represented with three main types of tree. The first and most popular is used by the Treebank parsers (Stanford NLP, OpenNLP) and relies on the Penn Treebank tagset (see above). The second follows the Link Grammar (https://www.link.cs.cmu.edu/link/), which has its own tagset (see Link Grammar below) and includes the French tagged tree. The third is the Minipar parse tree. Tests indicate that a Link Grammar tree representation of a sentence can be parsed more quickly, at least for English and French (see attached paper).

Parsing depends on the tree structure and labels.
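As an illustration of that dependency, the sketch below extracts a (subject, verb, object) triplet from a Penn Treebank style bracketed tree using one simple heuristic: the last noun of the first NP under S, the last verb directly under the VP, and the last noun of the first NP inside the VP. It is a simplified example, not the extraction code in rel/; the functions parse_tree, find_child, last_word and extract_triplet are hypothetical, and nested verb phrases (auxiliaries), coordination and pronoun subjects are not handled.

```python
import re


def parse_tree(text):
    """Parse a bracketed tree into nested lists [label, child, ...]."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    stack = [[]]  # sentinel list that receives the finished root node
    for token in tokens:
        if token == "(":
            stack.append([])          # open a new node
        elif token == ")":
            node = stack.pop()        # close it and attach it to its parent
            stack[-1].append(node)
        else:
            stack[-1].append(token)   # first token is the label, then leaves
    return stack[0][0]


def find_child(node, label):
    """Return the first direct child subtree with the given label, or None."""
    for child in node[1:]:
        if isinstance(child, list) and child and child[0] == label:
            return child
    return None


def last_word(phrase, tag_prefix):
    """Return the last word directly under `phrase` whose tag starts with tag_prefix."""
    words = [child[1] for child in phrase[1:]
             if isinstance(child, list) and len(child) == 2
             and isinstance(child[1], str) and child[0].startswith(tag_prefix)]
    return words[-1] if words else None


def extract_triplet(tree):
    """Return (subject, verb, object) for a simple S -> NP VP sentence, else None."""
    sentence = tree if tree[0] == "S" else find_child(tree, "S")
    if sentence is None:
        return None
    subject_np = find_child(sentence, "NP")
    verb_phrase = find_child(sentence, "VP")
    if subject_np is None or verb_phrase is None:
        return None
    object_np = find_child(verb_phrase, "NP")
    if object_np is None:
        return None
    triplet = (last_word(subject_np, "NN"),
               last_word(verb_phrase, "VB"),
               last_word(object_np, "NN"))
    return triplet if all(triplet) else None


parsed = parse_tree("(ROOT (S (NP (DT The) (JJ big) (NN dog)) "
                    "(VP (VBD chased) (NP (DT a) (NN cat))) (. .)))")
triplet = extract_triplet(parsed)
```

The same traversal would fail on a tree with different labels or nesting, which is why the extraction rules have to be written per tree representation.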

Link Grammar

A connects pre-noun ("attributive") adjectives to following nouns: "The BIG DOG chased me", "The BIG BLACK UGLY DOG chased me".

AA is used in the construction "How [adj] a [noun] was it?". It connects the adjective to the following "a".

AF connects adjectives to verbs in cases where the adjective is fronted, such as questions and indirect questions: "How BIG IS it?"

AL connects a few determiners like "all" or "both" to following determiners: "ALL THE people are here".

AM connects "as" to "much" or "many": "I don't go out AS MUCH now".

AN connects noun-modifiers to following nouns: "The TAX PROPOSAL was rejected".

AZ connects the word "as" back to certain verbs that can take "[obj] as [adj]" as a complement: "He VIEWED him AS stupid".

B serves various functions involving relative clauses and questions. It connects transitive verbs back to their objects in relative clauses, questions, and indirect questions ("The DOG we CHASED", "WHO did you SEE?"); it also connects the main noun to the finite verb in subject-type relative clauses ("The DOG who CHASED me was black").

BI connects forms of the verb "be" to certain idiomatic expressions: for example, cases like "He IS PRESIDENT of the company".

BT is used with time expressions acting as fronted objects: "How many YEARS did it LAST?".

BW connects "what" to various verbs like "think", which are not really transitive but can connect back to "what" in questions: "WHAT do you THINK?"

C links conjunctions to subjects of subordinate clauses ("He left WHEN HE saw me"). It also links certain verbs to subjects of embedded clauses ("He SAID HE was sorry").

CC connects clauses to following coordinating conjunctions ("SHE left BUT we stayed").

CO connects "openers" to subjects of clauses: "APPARENTLY / ON Tuesday , THEY went to a movie".

CP connects paraphrasing or quoting verbs to the wall (and, indirectly, to the paraphrased expression): "///// That is untrue, the spokesman SAID."

CQ connects to auxiliaries in comparative constructions involving s-v inversion: "SHE has more money THAN DOES Joe".

CX is used in comparative constructions where the right half of the comparative contains only an auxiliary: "She has more money THAN he DOES".

D connects determiners to nouns: "THE DOG chased A CAT and SOME BIRDS".

DD connects definite determiners ("the", "his") to certain things like number expressions and adjectives acting as nouns: "THE POOR", "THE TWO he mentioned".

DG connects the word "The" with proper nouns: "the Riviera", "the Mississippi".

DP connects possessive determiners to gerunds: "YOUR TELLING John to leave was stupid".

DT connects determiners to nouns in idiomatic time expressions: "NEXT WEEK", "NEXT THURSDAY".

E is used for verb-modifying adverbs which precede the verb: "He is APPARENTLY LEAVING".

EA connects adverbs to adjectives: "She is a VERY GOOD player".

EB connects adverbs to forms of "be" before an object or prepositional phrase: "He IS APPARENTLY a good programmer".

EC connects adverbs to comparative adjectives: "It is MUCH BIGGER".

EE connects adverbs to other adverbs: "He ran VERY QUICKLY".

EF connects the word "enough" to preceding adjectives and adverbs: "He didn't run QUICKLY ENOUGH".

EI connects a few adverbs to "after" and "before": "I left SOON AFTER I saw you".

EL connects certain words to the word "else": something / everything / anything / nothing , somewhere (etc.), and someone (etc.).

EN connects certain adverbs to expressions of quantity: "The class has NEARLY FIFTY students".

ER is used in the expression "The x-er..., the y-er...". It connects the two halves of the expression together, via the comparative words (e.g. "The FASTER it is, the MORE they will like it").

EZ connects certain adverbs to the word "as", like "just" and "almost": "You're JUST AS good as he is."

FL connects "for" to "long": "I didn't wait FOR LONG".

FM connects the preposition "from" to various other prepositions: "We heard a scream FROM INSIDE the house".

G connects proper noun words together in series: "GEORGE HERBERT WALKER BUSH is here."

GN (stage 2 only) connects a proper noun to a preceding common noun which introduces it: "The ACTOR Eddie MURPHY attended the event".

H connects "how" to "much" or "many": "HOW MUCH money do you have?".

I connects infinitive verb forms to certain words such as modal verbs and "to": "You MUST DO it", "I want TO DO it".

ID is a special class of link-types generated by the parser, with arbitrary four-letter names (such as "IDBT"), to connect together words of idiomatic expressions such as "at_hand" and "head_of_state".

IN connects the preposition "in" to certain time expressions: "We did it IN DECEMBER".

J connects prepositions to their objects: "The man WITH the HAT is here".

JG connects certain prepositions to proper-noun objects: "The Emir OF KUWAIT is here".

JQ connects prepositions to question-word determiners in "prepositional questions": "IN WHICH room were you sleeping?"

JT connects certain conjunctions to time-expressions like "last week": "UNTIL last WEEK, I thought she liked me".

K connects certain verbs with particles like "in", "out", "up" and the like: "He STOOD UP and WALKED OUT".

L connects certain determiners to superlative adjectives: "He has THE BIGGEST room".

LE is used in comparative constructions to connect an adjective to the second half of the comparative expression beyond a complement phrase: "It is more LIKELY that Joe will go THAN that Fred will go".

LI connects certain verbs to the preposition "like": "I FEEL LIKE a fool."

M connects nouns to various kinds of post-noun modifiers: prepositional phrases ("The MAN WITH the hat"), participle modifiers ("The WOMAN CARRYING the box"), prepositional relatives ("The MAN TO whom I was speaking"), and other kinds.

MF is used in the expression "Many people were injured, SOME OF THEM children".

MG allows certain prepositions to modify proper nouns: "The EMIR OF Kuwait is here".

MV connects verbs and adjectives to modifying phrases that follow, like adverbs ("The dog RAN QUICKLY"), prepositional phrases ("The dog RAN IN the yard"), subordinating conjunctions ("He LEFT WHEN he saw me"), comparatives, participle phrases with commas, and other things.

MX connects modifying phrases with commas to preceding nouns: "The DOG, a POODLE, was black". "JOHN, IN a black suit, looked great".

N connects the word "not" to preceding auxiliaries: "He DID NOT go".

ND connects numbers with expressions that require numerical determiners: "I saw him THREE WEEKS ago".

NF is used with NJ in idiomatic number expressions involving "of": "He lives two THIRDS OF a mile from here".

NI is used in a few special idiomatic number phrases: "I have BETWEEN 5 AND 20 dogs".

NJ is used with NF in idiomatic number expressions involving "of": "He lives two thirds OF a MILE from here".

NN connects number words together in series: "FOUR HUNDRED THOUSAND people live here".

NO is used on words which have no normal linkage requirement, but need to be included in the dictionary, such as "um" and "ah".

NR connects fraction words with superlatives: "It is the THIRD BIGGEST city in China".

NS connects singular numbers (one, 1, a) to idiomatic expressions requiring number determiners: "I saw him ONE WEEK ago".

NT connects "not" to "to": "I told you NOT TO come".

NW is used in idiomatic fraction expressions: "TWO THIRDS of the students were women".

O connects transitive verbs to their objects, direct or indirect: "She SAW ME", "I GAVE HIM the BOOK".

OD is used for verbs like "rise" and "fall" which can take expressions of distance as complements: "It FELL five FEET".

OF connects certain verbs and adjectives to the word "of": "She ACCUSED him OF the crime", "I'm PROUD OF you".

ON connects the word "on" to dates or days of the week in time expressions: "We saw her again ON TUESDAY".

OT is used for verbs like "last" which can take time expressions as objects: "It LASTED five HOURS".

OX is an object connector, analogous to SF, used for special "filler" words like "it" and "there" when used as objects: "That MAKES IT unlikely that she will come".

P connects forms of the verb "be" to various words that can be its complements: prepositions, adjectives, and passive and progressive participles: "He WAS [ ANGRY / IN the yard / CHOSEN / RUNNING ]".

PF is used in certain questions with "be", when the complement need of "be" is satisfied by a preceding question word: "WHERE are you?", "WHEN will it BE?"

PP connects forms of "have" with past participles: "He HAS GONE".

Q is used in questions. It connects the wall to the auxiliary in simple yes-no questions ("///// DID you go?"); it connects the question word to the auxiliary in where-when-how questions ("WHERE DID you go").

QI connects certain verbs and adjectives to question-words, forming indirect questions: "He WONDERED WHAT she would say".

R connects nouns to relative clauses. In subject-type relatives, it connects to the relative pronoun ("The DOG WHO chased me was black"); in object-type relatives, it connects either to the relative pronoun or to the subject of the relative clause ("The DOG THAT we chased was black", "The DOG WE chased was black").

RS is used in subject-type relative clauses to connect the relative pronoun to the verb: "The dog WHO CHASED me was black".

RW connects the right-wall to the left-wall in cases where the right-wall is not needed for punctuation purposes.

S connects subject nouns to finite verbs: "The DOG CHASED the cat", "The DOG [ IS chasing / HAS chased / WILL chase ] the cat".

SF is a special connector used to connect "filler" subjects like "it" and "there" to finite verbs: "THERE IS a problem", "IT IS likely that he will go".

SFI connects "filler" subjects like "it" and "there" to verbs in cases with subject-verb inversion: "IS THERE a problem?", "IS IT likely that he will go?"

SI connects subject nouns to finite verbs in cases of subject-verb inversion: "IS JOHN coming?", "Who DID HE see?"

SX connects "I" to special first-person verbs like "was" and "am".

SXI connects "I" to first-person verbs in cases of subject-verb inversion.

TA is used to connect adjectives like "late" to month names: "We did it in LATE DECEMBER".

TD connects day-of-the-week words to time expressions like "morning": "We'll do it MONDAY MORNING".

TH connects words that take "that [clause]" complements with the word "that". These include verbs ("She TOLD him THAT..."), nouns ("The IDEA THAT..."), and adjectives ("We are CERTAIN THAT").

TI is used for titles like "president", which can be used in certain circumstances without a determiner: "AS PRESIDENT of the company, it is my decision".

TM is used to connect month names to day numbers: "It happened on JANUARY 21".

TO connects verbs and adjectives which take infinitival complements to the word "to": "We TRIED TO start the car", "We are EAGER TO do it".

TQ is the determiner connector for time expressions acting as fronted objects: "How MANY YEARS did it last".

TS connects certain verbs that can take subjunctive clauses as complements ("suggest", "require") to the word "that": "We SUGGESTED THAT he go".

TW connects days of the week to dates in time expressions: "The meeting will be on MONDAY, JANUARY 21".

TY is used for certain idiomatic usages of year numbers: "I saw him on January 21, 1990". (In this case it connects the day number to the year number.)

U is a special connector on nouns, which is disjoined with both the determiner and subject-object connectors. It is used in idiomatic expressions like "What KIND_OF DOG did you buy?"

UN connects the words "until" and "since" to certain time phrases like "after [clause]": "You should wait UNTIL AFTER you talk to me".

V connects various verbs to idiomatic expressions that may be non-adjacent: "We TOOK him FOR_GRANTED", "We HELD her RESPONSIBLE".

W connects the subjects of main clauses to the wall, in ordinary declaratives, imperatives, and most questions (except yes-no questions). It also connects coordinating conjunctions to following clauses: "We left BUT SHE stayed".

WN connects the word "when" to time nouns like "year": "The YEAR WHEN we lived in England was wonderful".

WR connects the word "where" to a few verbs like "put" in questions like "WHERE did you PUT it?".

X is used with punctuation, to connect punctuation symbols either to words or to each other. For example, in this case, POODLE connects to commas on either side: "The dog , a POODLE , was black."

Y is used in certain idiomatic time and place expressions, to connect quantity expressions to the head word of the expression: "He left three HOURS AGO", "She lives three MILES FROM the station".

YP connects plural noun forms ending in s to "'" in possessive constructions: "The STUDENTS ' rooms are large".

YS connects nouns to the possessive suffix "'s": "JOHN 'S dog is black".

Z connects the preposition "as" to certain verbs: "AS we EXPECTED, he was late".
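The list above can be used as a lookup table when reading parser output. The following is a minimal sketch, not part of Oonai: the names `LINK_TYPES`, `base_link_type` and `describe_link` are hypothetical, the table holds only a few abridged entries from the list, and it assumes that labels in parser output may carry lowercase subscripts after the uppercase link name (e.g. "Ss", "MVp"), which this section does not itself show.

```python
import re

# A small, abridged excerpt of the link-type list above (hypothetical table name).
LINK_TYPES = {
    "S": "connects subject nouns to finite verbs",
    "O": "connects transitive verbs to their objects, direct or indirect",
    "J": "connects prepositions to their objects",
    "MV": "connects verbs and adjectives to modifying phrases that follow",
    "PP": 'connects forms of "have" with past participles',
    "SF": 'connects "filler" subjects like "it" and "there" to finite verbs',
}

# Per the ID entry above: arbitrary four-letter names such as "IDBT".
ID_DESCRIPTION = "connects together words of idiomatic expressions"


def base_link_type(label):
    """Return the leading uppercase part of a label, or None if there is none.

    Assumption: anything after the uppercase prefix is a subscript.
    """
    match = re.match(r"[A-Z]+", label)
    return match.group(0) if match else None


def describe_link(label):
    """Look up an abridged description for a link label, or return None."""
    base = base_link_type(label)
    if base is None:
        return None
    if base in LINK_TYPES:
        return LINK_TYPES[base]
    if len(base) == 4 and base.startswith("ID"):
        return ID_DESCRIPTION
    return None


if __name__ == "__main__":
    for label in ("Ss", "MVp", "IDBT", "SF"):
        print(label, "->", describe_link(label))
```

Matching on the full uppercase prefix keeps longer names such as SF from being confused with S; entries missing from the excerpt simply return `None`.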

This document was published on 2019-06-26 02:49:44. (Last updated: 2019-07-18 14:28:48.)



