TreeTagger

From Christoph's Personal Wiki

Revision as of 04:02, 28 May 2007

The TreeTagger is a tool for annotating text with part-of-speech and lemma information. It was developed within the TC project at the Institute for Computational Linguistics of the University of Stuttgart. The TreeTagger has been used to tag German, English, French, Italian, Dutch, Spanish, Bulgarian, Russian, Greek, Portuguese, and Old French texts, and it can be adapted to other languages if a lexicon and a manually tagged training corpus are available.

Example Usage

% echo 'The three big red dogs.' | cmd/tree-tagger-english
          reading parameters ...
          tagging ...
  The     DT      the
  three   CD      three
  big     JJ      big
  red     JJ      red
  dogs    NNS     dog
  .       SENT    .
           finished.
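
The output above has one token per line in three columns (token, tag, lemma), surrounded by status messages. As a minimal sketch, the following Python reads such output into tuples. The names `TaggedToken` and `parse_tagger_output` are hypothetical, and the sketch assumes the columns are separated by tab characters while the status lines contain no tabs.

```python
from typing import List, NamedTuple


class TaggedToken(NamedTuple):
    token: str
    tag: str
    lemma: str


def parse_tagger_output(text: str) -> List[TaggedToken]:
    """Collect (token, tag, lemma) rows from tab-separated tagger output."""
    rows = []
    for line in text.splitlines():
        fields = line.strip().split("\t")
        # Assumption: status lines such as "reading parameters ..." contain
        # no tabs, so they never split into exactly three fields.
        if len(fields) == 3:
            rows.append(TaggedToken(*(field.strip() for field in fields)))
    return rows


# Sample text modelled on the session above, with tabs between columns.
sample = (
    "        reading parameters ...\n"
    "        tagging ...\n"
    "The\tDT\tthe\n"
    "three\tCD\tthree\n"
    "big\tJJ\tbig\n"
    "red\tJJ\tred\n"
    "dogs\tNNS\tdog\n"
    ".\tSENT\t.\n"
    "         finished.\n"
)

for row in parse_tagger_output(sample):
    print(row.token, row.tag, row.lemma)
```

Requiring exactly three tab-separated fields is a simple way to drop the status lines without matching their wording.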

Part-of-speech tags (POST) used

  • . sentence closer (. ; ? *)
  • ( left paren
  • ) right paren
  • * not, n't
  • -- dash
  • , comma
  • : colon
  • ABL pre-qualifier (quite, rather)
  • ABN pre-quantifier (half, all)
  • ABX pre-quantifier (both)
  • AP post-determiner (many, several, next)
  • AT article (a, the, no)
  • BE be
  • BED were
  • BEDZ was
  • BEG being
  • BEM am
  • BEN been
  • BER are, art
  • BEZ is
  • CC coordinating conjunction (and, or)
  • CD cardinal numeral (one, two, 2, etc.)
  • CS subordinating conjunction (if, although)
  • DO do
  • DOD did
  • DOZ does
  • DT singular determiner/quantifier (this, that)
  • DTI singular or plural determiner/quantifier (some, any)
  • DTS plural determiner (these, those)
  • DTX determiner/double conjunction (either)
  • EX existential there
  • FW foreign word (hyphenated before regular tag)
  • HV have
  • HVD had (past tense)
  • HVG having
  • HVN had (past participle)
  • IN preposition
  • JJ adjective
  • JJR comparative adjective
  • JJS semantically superlative adjective (chief, top)
  • JJT morphologically superlative adjective (biggest)
  • MD modal auxiliary (can, should, will)
  • NC cited word (hyphenated after regular tag)
  • NN singular or mass noun
  • NN$ possessive singular noun
  • NNS plural noun
  • NNS$ possessive plural noun
  • NP proper noun or part of name phrase
  • NP$ possessive proper noun
  • NPS$ possessive plural proper noun
  • NR adverbial noun (home, today, west)
  • OD ordinal numeral (first, 2nd)
  • PN nominal pronoun (everybody, nothing)
  • PN$ possessive nominal pronoun
  • PP$ possessive personal pronoun (my, our)
  • PP$$ second (nominal) possessive pronoun (mine, ours)
  • PPL singular reflexive/intensive personal pronoun (myself)
  • PPLS plural reflexive/intensive personal pronoun (ourselves)
  • PPO objective personal pronoun (me, him, it, them)
  • PPS 3rd. singular nominative pronoun (he, she, it, one)
  • PPSS other nominative personal pronoun (I, we, they, you)
  • QL qualifier (very, fairly)
  • QLP post-qualifier (enough, indeed)
  • RB adverb
  • RBR comparative adverb
  • RBT superlative adverb
  • RN nominal adverb (here, then, indoors)
  • RP adverb/particle (about, off, up)
  • TO infinitive marker to
  • UH interjection, exclamation
  • VB verb, base form
  • VBD verb, past tense
  • VBG verb, present participle/gerund
  • VBN verb, past participle
  • VBZ verb, 3rd. singular present
  • WDT wh- determiner (what, which)
  • WP$ possessive wh- pronoun (whose)
  • WPO objective wh- pronoun (whom, which, that)
  • WPS nominative wh- pronoun (who, which, that)
  • WQL wh- qualifier (how)
  • WRB wh- adverb (how, where, when)
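
The FW and NC entries in the list above describe affixes rather than standalone tags: FW is hyphenated before a regular tag and NC is hyphenated after one. A hedged Python sketch of splitting such a compound tag follows. The function name `split_compound_tag` and the example strings `FW-NN` and `NN-NC` are illustrative assumptions and are not taken from actual tagger output.

```python
from typing import Tuple


def split_compound_tag(tag: str) -> Tuple[str, bool, bool]:
    """Return (base_tag, is_foreign, is_cited) for a possibly affixed tag."""
    # FW is hyphenated *before* the regular tag, e.g. the assumed "FW-NN".
    is_foreign = tag.startswith("FW-")
    if is_foreign:
        tag = tag[len("FW-"):]
    # NC is hyphenated *after* the regular tag, e.g. the assumed "NN-NC".
    is_cited = tag.endswith("-NC")
    if is_cited:
        tag = tag[: -len("-NC")]
    return tag, is_foreign, is_cited


print(split_compound_tag("FW-NN"))
print(split_compound_tag("NN-NC"))
```

Checking for the full `FW-` prefix and `-NC` suffix leaves the plain dash tag `--` untouched.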

Modified POST

Tag   Description                              Examples
CC    Conjunction; coordinating                and, or
CD    Adjective; cardinal number               3, fifteen
DET   Determiner                               this, each, some
EX    Pronoun, existential there               there
FW    Foreign words
IN    Preposition / Conjunction                for, of, although, that
JJ    Adjective                                happy, bad
JJR   Adjective; comparative                   happier, worse
JJS   Adjective; superlative                   happiest, worst
LS    Symbol, list item                        A, A.
MD    Verb; modal                              can, could, 'll
NN    Noun                                     aircraft, data
NNP   Noun; proper                             London, Michael
NNPS  Noun, proper, plural                     Australians, Methodists
NNS   Noun; plural                             women, books
PDT   Determiner; prequalifier                 quite, all, half
POS   Possessive                               's, '
PRP   Determiner; possessive second            mine, yours
PRPS  Determiner; possessive                   their, your
RB    Adverb                                   often, not, very, here
RBR   Adverb; comparative                      faster
RBS   Adverb; superlative                      fastest
RP    Adverb; particle                         up, off, out
SYM   Symbol                                   *
TO    Preposition                              to
UH    Interjection                             oh, yes, mmm
VB    Verb; infinitive                         take, live
VBD   Verb; past tense                         took, lived
VBG   Verb; gerund                             taking, living
VBN   Verb; past/passive participle            taken, lived
VBP   Verb; base present form                  take, live
VBZ   Verb; present 3SG -s form                takes, lives
WDT   Determiner; question                     which, whatever
WP    Pronoun; question                        who, whoever
WPS   Determiner; possessive and question      whose
WRB   Adverb; question                         when, how, however

Punctuation
PP    Punctuation; sentence ender              ., !, ?
PPC   Punctuation; comma                       ,
PPD   Punctuation; dollar sign                 $
PPL   Punctuation; quotation mark left         ``
PPR   Punctuation; quotation mark right        ''
PPS   Punctuation; colon, semicolon, ellipsis  :, ..., -
LRB   Punctuation; left bracket                (, {, [
RRB   Punctuation; right bracket               ), }, ]
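
The modified tag table above can be held as a simple lookup. The sketch below copies a few of its rows into a Python dictionary and uses them to describe a tag. The names `MODIFIED_POST` and `describe` are hypothetical, and only a subset of the rows is included for brevity.

```python
from typing import Dict, Tuple

# (description, examples) for a few rows copied from the table above.
MODIFIED_POST: Dict[str, Tuple[str, str]] = {
    "CC": ("Conjunction; coordinating", "and, or"),
    "CD": ("Adjective; cardinal number", "3, fifteen"),
    "DET": ("Determiner", "this, each, some"),
    "FW": ("Foreign words", ""),
    "JJ": ("Adjective", "happy, bad"),
    "NNS": ("Noun; plural", "women, books"),
    "PP": ("Punctuation; sentence ender", "., !, ?"),
}


def describe(tag: str) -> str:
    """Return a one-line description of a modified POST tag."""
    description, examples = MODIFIED_POST.get(tag, ("unknown tag", ""))
    if examples:
        return f"{tag}: {description} ({examples})"
    return f"{tag}: {description}"


for tag in ("DET", "CD", "JJ", "NNS", "PP"):
    print(describe(tag))
```

Tags missing from the dictionary are reported as unknown rather than raising an error, which keeps the lookup usable while the table is only partly transcribed.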

See also

External links