The German chunker was trained on the Negra corpus and uses the
following chunk labels:
NC for noun chunks
PC for prepositional phrase chunks (NC + preposition/postposition)
VC for verb chunks
Verbs and verb complexes in verb-first, verb-second, and verb-final
position are all marked as verb chunks. If the "Mittelfeld" of the
sentence (the stretch between the finite verb and the clause-final verb
complex) is empty, as in the following example, the chunker
nevertheless prints two separate verb chunks.
Er PPER er
# finite verb in verb-second position
soll VMFIN sollen
# verb complex in verb-final position
geraucht VVPP rauchen
haben VAINF haben
wollen VMFIN wollen
. $. .
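The token lines in the example above can be read with a few lines of
Python. This is only a minimal sketch: it assumes that every token line
holds exactly three whitespace-separated columns (word, tag, lemma) and
that lines starting with "#" are annotations, as they are in this
example. The names Token and parse_tokens are made up for illustration
and are not part of the chunker.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    word: str
    tag: str
    lemma: str


def parse_tokens(text: str) -> List[Token]:
    """Collect (word, tag, lemma) triples, skipping '#' annotation lines."""
    tokens = []
    for line in text.splitlines():
        line = line.strip()
        # Assumption: '#' lines are annotations, as in the example above.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 3:
            continue  # not a word/tag/lemma line
        tokens.append(Token(*fields))
    return tokens


EXAMPLE = """\
Er PPER er
# finite verb in verb-second position
soll VMFIN sollen
# verb complex in verb-final position
geraucht VVPP rauchen
haben VAINF haben
wollen VMFIN wollen
. $. .
"""

tokens = parse_tokens(EXAMPLE)
# In this example, all verbal tags start with "V".
verbs = [t.word for t in tokens if t.tag.startswith("V")]
```

Here verbs holds the four verbal tokens of the sentence. The tags alone
do not show where the first verb chunk ends and the second begins; that
boundary comes from the chunker.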
A noun chunk is a non-recursive noun phrase. Recursive noun phrases
such as "die Königin von Schweden" are analysed as an NC plus a PC.
A noun phrase with an embedded noun phrase in pre-head position is
split into two NCs plus the embedded NC(s) and/or PC(s), as the
following example shows:
# first part of the matrix NP
Das ART d
# embedded NP
sich PRF er|es|sie
# embedded PP
in APPR in
die ART d
Länge NN Länge
# second part of the matrix NP
ziehende ADJA ziehend
Treffen NN Treffen
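The analysis of this example can be written down as a flat list of
labelled chunks. The sketch below is illustrative only: the labels
follow the description above (two NCs for the split matrix noun phrase,
one NC and one PC for the embedded phrases), and the names analysis,
labels, and surface are hypothetical, not chunker output or API.

```python
from typing import List, Tuple

Chunk = Tuple[str, List[str]]  # (chunk label, words in the chunk)

# Flat, non-recursive chunk sequence for the example above.
analysis: List[Chunk] = [
    ("NC", ["Das"]),                  # first part of the matrix NP
    ("NC", ["sich"]),                 # embedded NP
    ("PC", ["in", "die", "Länge"]),   # embedded PP
    ("NC", ["ziehende", "Treffen"]),  # second part of the matrix NP
]


def labels(chunks: List[Chunk]) -> List[str]:
    """Return the chunk labels in order."""
    return [label for label, _ in chunks]


def surface(chunks: List[Chunk]) -> str:
    """Join all chunk words back into the surface string."""
    return " ".join(word for _, words in chunks for word in words)
```

Because the chunks are non-recursive, the link between the first and
the last NC as parts of one noun phrase is not represented in the list
itself.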