Computing (FOLDOC) dictionary
stemmer
<information science, human language> A program or algorithm
which determines the morphological root of a given inflected
(or, sometimes, derived) word form -- generally a written word.
A stemmer for English, for example, should identify the
string "cats" (and possibly "catlike", "catty" etc.) as
based on the root "cat", and "stemmer", "stemming", "stemmed"
as based on "stem".
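As a rough illustration of the idea, here is a minimal suffix-stripping sketch in Python. The suffix list, the minimum stem length, and the name naive_stem are all assumptions made for this example; it is not a published stemming algorithm and covers only the handful of words mentioned above.

```python
# Hypothetical suffix list for illustration only; checked in this order.
SUFFIXES = ("ing", "ed", "er", "s")


def naive_stem(word: str) -> str:
    """Strip one common English suffix and undo consonant doubling."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Only strip if at least three letters would remain (assumed threshold).
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            stem = word[: -len(suffix)]
            # Undo doubling such as "stemm" -> "stem"; skip vowels and l/s,
            # which are often legitimately doubled at the end of a root.
            if len(stem) >= 2 and stem[-1] == stem[-2] and stem[-1] not in "aeiouls":
                stem = stem[:-1]
            return stem
    return word


for w in ("cats", "stemmer", "stemming", "stemmed"):
    print(w, "->", naive_stem(w))
```

A rule set this small handles the examples above but nothing irregular, which is why practical stemmers add many more rules and exceptions.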
English stemmers are fairly trivial (with only occasional
problems, such as "dries" being the third-person singular
present form of the verb "dry", or "axes" being the plural of
both "ax" and "axis"); but stemmers become harder to design
as the morphology, orthography, and character encoding of
the target language become more complex. For example, an
Italian stemmer is more complex than an English one (because
of more possible verb inflections), a Russian one is more
complex (more possible noun declensions), a Hebrew one is even
more complex (a hairy writing system), and so on.
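The English problem cases mentioned above ("dries", "axes") can be sketched with an exception table consulted before any general rule. The table contents and the function name candidate_roots are hypothetical, and returning a set of roots is just one possible way to represent an ambiguous form such as "axes".

```python
# Hypothetical exception table: irregular or ambiguous forms mapped to
# every plausible root. A real stemmer would need a far larger list.
EXCEPTIONS = {
    "dries": {"dry"},
    "axes": {"ax", "axis"},
}


def candidate_roots(word: str) -> set:
    """Return all plausible roots for a word, checking exceptions first."""
    word = word.lower()
    if word in EXCEPTIONS:
        return set(EXCEPTIONS[word])
    # Fallback rule (an assumption for this sketch): drop a plural "s",
    # but leave words ending in "ss" and very short words alone.
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return {word[:-1]}
    return {word}


print(sorted(candidate_roots("axes")))   # ['ax', 'axis']
print(sorted(candidate_roots("dries")))  # ['dry']
```

Languages with richer inflection would need many more rules and exceptions than this, which is one way to see why their stemmers are harder to build.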
Stemmers are common elements in query systems, since a user
who runs a query on "daffodils" probably cares about documents
that contain the word "daffodil" (without the s).
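The query use case can be sketched as follows: index documents by stemmed token, then stem the query term before looking it up. The stem function here is a deliberately tiny stand-in that only removes a plural "s", and build_index, search, and the sample documents are hypothetical names and data invented for the example.

```python
from collections import defaultdict


def stem(word: str) -> str:
    """Toy stemmer (assumption): lowercase and drop a trailing plural 's'."""
    word = word.lower()
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word


def build_index(documents: dict) -> dict:
    """Map each stemmed token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in text.split():
            # Strip surrounding punctuation before stemming.
            index[stem(token.strip('.,;:!?"()'))].add(doc_id)
    return index


def search(index: dict, query: str) -> set:
    """Look up a single query word by its stem."""
    return index.get(stem(query), set())


docs = {
    1: "A daffodil blooms in spring.",
    2: "Daffodils line the path.",
    3: "Tulips bloom later.",
}
index = build_index(docs)
print(sorted(search(index, "daffodils")))  # [1, 2]
```

Because both the index and the query pass through the same stemmer, the query "daffodils" matches the document that only contains "daffodil".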