module analysis::text::stemming::Snowball

rascal-0.34.0

Provides, as a Rascal module, the library of stemmers that are written in the Snowball language, compiled to Java, and distributed with Lucene.

Usage

import analysis::text::stemming::Snowball;

See the Snowball homepage for more information.

Examples

rascal>import analysis::text::stemming::Snowball;
ok
rascal>stem("bikes")
str: "bike"
---
bike
---

data Language

data Language  
= armenian()
| basque()
| catalan()
| danish()
| dutch()
| english()
| finnish()
| french()
| german()
| german2()
| hungarian()
| irish()
| italian()
| lithuanian()
| norwegian()
| portugese()
| romanian()
| russian()
| spanish()
| swedish()
| turkish()
;

function stem

Stemming algorithms for different languages from the Tartarus Snowball project (see the Snowball homepage).

str stem(str word, Language lang=english())

This library, wrapped into a single function, supports Armenian, Basque, Catalan, Danish, Dutch, English, Finnish, French, German, Hungarian, Irish, Italian, Lithuanian, Norwegian, Portuguese, Romanian, Russian, Spanish, Swedish and Turkish.
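As a minimal sketch of how the `lang` keyword parameter selects a stemmer, based only on the signature and the `Language` constructors listed above: the input words are arbitrary illustrations, and no output is shown because the resulting stems depend on each Snowball algorithm.

```rascal
import analysis::text::stemming::Snowball;

// The default language is english(), so these two calls are equivalent.
stem("bikes");
stem("bikes", lang=english());

// Select a different stemmer through the lang keyword parameter.
stem("fietsen", lang=dutch());

// Stem a list of words with a list comprehension (illustrative input).
[stem(w, lang=french()) | w <- ["maisons", "chevaux"]];
```

Note that the constructor name must be written exactly as declared in `Language`, for example `portugese()`.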

function kraaijPohlmannStemmer

Kraaij-Pohlmann is a well-known stemmer for the Dutch language.

str kraaijPohlmannStemmer(str word)

See http://snowball.tartarus.org/algorithms/kraaij_pohlmann/stemmer.html for more information.

function porterStemmer

Porter stemming is, in a sense, the "standard" stemming algorithm for English.

str porterStemmer(str word)

See http://snowball.tartarus.org/algorithms/porter/stemmer.html for more information.

function lovinsStemmer

According to the Tartarus website, Lovins designed the first stemmer.

str lovinsStemmer(str word)

See http://snowball.tartarus.org/algorithms/lovins/stemmer.html for more information.