module analysis::text::search::Grammars

rascal-0.41.2
org.rascalmpl.rascal-lucene-0.1.0

Bridges Rascal grammars and parser generation to the Lucene "Analyzer" and "Tokenizer" interfaces.

Usage

import analysis::text::search::Grammars;

Dependencies

extend analysis::text::search::Lucene;
import ParseTree;
import String;

Description

ParseTree instances carry enough information to extract tokens selectively from any source file for which a grammar is available: all tokens, only identifiers, or only comments.

This functionality is based on the Lucene module, and its underlying adapter that bridges Rascal callbacks to Lucene's search framework.

function analyzerFromGrammar

Analyzer analyzerFromGrammar(type[&T <: Tree] grammar)

function identifierAnalyzerFromGrammar

Analyzer identifierAnalyzerFromGrammar(type[&T <: Tree] grammar)

function commentAnalyzerFromGrammar

Analyzer commentAnalyzerFromGrammar(type[&T <: Tree] grammar)
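
The three analyzer functions above all take a reified grammar type. The following is a minimal sketch of how they might be called, not an excerpt from the library. The module name, the toy grammar (`Program`, `Statement`, `Id`, `Comment`) and the wrapper function names are hypothetical, and tagging the comment lexical with `@category="Comment"` is an assumption about how comments are recognized.

```rascal
module AnalyzersDemo

import analysis::text::search::Grammars;
import ParseTree;

// Hypothetical toy language: one keyword, identifiers and line comments.
keyword Reserved = "print";
lexical Id = ([a-z] !<< [a-z]+ !>> [a-z]) \ Reserved;

// Assumption: comments are found through the conventional category tag.
lexical Comment = @category="Comment" "//" ![\n]* $;
lexical WhitespaceOrComment = [\ \t\n\r] | Comment;
layout Layout = WhitespaceOrComment* !>> [\ \t\n\r] !>> "//";

start syntax Program = Statement*;
syntax Statement = "print" Id ";";

// One analyzer per view of the same source text.
Analyzer allTokens()   = analyzerFromGrammar(#start[Program]);
Analyzer identifiers() = identifierAnalyzerFromGrammar(#start[Program]);
Analyzer comments()    = commentAnalyzerFromGrammar(#start[Program]);
```

Each resulting value is an ordinary `Analyzer`, so it can be passed wherever the extended Lucene module expects one.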

function tokenizerFromGrammar

Use a generated parser as a Lucene tokenizer, skipping nothing.

Tokenizer tokenizerFromGrammar(type[&T <: Tree] grammar)

function identifierTokenizerFromGrammar

Use a generated parser as a Lucene tokenizer, and skip all keywords and punctuation.

Tokenizer identifierTokenizerFromGrammar(type[&T <: Tree] grammar)

function commentTokenizerFromGrammar

Use a generated parser as a Lucene tokenizer, and keep only comments.

Tokenizer commentTokenizerFromGrammar(type[&T <: Tree] grammar)
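
When the ready-made analyzers do not fit, a tokenizer can in principle be wrapped by hand. The sketch below assumes that the extended Lucene module provides an `analyzer(Tokenizer, list[Filter])` constructor; that constructor is not shown on this page, so treat it as an assumption. The module name, the grammar `Words` and the function name `unfilteredIdentifiers` are hypothetical.

```rascal
module TokenizerDemo

import analysis::text::search::Grammars;
import ParseTree;

// Hypothetical toy grammar: words separated by whitespace.
lexical Id = [a-z] !<< [a-z]+ !>> [a-z];
layout Spaces = [\ \t\n]* !>> [\ \t\n];
start syntax Words = Id+;

// Assumption: `analyzer(Tokenizer, list[Filter])` comes from the extended
// Lucene module. The empty list means no filters run after tokenization.
Analyzer unfilteredIdentifiers()
  = analyzer(identifierTokenizerFromGrammar(#start[Words]), []);
```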

function tokens

list[Tree] tokens(amb({Tree x, *_}), bool(Tree) isTokenPredicate)

default list[Tree] tokens(Tree tok, bool(Tree) isTokenPredicate)
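
The signatures above suggest that `tokens` walks a parse tree and collects the subtrees accepted by the predicate; that reading is an assumption based on the parameter names. Under that assumption, a sketch of calling it with a predicate from this module and with a custom one could look as follows. The module name, the grammar `Words` and the helpers `tokenTexts`, `isId`, `ids` and `lexicals` are hypothetical.

```rascal
module TokensDemo

import analysis::text::search::Grammars;
import ParseTree;

// Hypothetical toy grammar: words separated by spaces.
lexical Id = [a-z] !<< [a-z]+ !>> [a-z];
layout Spaces = [\ ]* !>> [\ ];
start syntax Words = Id+;

// Parse the input and return the source text of every collected subtree.
list[str] tokenTexts(str input, bool(Tree) predicate)
  = ["<t>" | Tree t <- tokens(parse(#start[Words], input), predicate)];

// Hypothetical custom predicate: accept only trees produced for `Id`.
bool isId(Tree t) = appl(prod(lex("Id"), _, _), _) := t;

list[str] ids(str input)      = tokenTexts(input, isId);
list[str] lexicals(str input) = tokenTexts(input, isLexical);
```

Passing the predicate as a parameter keeps the tree traversal separate from the decision of what counts as a token.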

function isTokenType

bool isTokenType(lit(_))

bool isTokenType(cilit(_))

bool isTokenType(lex(_))

bool isTokenType(layouts(_))

bool isTokenType(label(str _, Symbol s))

default bool isTokenType(Symbol _)
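
The overloads above dispatch on the shape of a `Symbol`, with a `default` alternative as the fallback. The bodies are not shown on this page, so the following is only a sketch of the same pattern applied to a hypothetical classifier, `isLiteralSymbol`, and not the module's implementation.

```rascal
module SymbolDispatchDemo

import ParseTree;

// Hypothetical classifier written in the same overloaded style:
// one alternative per Symbol constructor of interest.
bool isLiteralSymbol(lit(_)) = true;
bool isLiteralSymbol(cilit(_)) = true;

// Labels wrap another symbol, so unwrap and classify what is inside.
bool isLiteralSymbol(label(str _, Symbol s)) = isLiteralSymbol(s);

// Fallback, tried only when no other alternative matches.
default bool isLiteralSymbol(Symbol _) = false;

test bool labelledLiteral()  = isLiteralSymbol(label("kw", lit("if")));
test bool sortIsNotLiteral() = !isLiteralSymbol(sort("Expr"));
```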

function isToken

bool isToken(appl(prod(Symbol s, _, _), _))

bool isToken(char(_))

default bool isToken(Tree _)

function isLexical

bool isLexical(appl(prod(Symbol s, _, _), _))

default bool isLexical(Tree _)

function isComment

bool isComment(Tree t)

default bool isComment(Tree _)