Skip to main content

module Type

rascal-Not specified

Rascal's type system, implemented in Rascal itself.

Usage

import Type;

The goal of this module is to provide:

  • reflection capabilities that are useful for deserialization and validation of data, and
  • to provide the basic building blocks for syntax trees (see Parse Tree),
  • provide the implementations of Rascal's type lattice: Subtype, Glb, Lub and Intersects for reified type Symbols

The following definition is built into Rascal:

data type[&T] = type(Symbol symbol, map[Symbol, Production] definitions);

For values of type type[...] the static and dynamic type systems satisfy three additional constraints over the rules of type-parameterized data types:

  1. For any type T: #T has type type[T]
  2. For any type T and any value of type[T], namely type(S, D) it holds that S is the symbolic representation of type T using the Symbol type, and
  3. ... D holds all the necessary data and syntax rules required to form values of type T.

In other words, the # operator will always produce a value of type[&T], where &T is bound to the type that was reified and said value will contain the full grammatical definition for what was bound to &T.

The Subtype relation of Rascal has all the mathematical properties of a finite lattice; where Lub implements the join and Glb implements the meet operation. This is a core design principle of Rascal with the following benefits:

  • Type inference has a guaranteed least or greatest solution, always. This means that constraints are always solvable in an unambigous manner.
  • A principal type can always be computed, which is a most precise and unique solution of a type inference problem. Without the lattice, solution candidates could become incomparable and thus ambiguous. Without this principal type property, type inference is predictable for programmers.
  • Solving type inference constraints can be implemented efficiently. The algorithm, based on Lub and Glb, makes progress deterministically and does not require backtracking to find better solutions. Since the lattice is not very deep, fixed-point solutions are always found quickly.

Much of the aspects of the Subtype lattice are derived from the fact that Rascal's values are immutable or readonly. This typically allows for containers to be co-variant in their type parameters: list[int] <: list[value]. Because all values in Rascal are immutable, and implemented using persistent data-structures, its type literals do not feature annotations for co- and contra-variance. We can assume co-variance practically everywhere (even for function parameters).

Function types in Rascal are special because functions are always openly extensible, as are the modules they are contained in. This means that the parameter types of functions can also be extended, and thus they are co-variant (accepting more rather than fewer kinds of data). Contra-variance is still also allowed, of course, and so we say function parameter types are "variant" in both directions. Lub has been hard-wired to choose the more general (co-variant) solutions for function parameter types to reflect the general applicability of the openly extensible functions.

Examples

rascal>import Type;
ok
rascal>#int
type[int]: type(
int(),
())
rascal>#rel[int,int]
type[rel[int,int]]: type(
set(tuple([
int(),
int()
])),
())
rascal>data B = t();
ok
rascal>#B
type[B]: type(
adt(
"B",
[]),
(adt(
"B",
[]):choice(
adt(
"B",
[]),
{cons(
label(
"t",
adt(
"B",
[])),
[],
[],
{})})))
rascal>syntax A = "a";
ok
rascal>#A;
type[A]: type(
sort("A"),
(
layouts("$default$"):choice(
layouts("$default$"),
{prod(
layouts("$default$"),
[],
{})}),
empty():choice(
empty(),
{prod(
empty(),
[],
{})}),
sort("A"):choice(
sort("A"),
{prod(
sort("A"),
[lit("a")],
{})})
))
rascal>type(\int(),())
type[value]: type(
int(),
())

data Symbol

A Symbol represents a Rascal Type.

data Symbol  
= \int()
| \bool()
| \real()
| \rat()
| \str()
| \num()
| \node()
| \void()
| \value()
| \loc()
| \datetime()
;

Symbols are values that represent Rascal's types. These are the atomic types. We define here:

  • ❶ Atomic types.
  • ❷ Labels that are used to give names to symbols, such as field names, constructor names, etc.
  • ❸ Composite types.
  • ❹ Parameters that represent a type variable.

In Parse Tree, see Symbol, Symbols will be further extended with the symbols that may occur in a parse tree.

data Symbol

data Symbol  
= \label(str name, Symbol symbol)
;

data Symbol

data Symbol  
= \set(Symbol symbol)
| \tuple(list[Symbol] symbols)
| \list(Symbol symbol)
| \map(Symbol from, Symbol to)
| \adt(str name, list[Symbol] parameters)
| \cons(Symbol \adt, str name, list[Symbol] parameters)
| \alias(str name, list[Symbol] parameters, Symbol aliased)
| \func(Symbol ret, list[Symbol] parameters, list[Symbol] kwTypes)
| \reified(Symbol symbol)
;

data Symbol

data Symbol  
= \parameter(str name, Symbol bound)
;

function \rel

rel types are syntactic sugar for sets of tuples.

Symbol \rel(list[Symbol] symbols)

function \lrel

lrel types are syntactic sugar for lists of tuples.

Symbol \lrel(list[Symbol] symbols)

function overloaded

Overloaded/union types are always reduced to the least upperbound of their constituents.

Symbol overloaded(set[Symbol] alternatives)

This semantics of overloading in the type system is essential to make sure it remains a lattice.

data Production

A production in a grammar or constructor in a data type.

data Production  
= \cons(Symbol def, list[Symbol] symbols, list[Symbol] kwTypes, set[Attr] attributes)
| \choice(Symbol def, set[Production] alternatives)
| \composition(Production lhs, Production rhs)
;

Productions represent abstract (recursive) definitions of abstract data type constructors and functions:

  • cons: a constructor for an abstract data type.
  • func: a function.
  • choice: the choice between various alternatives.
  • composition: composition of two productions.

In ParseTree, see Production, Productions will be further extended and will be used to represent productions in syntax rules.

data Attr

Attributes register additional semantics annotations of a definition.

data Attr  
= \tag(value \tag)
;

function \var-func

Transform a function with varargs (...) to a normal function with a list argument.

Symbol \var-func(Symbol ret, list[Symbol] parameters, Symbol varArg)

function choice

Normalize the choice between alternative productions.

Production choice(Symbol s, set[Production] choices)

The following normalization rules canonicalize grammars to prevent arbitrary case distinctions later Nested choice is flattened.

The below code replaces the following code for performance reasons in compiled code:

Production choice(Symbol s, {*Production a, choice(Symbol t, set[Production] b)})
= choice(s, a+b);

function subtype

Subtype is the core implementation of Rascal's type lattice.

bool subtype(type[&T] t, type[&U] u)

The following graph depicts Rascal's type lattice for a number of example types, including:

  • all the builtin types, with value as maximum and void as the minimum of the lattice
  • parameterized container types (co-variant)
  • a function type example (variant in parameter positions, both directions)
  • a data type Exp
  • a reified type type[value] and type[int] (co-variant) image

Examples

rascal>import Type;
ok
rascal>subtype(#int, #value)
bool: true
rascal>subtype(#value, #int)
bool: false

function subtype

This function documents and implements the subtype relation of Rascal's type system.

bool subtype(Symbol s, Symbol t)

function intersects

Checks if two types have a non-empty intersection, meaning there exists values which types with both as a supertype

bool intersects(Symbol s, Symbol t)

bool intersects(type[&T] t, type[&U] u)

Consider tuple[int, value] and tuple[value, int] which are not subtypes of one another, but all the values in their intersection, tuple[int, int] belong to both types.

Type intersection is important for the type-checking of pattern matching, and since function parameters are patterns in Rascal, also for the type-checking of function parameters. Pattern matching between two types is possible as long as the intersection is non-empty. This is true if one is a sub-type of other, or vice verse: then the intersection is the subtype. However, the above tuple example also shows there can be non-empty intersections when the types are not sub-types.

function comparable

Check if two types are comparable, i.e., one is a subtype of the other or vice versa.

bool comparable(Symbol s, Symbol t)

bool comparable(type[value] s, type[value] t)

function equivalent

Check if two types are equivalent, i.e. they are both subtypes of each other.

bool equivalent(Symbol s, Symbol t)

bool equivalent(type[value] s, type[value] t)

function eq

Strict structural equality between values.

bool eq(value x, value y)

The difference between eq and == is that no implicit coercions are done between values of incomparable types at the top-level.

The == operator, for convience, equates 1.0 with 1 but not [1] with [1.0], which can be annoying when writing consistent specifications. The new number system that is coming up will not have these issues.

Examples

rascal>import Type;
ok
rascal>1 == 1.0
bool: true
rascal>eq(1,1.0)
bool: false

function lub

The least-upperbound (lub) of two types is the common ancestor in the type lattice that is lowest.

Symbol lub(Symbol s1, Symbol s2)

type[value] lub(type[&T] t, type[&T] u)

This function implements the lub operation in Rascal's type system, via unreifying the Symbol values and calling into the underlying run-time Type implementation.

Examples

rascal>import Type;
ok
rascal>lub(#tuple[int,value], #tuple[value, int])
type[value]: type(
tuple([
value(),
value()
]),
())
rascal>lub(#int, #real)
type[value]: type(
num(),
())

function glb

The greatest lower bound (glb) between two types, i.e. a common descendant of two types in the lattice which is largest.

Symbol glb(Symbol s1, Symbol s2)

type[value] glb(type[&T] t, type[&T] u)

This function implements the glb operation in Rascal's type system, via unreifying the Symbol values and calling into the underlying run-time Type implementation.

Examples

rascal>import Type;
ok
rascal>glb(#tuple[int,value], #tuple[value, int])
type[value]: type(
tuple([
int(),
int()
]),
())
rascal>glb(#int, #real)
type[value]: type(
void(),
())

data Exception

data Exception  
= typeCastException(Symbol from, type[value] to)
;

function typeCast

&T typeCast(type[&T] typ, value v)

function make

Dynamically instantiate an data constructor of a given type with the given children and optional keyword arguments.

&T make(type[&T] typ, str name, list[value] args)

&T make(type[&T] typ, str name, list[value] args, map[str,value] keywordArgs)

This function will build a constructor if the definition exists and the parameters fit its description, or throw an exception otherwise.

This function can be used to validate external data sources against a data type such as XML, JSON and YAML documents.

function typeOf

Returns the dynamic type of a value as a ((Type-Symbol)).

Symbol typeOf(value v)

As opposed to the # operator, which produces the type of a value statically, this function produces the dynamic type of a value, represented by a symbol.

Examples

rascal>import Type;
ok
rascal>value x = 1;
value: 1
rascal>typeOf(x)
Symbol: int()

Benefits

  • constructing a reified type from a dynamic type is possible:
rascal>type(typeOf(x), ())
type[value]: type(
int(),
())

Pitfalls

Note that the typeOf function does not produce definitions, like the reify operator # does, since values may escape the scope in which they've been constructed.