module lang::json::IO

Serialization of Rascal values to JSON format and deserialization back from JSON format to Rascal values.

Usage

import lang::json::IO;

Dependencies

import util::Maybe;
import Exception;

Description

The pairs asJSON/parseJSON and writeJSON/readJSON are both bi-directional transformations between serializable Rascal values (all except function instances) and JSON strings. asJSON and parseJSON work on str representations, while writeJSON and readJSON stream to/from files directly.

The basic principle of the bi-directional mapping is that constructors of algebraic data-types map one-to-one to JSON object notation, and vice versa. The other builtin Rascal data-structures are judiciously mapped to objects, arrays, strings, etc. The goal is that their representation is natural on the receiving end (e.g. TypeScript, JavaScript and Python code), without sacrificing the naturalness of the Rascal representation.

Examples

This example demonstrates serializing:

  • constructors without parameters as "enums"
  • constructors with both positional fields and keyword fields
  • datetime serialization
  • integer serialization
rascal>import lang::json::IO;
ok
rascal>data Size = xxs() | xs() | s() | m() | l() | xl() | xxl();
ok
rascal>data Person = person(str firstName, str lastName, datetime birth=$1977-05-17T06:00:00.000+00:00$, int height=0, Size size = m());
ok
rascal>example = person("Santa", "Class", height=175, size=xxl());
Person: person(
"Santa",
"Class",
size=xxl(),
height=175)
rascal>jsonExample = asJSON(example, dateTimeFormat="YYYY-MM-DD");
str: "{\"firstName\":\"Santa\",\"lastName\":\"Class\",\"size\":\"xxl\",\"height\":175}"
───
{"firstName":"Santa","lastName":"Class","size":"xxl","height":175}
───

On the way back we can also track origins for constructors:

rascal>parseJSON(#Person, jsonExample, trackOrigins=true)
Person: person(
"Santa",
"Class",
size=xxl(),
src=|unknown:///|(0,66,<1,0>,<1,66>),
height=175)

Benefits

  • Using the expected type arguments of parseJSON and readJSON, the homonyms created by asJSON and writeJSON can be converted back to their original Rascal structures. If the expected type contains only concrete types and no abstract types, then the pairs asJSON/parseJSON and writeJSON/readJSON are isomorphic.
    • The abstract types are value, node, num or any composite type that contains one of them. Maybe[value] is an example of an abstract type.
    • The concrete types are all types which are not abstract types. Maybe[int] is an example of a concrete type.
    • Run-time values always have concrete types, while variables in code often have abstract types.
  • If you provide value or node as an expected type, you will always get a useful representation on the Rascal side. It is not guaranteed to be the same representation as before.
  • readJSON and parseJSON can read numbers beyond the bounds of normal int, long and double. As long as the number is syntactically correct, they will bind to the right Rascal number representation.
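
The difference between a concrete and an abstract expected type can be sketched as follows. This is only an illustration built from asJSON and parseJSON as documented in this module; the module name, the Point data type and the function names are made up for the example. No result is shown for the value case, because its shape is not guaranteed.

```rascal
module RoundTripSketch // hypothetical module name

import lang::json::IO;

// Hypothetical data type, used only for this illustration.
data Point = point(int x, int y);

// With a concrete expected type (#Point), the round trip is expected
// to restore a value equal to the original.
bool roundTripsWithConcreteType() {
  Point original = point(1, 2);
  str json = asJSON(original);
  return parseJSON(#Point, json) == original;
}

// With an abstract expected type (#value), the reader has to pick a generic
// representation itself. Per the text above it is useful, but it is not
// guaranteed to be equal to the original Point.
value readAsValue() = parseJSON(#value, asJSON(point(1, 2)));
```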

Pitfalls

  • asJSON and writeJSON are not isomorphisms. They are homomorphisms that choose JSON arrays or JSON objects for multiple different kinds of Rascal values. For example maps, nodes and ADTs are all mapped to JSON object notation (homonyms).
  • JSON is a serialization format that does not deal with programming language numerical encodings such as double, float or long. writeJSON and asJSON might write numbers beyond the limits and beyond the accuracy of what the other programming language can deal with. You can expect errors at the time when the other language (JavaScript, Python, TypeScript) reads these numbers from JSON.

data RuntimeException

JSON parse errors have more information than general parse errors

data RuntimeException (str cause="", str path="")
  • location is the place where the parsing got stuck (going from left to right).
  • cause is a factual diagnosis of what was expected at that position, versus what was found.
  • path is a path query string into the JSON value from the root down to the leaf where the error was detected.
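
A minimal sketch of how these extra fields might be inspected, assuming that the JSON reader reports malformed input by throwing a RuntimeException as described above. The module name and the helper reportJSONError are hypothetical. Only the cause and path keyword fields are read, because they are declared for every RuntimeException and fall back to "" when not set.

```rascal
module JSONErrorSketch // hypothetical module name

import lang::json::IO;
import Exception;
import IO;

// Hypothetical helper: tries to parse the input as JSON and prints
// the extra diagnostics if the reader throws.
void reportJSONError(str input) {
  try {
    println(parseJSON(#value, input));
  }
  catch RuntimeException e: {
    // cause and path are the keyword fields declared above;
    // both default to "" if the thrown exception did not set them.
    println("cause: <e.cause>");
    println("path:  <e.path>");
  }
}
```

A call such as reportJSONError("{\"a\": [1, 2") passes in an unterminated array, which is the kind of input the diagnostics are meant for.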

function readJSON

Reads JSON values from a stream

&T readJSON(
type[&T] expected,
loc src,
str dateTimeFormat = DEFAULT_DATETIME_FORMAT,
bool lenient=false,
bool trackOrigins=false,
JSONParser[value] parser = (type[value] _, str _) { throw ""; },
map [type[value] forType, value nullValue] nulls = defaultJSONNULLValues,
bool explicitConstructorNames = false,
bool explicitDataTypes = false
)

In general the translation behaves as follows:

  • Objects translate to map[str,value] by default, unless a node is expected (properties are then translated to keyword fields)
  • Arrays translate to lists by default, or to a set if that is expected or a tuple if that is expected. Arrays may also be interpreted as constructors or nodes (see below)
  • Booleans translate to bools
  • If the expected type provided is a datetime, then an int is mapped as an instant, and a string is parsed as a date-time string using the format configured by the dateTimeFormat parameter
  • If the expected type provided is an ADT then this reader will try to "parse" each object as a constructor for that ADT. It helps if there is only one constructor for that ADT. Positional parameters will be mapped by name as well as keyword parameters.
  • If the expected type provided is a node then it will construct a node named "object" and map the fields to keyword fields.
  • If num, int, real or rat are expected, both string and number values are mapped
  • If loc is expected, then strings which look like a URI (containing :/) are parsed, or else a file:/// URI is built; if an object is found, each separate field of a location object is read from the respective properties: { scheme : str, authority: str?, path: str?, fragment: str?, query: str?, offset: int, length: int, begin: [bl, bc], end: [el, ec]}
  • Go to JSONParser to find out how to use the optional parser parameter.
  • If the parser finds a null JSON value, it will look up in the nulls map which value to return, based on the currently expected type, or throw an exception otherwise. First the expected type is used as a literal lookup, and then each entry is tested to see whether the current type is a subtype of it.
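
Some of the rules above can be sketched with parseJSON, which is documented as behaving the same as readJSON but takes a string instead of a location. The module and function names are hypothetical, and results are deliberately not shown. All inputs are wrapped in an array or object so the sketch does not depend on how top-level scalars are handled.

```rascal
module TranslationSketch // hypothetical module name

import lang::json::IO;

// The same kind of JSON array, read as a list, a set or a tuple,
// depending on the expected type.
list[int] readAsList() = parseJSON(#list[int], "[1, 2, 3]");
set[int] readAsSet() = parseJSON(#set[int], "[1, 2, 2]");
tuple[int, str] readAsTuple() = parseJSON(#tuple[int, str], "[1, \"one\"]");

// A JSON object read as a map, the default for objects.
map[str, value] readAsMap() = parseJSON(#map[str, value], "{\"a\": 1, \"b\": true}");

// When a numeric type is expected, both JSON numbers and JSON strings
// are documented to be mapped.
list[int] readNumbers() = parseJSON(#list[int], "[42, \"42\"]");
```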

function parseJSON

Parses JSON values from a string. In general the translation behaves the same as for readJSON.

&T parseJSON(
type[&T] expected,
str src,
str dateTimeFormat = DEFAULT_DATETIME_FORMAT,
bool lenient=false,
bool trackOrigins=false,
JSONParser[value] parser = (type[value] _, str _) { throw ""; },
map[type[value] forType, value nullValue] nulls = defaultJSONNULLValues,
bool explicitConstructorNames = false,
bool explicitDataTypes = false
)

function writeJSON

Serializes a value as a JSON string and streams it

void writeJSON(loc target, value val, 
bool unpackedLocations=false,
str dateTimeFormat=DEFAULT_DATETIME_FORMAT,
bool dateTimeAsInt=false,
int indent=0,
bool dropOrigins=true,
JSONFormatter[value] formatter = str (value _) { fail; },
bool explicitConstructorNames=false,
bool explicitDataTypes=false
)

This function tries to map Rascal values to JSON values in a natural way. In particular it tries to create a value that has the same number of recursive levels, such that one constructor maps to one object. The serialization is typically lossy since JSON values by default do not explicitly encode the class or constructor while Rascal data types do.

If you need the names of constructors or data-types in your result, then use the parameters:

  • explicitConstructorNames=true will store the name of every constructor in a field _constructor
  • explicitDataTypes=true will store the name of the ADT in a field called _type
  • Check out JSONFormatter on how to use the formatter parameter
  • The dateTimeFormat parameter dictates how datetime values will be printed.
  • The unpackedLocations parameter will produce an object with many fields for every property of a loc value, but if set to false a loc will be printed as a string.
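
A minimal sketch of how these parameters might be combined, using only the writeJSON and readJSON signatures listed in this module. The module name, the Shape data type and both helper functions are hypothetical. Passing the same two flags to the reader is an assumption about how the names are meant to be consumed.

```rascal
module WriteJSONSketch // hypothetical module name

import lang::json::IO;

// Hypothetical data type, used only for this illustration.
data Shape = circle(int radius) | square(int side);

// Writes a Shape to the given location, asking the writer to record
// constructor and data type names and to indent the output.
void writeShape(loc target, Shape shape) {
  writeJSON(target, shape,
    explicitConstructorNames=true,
    explicitDataTypes=true,
    indent=2);
}

// Reads the Shape back, passing the same two flags to the reader
// (an assumption, see the lead-in).
Shape readShape(loc source)
  = readJSON(#Shape, source, explicitConstructorNames=true, explicitDataTypes=true);
```

Recording the names is mainly of interest for a type like Shape, where several constructors would otherwise map to objects that the receiving end has to tell apart by their fields.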

Pitfalls

  • It is understood that Rascal's number types have arbitrary precision, but this is not supported by the JSON writer. As such when an int is printed that does not fit into a JVM long, there will be truncation to the lower 64 bits. For real numbers that are larger than JVM's double you get "negative infinity" or "positive infinity" as a result.

function asJSON

Serializes a value as a JSON string and stores it as a string

Does what writeJSON does but serializes to a string instead of a location target.

str asJSON(value val,
bool unpackedLocations=false,
str dateTimeFormat=DEFAULT_DATETIME_FORMAT,
bool dateTimeAsInt=false,
int indent=0,
bool dropOrigins=true,
JSONFormatter[value] formatter = str (value _) { fail; },
bool explicitConstructorNames=false,
bool explicitDataTypes=false
)

This function uses writeJSON and stores the result in a string.

alias JSONFormatter

writeJSON and asJSON use JSONFormatter functions to flatten structured data to strings, on demand

str (&T)

A JSONFormatter can be passed to the writeJSON and asJSON functions. When/if the type matches an algebraic data-type to be serialized, then it is applied and the resulting string is serialized to the JSON stream instead of the structured data.

The goal of JSONFormatter and its dual JSONParser is to bridge the gap between string-based JSON encodings and typical Rascal algebraic combinators.

alias JSONParser

readJSON and parseJSON use JSONParser functions to turn unstructured data into structured data.

&T (type[&T], str)

A JSONParser can be passed to readJSON and parseJSON. When the reader expects an algebraic data-type or a syntax type, but the input at that moment is a JSON string, then the parser is called on that string (after string.trim()).

The resulting data constructor is put into the resulting value instead of a normal string.

The goal of JSONParser and its dual JSONFormatter is to bridge the gap between string-based JSON encodings and typical Rascal algebraic combinators.

Benefits

  • Use parsers to create more structure than JSON provides.

Pitfalls

  • The type[&T] argument is supplied dynamically by the JSON reader; it does not contain the grammar. It does encode the expected type of the parse result.
  • The expected types can only be data types, not syntax types.