module lang::json::IO
Serialization of Rascal values to JSON format and deserialization back from JSON format to Rascal values.
Usage
import lang::json::IO;
Dependencies
import util::Maybe;
import Exception;
Description
The pairs asJSON/parseJSON and writeJSON/readJSON are both bi-directional
transformations between serializable Rascal values (all except function instances) and JSON strings.
asJSON and parseJSON work on str representations, while writeJSON and readJSON
stream to/from files directly.
The basic principle of the bi-directional mapping is that constructors of algebraic data-types map one-to-one to JSON object notation, and vice versa. The other built-in Rascal data structures are mapped to objects, arrays, strings, and so on. The goal is that their representation is natural on the receiving end (e.g. TypeScript, JavaScript and Python code), without sacrificing the naturalness of the Rascal representation.
Examples
This example demonstrates serializing:
- constructors without parameters as "enums"
- constructors with both positional fields and keyword fields
- datetime serialization
- integer serialization
rascal>import lang::json::IO;
ok
rascal>data Size = xxs() | xs() | s() | m() | l() | xl() | xxl();
ok
rascal>data Person = person(str firstName, str lastName, datetime birth=$1977-05-17T06:00:00.000+00:00$, int height=0, Size size = m());
ok
rascal>example = person("Santa", "Class", height=175, size=xxl());
Person: person(
"Santa",
"Class",
size=xxl(),
height=175)
rascal>jsonExample = asJSON(example, dateTimeFormat="YYYY-MM-DD");
str: "{\"firstName\":\"Santa\",\"lastName\":\"Class\",\"size\":\"xxl\",\"height\":175}"
───
{"firstName":"Santa","lastName":"Class","size":"xxl","height":175}
───
On the way back we can also track origins for constructors:
rascal>parseJSON(#Person, jsonExample, trackOrigins=true)
Person: person(
"Santa",
"Class",
size=xxl(),
src=|unknown:///|(0,66,<1,0>,<1,66>),
height=175)
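The file-based pair works along the same lines. The following is a minimal sketch, assuming the `tmp` location scheme is available for scratch files; the `Point` type, the function name and the file name are hypothetical and only serve the illustration.

```rascal
import lang::json::IO;
import IO;

// Hypothetical data type, used only for this sketch.
data Point = point(int x, int y);

void demoFileRoundTrip() {
    // Hypothetical scratch location; any writable loc should do.
    loc target = |tmp:///point.json|;

    // Stream the value to the file as JSON.
    writeJSON(target, point(1, 2));

    // Read it back, using the expected type to recover the constructor.
    Point p = readJSON(#Point, target);
    println(p);
}
```

Because `Point` is a concrete type, the round trip is expected to recover the original value, in line with the isomorphism described under Benefits.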
Benefits
- Using the `expected` type arguments of parseJSON and readJSON, the homonyms created by asJSON and writeJSON can be converted back to their original Rascal structures. If the expected type contains only concrete types and no abstract types, then the pairs asJSON/parseJSON and writeJSON/readJSON are isomorphic.
  - The abstract types are `value`, `node`, `num`, or any composite type that contains one of them. `Maybe[value]` is an example of an abstract type.
  - The concrete types are all types which are not abstract types. `Maybe[int]` is an example of a concrete type.
  - Run-time values always have concrete types, while variables in code often have abstract types.
- If you provide `value` or `node` as an expected type, you will always get a useful representation on the Rascal side. It is not guaranteed to be the same representation as before.
- readJSON and parseJSON can read numbers beyond the bounds of normal `int`, `long` and `double` values. As long as the number is syntactically correct, they will bind to the right Rascal number representation.
Pitfalls
- asJSON and writeJSON are not isomorphisms. They are homomorphisms that choose JSON arrays or JSON objects for multiple different kinds of Rascal values. For example, maps, nodes and ADTs are all mapped to JSON object notation (homonyms).
- JSON is a serialization format that does not deal with programming language numerical encodings such as `double`, `float` or `long`. writeJSON and asJSON might write numbers beyond the limits and beyond the accuracy of what the other programming language can deal with. You can expect errors at the time when the other language (JavaScript, Python, TypeScript) reads these numbers from JSON.
data RuntimeException
JSON parse errors have more information than general parse errors
data RuntimeException (str cause="", str path="")
- `location` is the place where the parsing got stuck (going from left to right).
- `cause` is a factual diagnosis of what was expected at that position, versus what was found.
- `path` is a path query string into the JSON value from the root down to the leaf where the error was detected.
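A minimal sketch of how these extra fields might be inspected. It assumes that malformed input makes parseJSON throw a `RuntimeException` value carrying the `cause` and `path` keyword fields declared above; the function name and the input string are hypothetical.

```rascal
import lang::json::IO;
import Exception;
import IO;

void demoParseError() {
    try {
        // Deliberately malformed JSON: the value for "a" is missing.
        parseJSON(#map[str, int], "{\"a\": }");
    }
    catch RuntimeException e: {
        // cause and path are the keyword fields added for JSON parse errors.
        println("cause: <e.cause>");
        println("path: <e.path>");
    }
}
```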
function readJSON
reads JSON values from a stream
&T readJSON(
type[&T] expected,
loc src,
str dateTimeFormat = DEFAULT_DATETIME_FORMAT,
bool lenient=false,
bool trackOrigins=false,
JSONParser[value] parser = (type[value] _, str _) { throw ""; },
map [type[value] forType, value nullValue] nulls = defaultJSONNULLValues,
bool explicitConstructorNames = false,
bool explicitDataTypes = false
)
In general the translation behaves as follows:
- Objects translate to map[str,value] by default, unless a node is expected (properties are then translated to keyword fields)
- Arrays translate to lists by default, or to a set if that is expected or a tuple if that is expected. Arrays may also be interpreted as constructors or nodes (see below)
- Booleans translate to bools
- If the expected type provided is a datetime then an int instant is mapped and if a string is found then the dateTimeFormat parameter will be used to configure the parsing of a date-time string
- If the expected type provided is an ADT then this reader will try to "parse" each object as a constructor for that ADT. It helps if there is only one constructor for that ADT. Positional parameters will be mapped by name as well as keyword parameters.
- If the expected type provided is a node then it will construct a node named "object" and map the fields to keyword fields.
- If num, int, real or rat are expected both strings and number values are mapped
- If loc is expected then strings which look like a URI (containing :/) are parsed, or else a file:/// URI is built. If an object is found, each separate field of a location object is read from the respective properties: { scheme : str, authority: str?, path: str?, fragment: str?, query: str?, offset: int, length: int, begin: [bl, bc], end: [el, ec]}
- Go to JSONParser to find out how to use the optional `parser` parameter.
- If the parser finds a `null` JSON value, it will look up which value to return in the `nulls` map, based on the currently expected type, or throw an exception otherwise. First the expected type is used as a literal lookup, and then each key type in the map is tested to see whether the current type is a subtype of it.
function parseJSON
parses JSON values from a string. In general the translation behaves the same as for ((readJSON)).
&T parseJSON(
type[&T] expected,
str src,
str dateTimeFormat = DEFAULT_DATETIME_FORMAT,
bool lenient=false,
bool trackOrigins=false,
JSONParser[value] parser = (type[value] _, str _) { throw ""; },
map[type[value] forType, value nullValue] nulls = defaultJSONNULLValues,
bool explicitConstructorNames = false,
bool explicitDataTypes = false
)
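The effect of the expected type on the translation rules listed for ((readJSON)) can be sketched as follows. This is an illustrative example only: the function name and the input strings are hypothetical, and no particular printed output is claimed.

```rascal
import lang::json::IO;
import IO;

void demoExpectedTypes() {
    str input = "{\"name\": \"Ada\", \"tags\": [\"x\", \"y\"]}";

    // With value as the expected type, the default mapping applies:
    // objects become maps and arrays become lists.
    value generic = parseJSON(#value, input);
    println(generic);

    // A more specific expected type steers the mapping of nested values.
    map[str, list[str]] lists = parseJSON(#map[str, list[str]], "{\"tags\": [\"x\", \"y\"]}");
    println(lists);

    // Arrays are read as sets when a set is expected.
    map[str, set[int]] sets = parseJSON(#map[str, set[int]], "{\"ids\": [1, 2, 2, 3]}");
    println(sets);
}
```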
function writeJSON
Serializes a value as a JSON string and streams it
void writeJSON(loc target, value val,
bool unpackedLocations=false,
str dateTimeFormat=DEFAULT_DATETIME_FORMAT,
bool dateTimeAsInt=false,
int indent=0,
bool dropOrigins=true,
JSONFormatter[value] formatter = str (value _) { fail; },
bool explicitConstructorNames=false,
bool explicitDataTypes=false
)
This function tries to map Rascal values to JSON values in a natural way. In particular it tries to create a value that has the same number of recursive levels, such that one constructor maps to one object. The serialization is typically lossy since JSON values by default do not explicitly encode the class or constructor while Rascal data types do.
If you need the names of constructors or data-types in your result, then use the parameters:
- `explicitConstructorNames=true` will store the name of every constructor in a field `_constructor`
- `explicitDataTypes=true` will store the name of the ADT in a field called `_type`
- Check out JSONFormatter on how to use the `formatter` parameter
- The `dateTimeFormat` parameter dictates how `datetime` values will be printed.
- The `unpackedLocations` parameter will produce an object with many fields for every property of a `loc` value, but if set to false a `loc` will be printed as a string.
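The two naming parameters can be compared with a small sketch. It uses asJSON, which takes the same keyword parameters as writeJSON, so the result can be printed directly; the `Shape` type and the function name are hypothetical.

```rascal
import lang::json::IO;
import IO;

// Hypothetical data type with two constructors, used only for this sketch.
data Shape = circle(int radius) | square(int side);

void demoExplicitNames() {
    Shape s = circle(3);

    // Default: the constructor name is not recorded in the output.
    println(asJSON(s));

    // Ask the writer to add the constructor and ADT names as extra fields.
    println(asJSON(s, explicitConstructorNames=true, explicitDataTypes=true));
}
```

Recording the names is mainly useful when an ADT has several constructors whose fields alone do not identify them on the receiving end.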
Pitfalls
- It is understood that Rascal's number types have arbitrary precision, but this is not supported by the JSON writer. As such, when an `int` is printed that does not fit into a JVM `long`, there will be truncation to the lower 64 bits. For `real` numbers that are larger than JVM's double you get "negative infinity" or "positive infinity" as a result.
function asJSON
Serializes a value as a JSON string and stores it as a string
Does what ((writeJSON)) does but serializes to a string instead of a location target.
str asJSON(value val, bool unpackedLocations=false, str dateTimeFormat=DEFAULT_DATETIME_FORMAT, bool dateTimeAsInt=false, int indent = 0, bool dropOrigins=true, JSONFormatter[value] formatter = str (value _) { fail; }, bool explicitConstructorNames=false, bool explicitDataTypes=false)
This function uses writeJSON and stores the result in a string.
alias JSONFormatter
((writeJSON)) and ((asJSON)) use JSONFormatter functions to flatten structured data to strings, on demand
str (&T)
A JSONFormatter can be passed to the writeJSON and asJSON functions. When the type matches an algebraic data-type to be serialized, the formatter is applied and the resulting string is serialized to the JSON stream instead of the structured data.
The goal of JSONFormatter and its dual JSONParser is to bridge the gap between string-based JSON encodings and typical Rascal algebraic combinators.
alias JSONParser
((readJSON)) and ((parseJSON)) use JSONParser functions to turn unstructured data into structured data.
&T (type[&T], str)
A JSONParser can be passed to readJSON and parseJSON. When the reader expects an algebraic data-type or a syntax type, but the input at that moment is a JSON string, then the parser is called on that string (after string.trim()).
The resulting data constructor is put into the resulting value instead of a normal string.
The goal of JSONParser and its dual JSONFormatter is to bridge the gap between string-based JSON encodings and typical Rascal algebraic combinators.
Benefits
- Use parsers to create more structure than JSON provides.
Pitfalls
- The `type[&T]` argument is supplied dynamically by the JSON reader; it does not contain the grammar. It does encode the expected type of the parse result.
- The expected types can only be `data` types, not syntax types.