Jxl usage

MaxMotovilov edited this page Jun 17, 2011 · 9 revisions

How to use JXL to transform JSON data?

Note that a JXL program — and, strictly speaking, any useful jtlc input in general — cannot be represented in JSON, because many of its nodes are objects with non-trivial constructors and prototypes.

Introduction to JXL

JXL (JSON Transformation Language) is a specific template language implemented with jtlc (Javascript Template Language Compiler) for transforming and re-shaping JSON data. The language itself is also based on Javascript object notation but is not JSON compliant: aside from the primitive types and object and array literals, it also requires the use of functional tags, which expand into non-anonymous Javascript objects that are invisible to the user of the library.

Execution model

The execution of a JXL template is best described in terms of the dataflow programming model: values are generated by some primitives, operated upon by others to produce new values, and finally end up in sinks provided by a third kind of primitive. The behavior of most language entities depends on whether they appear in a singleton context or in an iterative context:

Singleton context

In singleton context only a single value is processed. All primitives with parameters operate as functions insofar as they have no side effects.

Iterative context

In iterative context the values are produced in sequence by the innermost generator, undergo processing imposed by the primitives enclosing the generator and ultimately end up in a sink which limits the context. Array is the most typical sink for the iterative contexts though it is also possible to populate dictionaries (anonymous Javascript objects) iteratively or aggregate the results to a single value. The generators typically iterate over arrays passed in as template’s arguments, produced by executing a query over input data or built by other parts of the same template; it is also possible to iterate over keys in a dictionary.
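The generator, transformation and sink roles described above can be illustrated with a plain JavaScript analogy. This is only a sketch of the dataflow idea: it does not use jtlc, and the helper names (fromArray, mapValues, toArray, toDictionary, aggregate) are hypothetical, not JXL primitives.

```javascript
// Plain JavaScript analogy of the dataflow model; none of these helper
// names come from jtlc or JXL.

// Generator: yields the elements of an array one at a time.
function* fromArray(items) {
  for (const item of items) yield item;
}

// Transformation: applies fn to every value flowing through.
function* mapValues(source, fn) {
  for (const value of source) yield fn(value);
}

// Sinks: each one ends the iterative context and produces a single result.
const toArray = (source) => Array.from(source);

function toDictionary(source, keyOf) {
  const dict = {};
  for (const value of source) dict[keyOf(value)] = value;
  return dict;
}

function aggregate(source, step, initial) {
  let acc = initial;
  for (const value of source) acc = step(acc, value);
  return acc;
}

const samplePrices = [1, 5, 100000];
const doubled = toArray(mapValues(fromArray(samplePrices), (p) => p * 2));
const priceTotal = aggregate(fromArray(samplePrices), (sum, p) => sum + p, 0);
const byLabel = toDictionary(fromArray(samplePrices), (p) => "p" + p);

console.log(doubled, priceTotal, byLabel);
```

The same generator feeds three different sinks here (an array, a dictionary and a single aggregated value), which mirrors the three kinds of sink mentioned in the text.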

Usage examples

As JXL was designed to build upon and extend the capabilities provided by dojox.json.query rather than as a replacement for the latter library, the examples in this section concentrate on use cases where dojox.json.query in itself is not sufficient to do the job. The author hopes that simpler patterns of JXL use can be easily derived from these examples by the reader.

Grouping

Input:
[
 {
   "title": "Alice in Wonderland",
   "author": "Lewis Carroll"
 },
 {
   "title": "The Hunting of the Snark",
   "author": "Lewis Carroll"
 },
 {
   "title": "The C++ Programming Language",
   "author": "Bjarne Straustrup"
 }
]

Template:
[ t.group(
   "author",
   {
      author: "$[0].author",
      titles: [ t.expr( "title" ) ]
   },
   "[/author]"
)]

Output:
[
 {
  "author": "Bjarne Straustrup",
  "titles": [
   "The C++ Programming Language"
  ]
 },
 {
  "author": "Lewis Carroll",
  "titles": [
   "Alice in Wonderland",
   "The Hunting of the Snark"
  ]
 }
]

Note that the strings "author" and "$[0].author" are implicitly treated as expressions, and "[/author]" as a query, based on the execution context. The string "title", by contrast, requires an explicit tag around it since it appears in an iterative context.

Using $[0] in an expression within group body is an easy way to access the key fields as they are equal for the entire group anyway and the group is guaranteed to have at least one element. Simple expressions referencing the properties of the current input do not have "$." in front of the name as it is implicitly assumed by expr().
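For readers who want to check the expected output by hand, the following plain JavaScript sketch reproduces what the grouping template computes. It does not use jtlc; the helper groupSorted is hypothetical, and the ascending sort stands in for the "[/author]" query under the assumption that the query sorts by author in ascending order.

```javascript
const books = [
  { title: "Alice in Wonderland", author: "Lewis Carroll" },
  { title: "The Hunting of the Snark", author: "Lewis Carroll" },
  { title: "The C++ Programming Language", author: "Bjarne Straustrup" }
];

// Hypothetical helper: splits an already sorted array into runs of
// records that share the same value of `key`.
function groupSorted(records, key) {
  const groups = [];
  for (const record of records) {
    const lastGroup = groups[groups.length - 1];
    if (lastGroup && lastGroup[0][key] === record[key]) lastGroup.push(record);
    else groups.push([record]);
  }
  return groups;
}

// Ascending sort by author (assumed equivalent of the "[/author]" query).
const sortedBooks = books.slice().sort((a, b) =>
  a.author < b.author ? -1 : a.author > b.author ? 1 : 0
);

const byAuthor = groupSorted(sortedBooks, "author").map((group) => ({
  author: group[0].author,                 // plays the role of "$[0].author"
  titles: group.map((book) => book.title)  // plays the role of [ t.expr( "title" ) ]
}));

console.log(JSON.stringify(byAuthor, null, 1));
```

Because every group holds at least one record, reading the key from group[0] is always safe, which is the same reasoning the text gives for $[0].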

Input:
[
 {
  "city": "Dallas",
  "country": "USA",
  "continent": "North America"
 },
 {
  "city": "London",
  "country": "UK",
  "continent": "Europe"
 },
 {
  "city": "Kyiv",
  "country": "Ukraine",
  "continent": "Europe"
 },
 {
  "city": "New York",
  "country": "USA",
  "continent": "North America"
 },
 {
  "city": "Melbourne",
  "country": "Australia",
  "continent": "Australia"
 }
]

Template:
[ t.group(
  "continent",
  {
    continent: "$[0].continent",
    countries: [
      t.group(
        "country",
        {
          country: "$[0].country",
          cities: [ t.expr( "city" ) ]
        }
      )
    ]
  },
  "[/continent,/country]"
)]

Output:
[
 {
  "continent": "Australia",
  "countries": [
   {
    "country": "Australia",
    "cities": [
     "Melbourne"
    ]
   }
  ]
 },
 {
  "continent": "Europe",
  "countries": [
   {
    "country": "UK",
    "cities": [
     "London"
    ]
   },
   {
    "country": "Ukraine",
    "cities": [
     "Kyiv"
    ]
   }
  ]
 },
 {
  "continent": "North America",
  "countries": [
   {
    "country": "USA",
    "cities": [
     "Dallas",
     "New York"
    ]
   }
  ]
 }
]

For optimal performance, only one sort, on a composite key, occurs here. The inner group() operates on the current value of the outer one (the sequence of records with an identical continent key), so it does not need the third argument.
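The single-sort idea can be sketched in plain JavaScript as follows. This is an illustration only, not jtlc code: splitRuns and compareText are hypothetical helpers, and the composite comparison assumes that "[/continent,/country]" sorts ascending by continent and then by country.

```javascript
const cityRecords = [
  { city: "Dallas", country: "USA", continent: "North America" },
  { city: "London", country: "UK", continent: "Europe" },
  { city: "Kyiv", country: "Ukraine", continent: "Europe" },
  { city: "New York", country: "USA", continent: "North America" },
  { city: "Melbourne", country: "Australia", continent: "Australia" }
];

const compareText = (a, b) => (a < b ? -1 : a > b ? 1 : 0);

// Hypothetical helper: splits an already sorted array into runs of
// records that share the same value of `key`.
function splitRuns(records, key) {
  const runs = [];
  for (const record of records) {
    const lastRun = runs[runs.length - 1];
    if (lastRun && lastRun[0][key] === record[key]) lastRun.push(record);
    else runs.push([record]);
  }
  return runs;
}

// One sort on the composite key (continent, then country).
const sortedCities = cityRecords.slice().sort(
  (a, b) =>
    compareText(a.continent, b.continent) || compareText(a.country, b.country)
);

// The inner split works on each outer run, which is already ordered by
// country, so no second sort is needed.
const byContinent = splitRuns(sortedCities, "continent").map((run) => ({
  continent: run[0].continent,
  countries: splitRuns(run, "country").map((inner) => ({
    country: inner[0].country,
    cities: inner.map((record) => record.city)
  }))
}));

console.log(JSON.stringify(byContinent, null, 1));
```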

Aggregation

Input:
[
 {
  "item": "Beer",
  "price": 1
 },
 {
  "item": "Pizza",
  "price": 5
 },
 {
  "item": "Porsche",
  "price": 100000
 }
]

Template:
t.last( t.expr(
  "{total:$1+=$.price," +
  "max_price:" +
    "$2=$2<$.price?$.price:$2}",
  t.current(),
  t.acc(0),
  t.acc("$[0].price")
))

Output:
{
 "total": 100006,
 "max_price": 100000
}

In this example, using an object literal at JXL level to form the output records is not possible since it establishes an execution context of its own. Hiding it inside an expression solves the problem for simple cases. More complex aggregation patterns may require external accumulators implemented using an object or a closure:
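The inline-expression pattern can be traced in plain JavaScript. The sketch below does not use jtlc; it assumes that the two accumulators start at 0 and at the first element's price, that the expression yields one record per input element, and that only the final record is kept.

```javascript
const purchases = [
  { item: "Beer", price: 1 },
  { item: "Pizza", price: 5 },
  { item: "Porsche", price: 100000 }
];

let runningTotal = 0;                 // plays the role of t.acc(0)
let runningMax = purchases[0].price;  // plays the role of t.acc("$[0].price")

// One record per element, mirroring the expression in iterative context.
const snapshots = purchases.map((p) => ({
  total: (runningTotal += p.price),
  max_price: (runningMax = runningMax < p.price ? p.price : runningMax)
}));

// Keep only the final record, mirroring t.last().
const summary = snapshots[snapshots.length - 1];

console.log(snapshots, summary);
```

Each intermediate snapshot is a valid partial result; the last one covers the whole input and matches the output shown above.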

Input:
[
 {
  "item": "Beer",
  "price": 1
 },
 {
  "item": "Pizza",
  "price": 5
 },
 {
  "item": "Porsche",
  "price": 100000
 }
]

Template:
t.last( t.each( {
  total: t.bind(sum(0),"price"),
  max_price: t.bind(max(0),"price")
} ) )

function sum( acc ) {
  return function( v ) {
    return acc += v;
  }
}
function max( acc ) {
  return function( v ) {
    return acc =
      Math.max( acc, v );
  }
}

Output:
{
 "total": 100006,
 "max_price": 100000
}

Here, each() is used to evaluate the object literal for each input array element (iteration over current input is assumed by each() when it is given only one argument).
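To see how the closure-based accumulators behave, the following plain JavaScript sketch drives them by hand. It does not use jtlc: the loop stands in for each(), keeping the final record stands in for last(), and makeSum and makeMax are renamed copies of the sum and max helpers from the template.

```javascript
// Same closures as sum() and max() in the template above, renamed for clarity.
function makeSum(acc) {
  return function (v) {
    return (acc += v);
  };
}
function makeMax(acc) {
  return function (v) {
    return (acc = Math.max(acc, v));
  };
}

const orders = [
  { item: "Beer", price: 1 },
  { item: "Pizza", price: 5 },
  { item: "Porsche", price: 100000 }
];

// The closures are created once, before the iteration starts, so their
// private `acc` variables persist from one element to the next.
const addPrice = makeSum(0);
const trackMax = makeMax(0);

let lastRecord;
for (const order of orders) {
  // The object literal is evaluated once per element (stand-in for each()).
  lastRecord = {
    total: addPrice(order.price),
    max_price: trackMax(order.price)
  };
}

console.log(lastRecord);
```

Note that max(0) only gives the right answer when prices are non-negative; a different seed, such as -Infinity, would be needed otherwise.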

Flattening nested arrays

Input:
[
 {
  "author": "Bjarne Straustrup",
  "titles": [
   "The C++ Programming Language"
  ]
 },
 {
  "author": "Lewis Carroll",
  "titles": [
   "Alice in Wonderland",
   "The Hunting of the Snark"
  ]
 }
]

Template:
[ t.each(
    t.expr(
      "{author:$1,title:$}",
      t.from("titles"),
      "author"
    )
)]

Output:
[
 {
  "author": "Bjarne Straustrup",
  "title": "The C++ Programming Language"
 },
 {
  "author": "Lewis Carroll",
  "title": "Alice in Wonderland"
 },
 {
  "author": "Lewis Carroll",
  "title": "The Hunting of the Snark"
 }
]

Again, it is much easier to move object construction inside the inline expression than to deal with an object literal with a context of its own.
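As a cross-check of the expected output, the same flattening can be written in plain JavaScript with Array.prototype.flatMap. This is an equivalent computation for illustration, not a description of how jtlc evaluates the template.

```javascript
const catalog = [
  { author: "Bjarne Straustrup", titles: ["The C++ Programming Language"] },
  {
    author: "Lewis Carroll",
    titles: ["Alice in Wonderland", "The Hunting of the Snark"]
  }
];

// The outer iteration walks the authors, the inner one walks each author's
// titles; the outer value stays available while the inner one runs.
const flatTitles = catalog.flatMap((entry) =>
  entry.titles.map((title) => ({ author: entry.author, title: title }))
);

console.log(JSON.stringify(flatTitles, null, 1));
```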

Input:
[
 {
  "continent": "Australia",
  "countries": [
   {
    "country": "Australia",
    "cities": [
     "Melbourne"
    ]
   }
  ]
 },
 {
  "continent": "Europe",
  "countries": [
   {
    "country": "UK",
    "cities": [
     "London"
    ]
   },
   {
    "country": "Ukraine",
    "cities": [
     "Kyiv"
    ]
   }
  ]
 },
 {
  "continent": "North America",
  "countries": [
   {
    "country": "USA",
    "cities": [
     "Dallas",
     "New York"
    ]
   }
  ]
 }
]

Template:
[ t.each(
    t.expr(
      "{continent:$1.continent," +
       "country:$1.country," +
       "city:$}",
      t.from("cities"),
      t.current()
    ),
    t.expr(
      "{continent:$1," +
       "country:$.country," +
       "cities:$.cities}",
       t.from("countries"),
       "continent"
    ),
    t.current()
)]

Output:
[
 {
  "continent": "Australia",
  "country": "Australia",
  "city": "Melbourne"
 },
 {
  "continent": "Europe",
  "country": "UK",
  "city": "London"
 },
 {
  "continent": "Europe",
  "country": "Ukraine",
  "city": "Kyiv"
 },
 {
  "continent": "North America",
  "country": "USA",
  "city": "Dallas"
 },
 {
  "continent": "North America",
  "country": "USA",
  "city": "New York"
 }
]

Note that the second expression does create an intermediate copy: it is necessary in order to pass multiple properties to the innermost iteration. This copy, however, is shallow, so its performance impact should be minimal.
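The role of the intermediate copy can be seen in a plain JavaScript version of the two-level flattening. The sketch does not use jtlc; the names perCountry and perCity are hypothetical labels for the two stages.

```javascript
const continentTree = [
  {
    continent: "Australia",
    countries: [{ country: "Australia", cities: ["Melbourne"] }]
  },
  {
    continent: "Europe",
    countries: [
      { country: "UK", cities: ["London"] },
      { country: "Ukraine", cities: ["Kyiv"] }
    ]
  },
  {
    continent: "North America",
    countries: [{ country: "USA", cities: ["Dallas", "New York"] }]
  }
];

// Stage 1: one intermediate record per country. The copy is shallow: the
// `cities` array is shared with the input, not duplicated.
const perCountry = continentTree.flatMap((c) =>
  c.countries.map((k) => ({
    continent: c.continent,
    country: k.country,
    cities: k.cities
  }))
);

// Stage 2: one output record per city, reading both keys from the
// intermediate record.
const perCity = perCountry.flatMap((rec) =>
  rec.cities.map((city) => ({
    continent: rec.continent,
    country: rec.country,
    city: city
  }))
);

console.log(JSON.stringify(perCity, null, 1));
```

The intermediate records carry the continent down to the level where the cities are iterated, which is why the copy is needed at all.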

Transforming dictionaries

Input:
{
 "firstName": "John",
 "lastName": "Smith",
 "occupation": "Software engineer"
}

Template:
{
  _: t.many(
       t.expr(
         "{name:$,value:$1[$]}",
         t.setkey( "$", t.keys() ),
         t.current()
       )
     )
}

Output:
{
 "firstName": {
  "name": "firstName",
  "value": "John"
 },
 "lastName": {
  "name": "lastName",
  "value": "Smith"
 },
 "occupation": {
  "name": "occupation",
  "value": "Software engineer"
 }
}

The example above takes a simple record and transforms it into a data structure suitable to populate the names and values for controls in a form template.

Simple joins

At the moment, JXL lacks comprehensive support for SQL-like joins. Nevertheless, some simple yet important cases, such as lookup by a unique key, can be implemented in a straightforward manner:

Input:
[
 {
  "city": "Dallas", 
  "country": "USA", 
  "continent": "North America"
 }, 
 {
  "city": "London", 
  "country": "UK", 
  "continent": "Europe"
 }, 
 {
  "city": "New York", 
  "country": "USA", 
  "continent": "North America"
 }, 
 {
  "city": "Kyiv", 
  "country": "Ukraine", 
  "continent": "Europe"
 }, 
 {
  "city": "Melbourne", 
  "country": "Australia", 
  "continent": "Australia"
 }

The association of the two tables relies on an intermediate dictionary with the key field of the join serving as the key in the dictionary. The same approach can be extended to compound keys by using replace() to build their string representations.
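The intermediate-dictionary approach can be sketched in plain JavaScript. This is an illustration only, not jtlc code: the second table (countryCodes) and the helper compoundKey are hypothetical, and string concatenation with a separator stands in for building the key's string representation with replace().

```javascript
const places = [
  { city: "Dallas", country: "USA", continent: "North America" },
  { city: "London", country: "UK", continent: "Europe" },
  { city: "Kyiv", country: "Ukraine", continent: "Europe" }
];

// Hypothetical second table, keyed by a unique country name.
const countryCodes = [
  { country: "USA", code: "US" },
  { country: "UK", code: "GB" },
  { country: "Ukraine", code: "UA" }
];

// Intermediate dictionary: the join field becomes the dictionary key.
const codeByCountry = {};
for (const row of countryCodes) codeByCountry[row.country] = row.code;

// Lookup join: each record from the first table picks up its match.
const joined = places.map((p) => ({
  city: p.city,
  country: p.country,
  code: codeByCountry[p.country]
}));

// Compound keys: build one string from several fields. The separator is an
// assumption and must not occur inside the field values themselves.
const compoundKey = (r) => [r.continent, r.country].join("|");
const placeByCompoundKey = {};
for (const p of places) placeByCompoundKey[compoundKey(p)] = p;

console.log(joined, Object.keys(placeByCompoundKey));
```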

JXL Reference

See JXL Primitives.