LQN is a Common Lisp library, query language and terminal utility to query and
transform text files such as JSON and CSV, as well as Lisp data (LDN). The
terminal utilities parse the input data to internal Lisp structures
according to the input mode. Then the `lqn` query language can be used for
queries and transformations.

`lqn` started as an experiment and programming exercise, but it has turned into
a little language I find rather useful, both in the terminal and, more
interestingly, as a meta language for writing macros in CL. The main
purpose of the design is to make something that is intuitive and terse, yet
flexible enough that you can write generic CL if you need to. I also wanted to
make something that requires a relatively simple compiler.

Here is a small tutorial: https://inconvergent.net/2024/lisp-query-notation/.
An expanded version of the tutorial can be seen in this paper: https://zenodo.org/records/11001584

When using LQN on the terminal there are three terminal commands, or input
modes: `jqn`, `tqn` and `lqn`, for JSON, text and Lisp data respectively.
(For installation see below.) You can find some terminal command examples in
bin/lqn-sh.lisp, bin/jqn-sh.lisp, and bin/tqn-sh.lisp.
Symbol documentation can be seen in docs/lqn.md.

Internally, JSON arrays/lists are represented as `vectors`, and JSON
objects/dicts are represented as `hash-tables` (`ht`). Thus a text file is a
`vector` of `strings`. We use `object` in the context of Operators and other
LQN utilities to refer to either a `vector` or a `ht`. Lisp data is read
directly.

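
As a rough illustration of this representation, the sketch below builds by hand, in plain CL, what a small JSON object might look like once parsed. It assumes string keys and an `equal` hash table test, which the text above does not state; `make-example-object` and `*obj*` are hypothetical names used only here.

```lisp
;; Hand-built stand-in for the JSON object {"id": 1, "tags": ["a", "b"]}.
;; Assumption: keys are strings and the table uses an EQUAL test.
(defun make-example-object ()
  (let ((ht (make-hash-table :test #'equal)))
    (setf (gethash "id" ht) 1
          (gethash "tags" ht) (vector "a" "b"))
    ht))

(defparameter *obj* (make-example-object))

;; Regular CL accessors work on the structure.
(gethash "id" *obj*)              ; the value stored under "id"
(aref (gethash "tags" *obj*) 1)   ; second element of the "tags" vector
(length (gethash "tags" *obj*))   ; number of tags
```
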
The following operators have special behaviour. You can also write generic CL
code in almost all contexts, as we demonstrate soon. In operators we use `_` to
refer to the current value.

In the following sections `[d]` represents an optional default value, e.g. if a
key/index is missing, or if a function would otherwise return `nil`. `k` is an
initial counter value, whereas `..` means that there can be arbitrary
arguments/`expr`. `expr` denotes any expression or operator, like `(+ 1 _)` or
`#[:id]`.

In operators, and several functions, `:keywords` can be used to represent
lowercase `strings`. This is useful in the terminal to avoid escaping strings,
particularly when using Selector operators. You can use `"Strings"` instead
if you need case or whitespace.

`(|| expr ..)` pipes the results from the first `expr` to the second, and so
on. Returns the result of the last `expr`. The Pipe operator surrounds all
queries by default, so it is usually not necessary to use it explicitly.
For convenience the pipe has the following default translations:

- `fx`: to `(?map (fx _))`: map `fx` across all items.
- `:word`: to `[(isub? _ "word")]` to filter by `"word"`.
- `"Word"`: to `[(sub? _ "Word")]` to filter by `Word`, case sensitive.
- `(..)`: to itself. That is, expressions are not translated, so this is the default translation for top level expressions in any query.

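
In plain CL terms, the `:word` translation followed by a mapped function behaves roughly like a case-insensitive substring filter whose result is fed to the next step. This is a sketch of the idea only, not of how `lqn` compiles a pipe; `keep-matching` and `upcase-all` are hypothetical helpers.

```lisp
(defun keep-matching (word lines)
  "Keep the strings in LINES that contain WORD, ignoring case."
  (remove-if-not (lambda (s) (search word s :test #'char-equal)) lines))

(defun upcase-all (lines)
  "Map STRING-UPCASE across all items and return a new vector."
  (map 'vector #'string-upcase lines))

(defparameter *lines* (vector "Hello there" "goodbye" "oh, hello"))

;; Step one filters, step two receives the filtered vector.
(defparameter *result* (upcase-all (keep-matching "hello" *lines*)))
```

Here `*result*` holds the two strings that contain "hello", both upcased.
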
Select `:keys`, indexes or paths from a nested structure:

- `(@ k)`: get this key/index/path from current value.
- `(@ k [d])`: get this key/index/path from current value, or `d` if it is missing.
- `(@ o k [d])`: get this key/index/path from `o`.

Paths support wildcards (`*`) and numerical indices for nested structures. E.g. this
is a valid path: `:*/0/things`.

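
To make the idea of a path with a default concrete, here is a minimal plain CL sketch that follows a list of keys and indices into nested hash tables and vectors. It does not handle wildcards, it assumes string keys, and `get-path` and `*doc*` are hypothetical names, not part of `lqn`.

```lisp
(defun get-path (obj path &optional default)
  "Follow PATH (a list of string keys and integer indices) into OBJ.
Return DEFAULT as soon as a step is missing."
  (dolist (step path obj)
    (cond ((and (hash-table-p obj) (stringp step))
           (multiple-value-bind (val found) (gethash step obj)
             (if found
                 (setf obj val)
                 (return default))))
          ((and (vectorp obj) (integerp step) (< -1 step (length obj)))
           (setf obj (aref obj step)))
          (t (return default)))))

;; Stand-in for {"a": [{"things": 7}]}.
(defparameter *doc*
  (let ((outer (make-hash-table :test #'equal))
        (inner (make-hash-table :test #'equal)))
    (setf (gethash "things" inner) 7
          (gethash "a" outer) (vector inner))
    outer))

(get-path *doc* '("a" 0 "things"))        ; follows key, index, key
(get-path *doc* '("a" 3 "things") :none)  ; index out of range, so the default
```
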
Map operations over a `vector`, or over the values of a `ht`:

- `#(fx)`: map `(fx _)` across all items.
- `#(expr ..)`: evaluate these expressions sequentially on all items in `sequence`.

Select from one structure into a new data structure, using selectors:

- `{s1 sel ..}`: from `ht` into new `ht`.
- `#{s1 sel ..}`: from `vector` of `hts` into new `vector` of `hts`.
- `#[s1 sel ..]`: from `vector` of `hts` into new `vector`.

A selector is a triple `(mode key expr)`. Only the key is required. If `expr` is
not provided the `expr` is `_`, that is: the value of the `key`. The modes are
as follows:

- `+`: always include this `expr`. [default]
- `?`: include `expr` if the key is present and not `nil`.
- `%`: include Selector if `expr` is not `nil`.
- `-`: drop this key in `#{}` and `{}` operators; ignore Selector entirely in `#[]`. E.g. `{_ -@key3}` to select all keys except `key3`. `expr` is ignored.

Selectors can either be written out in full, or they can be written in short
form depending on what you want to achieve. The `@` in the following examples
is used to append a mode to a key without having to wrap the Selector in
parentheses. If you need e.g. case or spaces you can use `"strings"`. Here are
some examples using `{}`. It behaves the same for the other Selector
operators:

    {_}                      ; select all keys.
    {_ :-@key1}              ; select all keys except "key1".
    {:key1 "Key2"}           ; select "key1" and "Key2".
    {:+@key}                 ; same as :key [+ mode is default].
    {"+@Key"}                ; select "Key".
    {:?@key}                 ; select "key" if the value is not nil.
    {(:%@key expr)}          ; select "key" if expr is not nil.
    {("?@Key" expr)}         ; select "Key" if the value is not nil.
    {("%@Key" expr)}         ; select "Key" if expr is not nil.
    {(:+ "Key" expr)}        ; same as ("+@Key" expr).

    ; Use `_` in `expr` to refer to the value of the selected key:
    {(:key1 sup)             ; convert value of "key1" to uppercase
     (:key3 (or _ "That"))   ; select the value of "key3", or literally "That".
     (:key2 (+ 33 _))}       ; add 33 to value of "key2"

    ; override and drop keys:
    {_                       ; select all keys, then override these:
     (:key2 (sdwn _))        ; lowercase the value of "key2"
     :-@key3}                ; drop "key3"

We use `{}` in the examples but all Selector operators have the same behaviour.

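
The four modes can be read as a small decision table. The plain CL sketch below is one possible reading of the descriptions above, applied to a single hash table; it is not the `lqn` implementation. `select-into-new-ht` is a hypothetical function, each selector is passed as a `(mode key fn)` list where `fn` stands in for `expr`, and string keys are assumed.

```lisp
(defun select-into-new-ht (ht selectors)
  "Copy entries of HT into a new hash table as directed by SELECTORS,
a list of (mode key fn) lists. FN plays the role of expr."
  (let ((new (make-hash-table :test #'equal)))
    (loop for (mode key fn) in selectors
          do (multiple-value-bind (val found) (gethash key ht)
               (ecase mode
                 ;; + : always include the result.
                 (:+ (setf (gethash key new) (funcall fn val)))
                 ;; ? : include only if the key is present and not nil.
                 (:? (when (and found val)
                       (setf (gethash key new) (funcall fn val))))
                 ;; % : include only if the result itself is not nil.
                 (:% (let ((res (funcall fn val)))
                       (when res (setf (gethash key new) res))))
                 ;; - : drop the key; fn is ignored.
                 (:- (remhash key new)))))
    new))

(defparameter *src* (make-hash-table :test #'equal))
(setf (gethash "key1" *src*) "this"
      (gethash "key2" *src*) 1)

(defparameter *new*
  (select-into-new-ht
   *src*
   (list (list :+ "key1" #'string-upcase)
         (list :? "key3" #'identity)                    ; missing, so skipped
         (list :% "key2" (lambda (v) (and (> v 5) v)))))) ; nil result, so skipped
```
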
Filter a `vector`, or the values of a `ht`:

- `[expr1 .. exprn]`: keep any object or value that satisfies the expressions.

The filter operator behaves somewhat similarly to the Selector operators. The
filter clauses are used with the `[]`, `?srch`, `?xpr`, `?txpr` and `?mxpr`
operators. The modes behave like this:

- `+`: if there are multiple expressions with `+` mode, require ALL of them to be satisfied.
- `?`: if there are any clauses with `?` mode, it will select items where either of these clauses is satisfied.
- `-`: items that match any clause with `-` mode will ALWAYS be dropped.

If this is not what you need, you can compose boolean expressions with regular CL boolean operators. Here are some examples:

    [:hello]                 ; strings containing "hello".
    [:hi "Hello"]            ; strings containing either "Hello" OR "hi".
    [:+@hi :+@hello]         ; strings containing "hi" AND "hello".
    [:+@hi :+@hello "OH"]    ; strings containing ("hi" AND "hello") OR "OH".
    [int!?]                  ; items that can be parsed as int.
    [(> _ 3)]                ; numbers larger than 3.
    [_ :-@hi]                ; strings except those that contain "hi".
    [(+@pref? _ "start")     ; strings that start with "start" and end with "end".
     (+@post? _ "end")]
    [(fx1 _)]                ; items where this expression is not nil.
    [(or (fx1 _) (fx2 _))]   ; ...

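
The way the three modes combine can be sketched as a predicate in plain CL. This is one reading of the rules and examples above, not the `lqn` implementation: an item is kept when no `-` clause matches, and either all `+` clauses match or some `?` clause matches. `has` and `keep-item-p` are hypothetical helpers, and matching is case sensitive here for simplicity.

```lisp
(defun has (sub)
  "Return a predicate that is true for strings containing SUB."
  (lambda (s) (search sub s)))

(defun keep-item-p (item &key all any none)
  "ALL, ANY and NONE are lists of predicates for the +, ? and - modes."
  (flet ((hit (p) (funcall p item)))
    (and (notany #'hit none)
         (or (and all (every #'hit all))
             (some #'hit any))
         t)))

;; Rough counterpart of [:+@hi :+@hello "OH"]:
;; ("hi" AND "hello") OR "OH".
(defparameter *kept*
  (remove-if-not
   (lambda (s)
     (keep-item-p s
                  :all (list (has "hi") (has "hello"))
                  :any (list (has "OH"))))
   (vector "hi hello" "OH no" "hi" "hello")))
```

With this input only "hi hello" and "OH no" remain in `*kept*`.
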
Reduce a `vector`, or the values of a `ht`:

- `(?fld init fx)`: fold `(fx acc _)` with `init` as the first `acc` value. `acc` is inserted as the first argument to `fx`.
- `(?fld init (fx .. _ ..))`: fold `(fx acc .. _ ..)`. The accumulator is inserted as the first argument to `fx`.
- `(?fld init acc (fx .. acc .. nxt))`: fold `(fx .. acc .. nxt)`. Use this if you need to name the accumulator explicitly.

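
For readers who know standard CL, the fold described above corresponds closely to `reduce` with `:initial-value`, where the accumulator is also the first argument. The sketch below uses only `reduce`; `*sum*` and `*longest*` are hypothetical names.

```lisp
;; Counterpart of folding + with an initial accumulator of 0.
(defparameter *sum*
  (reduce #'+ (vector 1 2 3) :initial-value 0))

;; Naming the accumulator explicitly: track the longest string length.
(defparameter *longest*
  (reduce (lambda (acc nxt) (max acc (length nxt)))
          (vector "a" "bbb" "cc")
          :initial-value 0))
```
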
Group input into a new `ht`:

- `(?grp expr [tx-expr])`: keys are given by `expr`, and values are given by `tx-expr` (or `_`).

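
A grouping step of this kind can be sketched in plain CL as follows. How `lqn` stores the grouped values is not stated above, so collecting them into lists is an assumption made for this sketch; `group-by` and `*by-length*` are hypothetical names.

```lisp
(defun group-by (key-fn items &optional (value-fn #'identity))
  "Group the vector ITEMS into a new hash table. KEY-FN gives the key,
VALUE-FN gives the stored value. Values are collected in lists."
  (let ((groups (make-hash-table :test #'equal)))
    (loop for item across items
          do (push (funcall value-fn item)
                   (gethash (funcall key-fn item) groups)))
    ;; PUSH builds each list in reverse, so restore the input order.
    (maphash (lambda (k v) (setf (gethash k groups) (nreverse v)))
             groups)
    groups))

;; Group words by their length.
(defparameter *by-length*
  (group-by #'length (vector "a" "bb" "c" "dd")))
```
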
Repeat the same expression while something is true:

- `(?rec test-expr expr)`: repeat `expr` while `test-expr`. `_` refers to the input value, then to the most recent evaluation of `expr`. Use `(cnt)` to get the number of the current iteration. `(par)` always refers to the input value.

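
The control flow described here is an ordinary while loop that threads a value through repeated evaluation. A minimal plain CL sketch, with the current value and iteration count passed as explicit arguments instead of `_` and `(cnt)`; `repeat-while` and `*doubled*` are hypothetical names.

```lisp
(defun repeat-while (test-fn fn input)
  "Apply FN to the current value while TEST-FN is true.
Both functions receive the current value and the iteration count."
  (loop with val = input
        for cnt from 0
        while (funcall test-fn val cnt)
        do (setf val (funcall fn val cnt))
        finally (return val)))

;; Double the value until it is no longer below 100, starting from 3.
(defparameter *doubled*
  (repeat-while (lambda (val cnt) (declare (ignore cnt)) (< val 100))
                (lambda (val cnt) (declare (ignore cnt)) (* 2 val))
                3))
```
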
Iterate a data structure (as if with `?txpr`) and collect the matches in a new
`vector`:

- `(?srch sel)`: collect `_` whenever the `Selector` matches.
- `(?srch sel .. expr)`: collect `expr` whenever the `Selector` matches.

Perform operation when pattern or condition is satisfied:

- `(?xpr sel)`: match current value against `EXPR Selector`. Return the result if not `nil`.
- `(?xpr sel hit-expr)`: match current value against `EXPR Selector`. Evaluates `hit-expr` if not `nil`. `_` is the matching item.
- `(?xpr sel .. hit-expr miss-expr)`: match current value against `EXPR Selectors`. Evaluate `hit-expr` if not `nil`; else evaluate `miss-expr`. `_` is the matching item.

Recursively traverse a nested structure of `sequences` and `hts` and return a
new value for each match:

- `(?txpr sel .. tx-expr)`: recursively traverse current value and replace matches with `tx-expr`. `tx-expr` can be a function name or expression. Also traverses vectors and `ht` values.
- `(?mxpr (sel .. tx-expr) .. (sel .. tx-expr))`: one or more matches and transforms. Performs the transform of the first match only.

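
The traversal can be pictured as a recursive walk that replaces a matching value and otherwise descends into vectors and hash table values. The plain CL sketch below shows that shape only; it is not the `lqn` implementation, and `transform-matches` and `*tree*` are hypothetical names.

```lisp
(defun transform-matches (obj match-fn tx-fn)
  "Return a copy of OBJ where every value satisfying MATCH-FN is
replaced by (TX-FN value). Descends into vectors and hash table values."
  (flet ((walk (o) (transform-matches o match-fn tx-fn)))
    (cond ((funcall match-fn obj) (funcall tx-fn obj))
          ((stringp obj) obj)            ; strings are leaves here
          ((vectorp obj) (map 'vector #'walk obj))
          ((hash-table-p obj)
           (let ((new (make-hash-table :test (hash-table-test obj))))
             (maphash (lambda (k v) (setf (gethash k new) (walk v))) obj)
             new))
          (t obj))))

;; Increment every integer in a nested vector, leave strings alone.
(defparameter *tree*
  (transform-matches (vector 1 "a" (vector 2 "b")) #'integerp #'1+))
```
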
The internal representation in `lqn` means you can use the regular CL
utilities such as `gethash`, `aref`, `subseq`, `length` etc. But for
convenience there are some utility functions/macros defined in `lqn`. Some
of them are described below. There are more in the documentation.

Defined in the query scope:

- `(fi [k=0])`: counts files from `k`.
- `(fn)`: name of the current file; or `":internal:"`, `"pipe"`.
- `(hld k v)`: hold this value at this key in a key value store.
- `(ghv k [d])`: get the value of this key; or `d`.
- `(nope [d])`: stop execution, return `d`.
- `(err [msg])`: raise `error` with `msg`.
- `(wrn [msg])`: raise `warn` with `msg`.

Defined in all operators:

- `_`: the current value.
- `(cnt)`: counts from `0` in the enclosing `Selector`.
- `(key)`: the current `key` if the current value is a `ht`. Otherwise `(cnt)`.
- `(itr)`: the current object in the iteration of the enclosing `Selector`.
- `(par)`: the object containing `(itr)`.
- `(psize)`: number of items in `(par)`.
- `(isize)`: number of items in `(itr)`.

General utilities:

- `(?? a expr [res=expr])`: execute `expr` only if `a` is not `nil`. If `expr` is not nil it returns `expr` or `res`; otherwise `nil`.
- `(fmt f ..)`: format `f` as `string` with these (`format`) args.
- `(fmt s)`: get printed representation of `s`.
- `(out f ..)`: format `f` to `*standard-output*` with these (`format`) args. Returns `nil`.
- `(out s)`: output printed representation of `s` to `*standard-output*`. Returns `nil`.
- `(msym? a b)`: compare `symbol` `a` to `b`; if `b` is a `keyword` or `symbol` a perfect match is required; if `b` is a `string` it performs a substring match; if `b` is an expression, `a` is compared to the evaluated value of `b`.
- `(noop ..)`: do nothing, return `nil`.

For all `sequences` and `hts`:

- `(@* o d i ..)`: pick these indices/keys from `sequence`/`ht` into new `vector`.
- `(size? o [d])`: length of `sequence` or number of keys in `ht`.
- `(all? o [empty])`: are all items in `sequence` something? or `empty`.
- `(some? o [empty])`: are some items in `sequence` something? or `empty`.
- `(empty? o [d])`: is `sequence` or `ht` empty?
- `(compct o)`: remove `nil`, empty `vectors`, empty `hts` and keys with empty `hts`.

Make or join `hts`:

- `(cat$ ..)`: add all keys from these `hts` to a new `ht`, left to right.
- `(new$ :k1 expr1 ..)`: new `ht` with these keys and expressions.

Primarily for `sequences` (`string`, `vector`, `list`):

- `(new* ..)`: new `vector` with these elements.
- `(ind* s i)`: get this index from `sequence`.
- `(sel ..)`: get new `vector` with these `ind*s` or `seqs` from `sequence`.
- `(seq v i [j])`: get range `i ..` or `i .. (1- j)` from `sequence`.
- `(head s [n=10])`: first `n` items of `sequence`.
- `(tail s [n=10])`: last `n` items of `sequence`.
- `(cat* s ..)`: concatenate these `sequences` to a `vector`.
- `(flatn* s [n=1] [str=nil])`: flatten `sequence` `n` times into a `vector`. If `str=t` strings are flattened into individual chars as well.
- `(flatall* s [str=nil])`: flatten all `sequences` (except `strings`) into new `vector`. Use `t` as the second argument to flatten `strings` to individual chars as well.
- `(flatn$ s n)`: flatten `ht` into vector `(new* k0 v0 k1 v1 ..)`.

Primarily for `string` searching. `[i]` means case insensitive:

- `([i]pref? s pref [d])`: `s` if `pref` is a prefix of `s`; or `d`.
- `([i]sub? s sub [d])`: `s` if `sub` is a substring of `s`; or `d`.
- `([i]subx? s sub)`: index where `sub` starts in `s`.
- `([i]suf? s suf [d])`: `s` if `suf` is a suffix of `s`; or `d`.
- `(repl s from to)`: replace `from` with `to` in `s`.

String manipulation:

- `(sup s ..)`: `str!` and upcase.
- `(sdwn s ..)`: `str!` and downcase.
- `(trim s)`: trim leading and trailing whitespace from `string`.
- `(splt s x [trim=t] [prune=nil])`: split `s` at all `x` into `vector` of `strings`. `trim` removes whitespace. `prune` drops empty strings.
- `(join s x ..)`: join sequence with `x` (`strings` or `chars`), returns `string`.
- `(strcat s ..)`: concatenate these `strings`, or all `strings` in one or more `sequences` of `strings`.

`(is? o [d])` returns `o` if not `nil`, empty `sequence`, or empty `ht`; or `d`.

These functions return the argument if the argument is the corresponding type:
`flt?`, `int?`, `ht?`, `lst?`, `num?`, `str?`, `vec?`, `seq?`.

These functions return the argument parsed as the corresponding type if
possible; otherwise they return the optional second argument: `int!?`, `flt!?`,
`num!?`, `str!?`, `vec!?`, `seq!?`.

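
The "parse if possible, otherwise return the second argument" behaviour can be sketched for integers with standard CL alone. This illustrates the contract described above under the assumption that the input is an integer or a string; it is not the `lqn` definition of `int!?`, and `parse-int-or` is a hypothetical name.

```lisp
(defun parse-int-or (x &optional default)
  "Return X as an integer if it is one or can be parsed as one;
otherwise return DEFAULT."
  (cond ((integerp x) x)
        ((stringp x) (or (ignore-errors (parse-integer x)) default))
        (t default)))

(parse-int-or "42")      ; parses to an integer
(parse-int-or "4x" 0)    ; not parseable, so the default
(parse-int-or :nope -1)  ; wrong type, so the default
```
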
The following functions will coerce the argument, or fail if the coercion is
not supported: `str!`, `int!`, `flt!`, `lst!`, `sym!`.

`lqn` requires SBCL, and is pretty easy to install via `quicklisp`. SBCL is
available in most package managers, and you can get quicklisp at
https://www.quicklisp.org/beta/. Make sure `lqn` is available in your
`quicklisp` `local-projects` folder. Mine is at `~/quicklisp/local-projects/`.

Then create an alias for SBCL to execute shell wrappers, e.g.:

    alias jqn="sbcl --script ~/path/to/lqn/bin/jqn-sh.lisp"
    alias tqn="sbcl --script ~/path/to/lqn/bin/tqn-sh.lisp"
    alias lqn="sbcl --script ~/path/to/lqn/bin/lqn-sh.lisp"

Unfortunately this will tend to have a high startup time. To make it run faster
you can create an SBCL image/core that has `lqn` preloaded and dump it using
`sb-ext:save-lisp-and-die`. Then use the core in the alias instead of SBCL.
You can also preload your own libraries, which will then be available to `lqn`.
You can see an example bash script for making your own core here: `bin/core.sh`.