Tutorial
Jelly is a tacit programming language. This means you define links (functions) by composing existing links into a chain, without explicitly talking about the arguments involved. Which way the arguments “flow” through this composition is defined by the pattern the links are arranged in. An example of this will be given soon, but first we’ll need to introduce some concepts.
The arity of a link is a crucial concept. All of the atoms – the built-ins, like +
and ½
– have fixed arities. Links are sorted into three categories, depending on their arity:
- Nilads take no arguments (arity 0); other than some I/O and stateful commands, they mostly represent constant values. For example, the literal 3 is a nilad.
- Monads take one argument (arity 1). (There’s no connection to functional programming monads here.) For example, H (halve) is a monad.
- Dyads take two arguments (arity 2): a left and a right argument. For example, + is a dyad.
(Using adjectives, we say that a link is niladic, monadic, or dyadic.)
So what’s the arity of the links we define when writing a program? By default, they are variadic – that is, it’s up to the caller to specify how many arguments to use, and in the case of the main link, it depends on how many arguments the program is passed.
As an example, +H
is a chain of +
(addition) and H
(halve). As the respective arities of the elements of this chain are 2 and 1, we call it a 2,1-chain. The interpreter has specific rules for breaking down chains, based on their arities: those rules dictate that, given a single argument n
, this new link computes n + n/2
. (You can read +H
as “... plus itself, halved”.) If two arguments were passed to the program instead, the link would be a dyadic chain, and it would compute (a + b) / 2
(where a is the first argument, and b the second).
Jelly programming, then, is essentially the art of learning these rules well, and composing clever chains that get the job done, tacitly.
Each line in a Jelly program is a link definition. Links are basically functions. The bottom line represents “main
”: it's the link that gets evaluated using the arguments passed on the command line.
All links but the last one, then, are function definitions: you can refer to them using quicks, which we’ll elaborate on later. For example, ç
is “the link above this one, as a binary operator (dyad)” (“above” really means “the line above”). Consider this example program, which, with two or more arguments, computes the square of the sum of its arguments (here, we have several arguments, so our main link is a dyadic chain):
+
ç²
This is sort of like the pseudocode:
define f:
    the built-in link +
define main:
    apply the dyad f
    square the result
Don’t worry about this too much, for now – just know that this is what Jelly programs’ structure is like.
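For readers who prefer a conventional language, the two-line program above corresponds roughly to the following Python sketch. The names `f` and `main` are the hypothetical ones from the pseudocode; Jelly itself does not name links.

```python
def f(left, right):
    # First line of the Jelly program: the built-in dyad +
    return left + right


def main(left, right):
    # Second line: ç applies the link above as a dyad, then ² squares the result
    return f(left, right) ** 2


print(main(2, 3))  # 25
```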
We need to describe, of course, what characters can be used in a chain definition, and what they will mean. Jelly uses its own 256-byte code page. Most of the characters in it are allocated for built-in atoms, but some of them have special semantics.
Jelly values are either numbers (complex
/float
/int
), lists, or characters; this section explains how to construct constant values (literals) of those types in your code. Literals are always niladic (in the sense described earlier).
Number literals take on various forms in Jelly. Here's an annotated EBNF-ish summary:
number  ::= real                              Simple real literals.
          | [real], "ı", [real];              (Real, default 0) + 1j * (real, default 1).
real    ::= decimal                           Simple decimal literals.
          | [decimal], "ȷ", [decimal];        (Decimal, default 1) * 10 ^ (decimal, default 3).
decimal ::= "0"                               Zero.
          | ["-"], [digits], ".", [digits]    (Integer, default 0) with fractional part (default "5").
          | ["-"], digits                     Integer.
          | "-";                              Minus one.
digits  ::= (? one or more digit 0-9 ?);
See also "code-page index lists" and "base-250 numbers" in the string literals section, below.
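The defaults in the grammar are easier to see in action. The following is a minimal Python sketch that follows the grammar as written above; it is not the interpreter's own parser, and the function names are made up for illustration.

```python
def parse_decimal(text):
    """decimal: a lone "-", an integer, or a number containing "."."""
    if text == "-":
        return -1  # a lone minus sign is minus one
    if "." in text:
        whole, fraction = text.split(".", 1)
        sign = -1 if whole.startswith("-") else 1
        digits = whole.lstrip("-") or "0"  # integer part defaults to 0
        return sign * float(digits + "." + (fraction or "5"))  # fraction defaults to 5
    return int(text)


def parse_real(text):
    """real: [decimal] ȷ [decimal], read as mantissa * 10 ^ exponent."""
    if "ȷ" not in text:
        return parse_decimal(text)
    mantissa, exponent = text.split("ȷ", 1)
    m = parse_decimal(mantissa) if mantissa else 1  # mantissa defaults to 1
    e = parse_decimal(exponent) if exponent else 3  # exponent defaults to 3
    return m * 10 ** e


def parse_number(text):
    """number: [real] ı [real], read as real part + 1j * imaginary part."""
    if "ı" not in text:
        return parse_real(text)
    real, imaginary = text.split("ı", 1)
    r = parse_real(real) if real else 0            # real part defaults to 0
    i = parse_real(imaginary) if imaginary else 1  # imaginary part defaults to 1
    return r + 1j * i
```

Under these rules, `parse_number("ȷ")` is 1000, `parse_number(".")` is 0.5 and `parse_number("ı")` is the imaginary unit.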
A character literal consists of ”
, followed by any character in the code page.
A string is a list of characters in Jelly. A string literal consists of “
, followed by a non-zero amount of non-«»‘’”
characters, followed by a terminator character in «»‘’”
. The terminators are interpreted as follows:
- ” terminates a regular string, or a list of strings. The portion of the literal between the delimiters is split over “. This yields a list of lists of characters. If there is only one list on the outer level (i.e. no extra “s), the result is flattened into a single string; otherwise, a list of strings is returned.
- » terminates a dictionary-compressed string. Jelly’s text codec is pretty complex. Take a look at this script, which compresses strings, and Jelly’s sss, which decompresses them.
- ‘ terminates a code-page index list. The result is a list of the (0-based) code page indices of each character. The last six characters in the code page, «»‘’“”, are not valid content, thus the range is [0,249]. As with other strings, these may be split using the “ character (e.g. “¡2“¢ż‘ evaluates to [[0, 50], [1, 249]]).
- ’ terminates a base-250 number. The result is the evaluation of the characters as digits of a base-250 number, using the (1-based) code page indices without the last six characters; as such, ¡ represents 1, ẏ represents 249 and ż represents 250. As an example, “©ṭBF’ represents the digits [7, 225, 67, 71], which evaluates to 123454321. Once again, “ may be used to split these into a list.
- « is currently unassigned.
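The arithmetic behind the base-250 example can be checked with a few lines of Python. This sketch takes the digit list directly (the 1-based code page indices quoted above) rather than looking characters up in the code page; the function name is made up for illustration.

```python
def from_base_250(digits):
    """Evaluate a list of digits (most significant first) in base 250."""
    value = 0
    for digit in digits:
        value = value * 250 + digit
    return value


# The digits of “©ṭBF’ as given in the text
print(from_base_250([7, 225, 67, 71]))  # 123454321
```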
As a special shortcut, a two-character string literal consists of ⁾
, followed by any two characters in the code page. This means you can write e.g. “[]”
as ⁾[]
.
Another shortcut, ⁽
is also available to cut a byte or two from some relatively small integers.
The two bytes following ⁽
are evaluated as a base-250 number using the 1-based code-page indexes as digits and then mapped onto a split range: [1001=⁽¡¡, 1002=⁽¡¢, ..., 32249=⁽|ẏ, 32250=⁽|ż] + [-31349=⁽}¡, -31348=⁽}¢, ..., -101=⁽żẏ, -100=⁽żż]
Here is a program to translate to these literals, although do note that a few numbers may be made in a shorter way, depending on their surroundings (for example 2000=⁽¤ż
, could be 2ȷ
; and -255=⁽ż^
could be ⁹C
).
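The split range can be reproduced from the two digits alone. In the sketch below, the offsets 750 and 62850 and the threshold 31500 are inferred from the endpoints listed above (for instance 32250 - 750 = 31500 and -31349 + 62850 = 31501); they are assumptions drawn from this text, not quoted from the interpreter. The function name is hypothetical.

```python
def short_integer(first, second):
    """Map the two 1-based code page indices following ⁽ to an integer."""
    value = first * 250 + second  # plain base-250 value, from 251 to 62750
    if value <= 31500:
        return value + 750        # positive range: 1001 .. 32250
    return value - 62850          # negative range: -31349 .. -100


print(short_integer(1, 1))      # 1001, i.e. ⁽¡¡
print(short_integer(250, 250))  # -100, i.e. ⁽żż
```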
A comma-separated list of literals forms a list literal. For example, ”a,”b
and “ab”
are identical. Square brackets ([]
) may be used to form lists of lists, such as [12,”a],4ȷ
.
(Note that ,
is also the pair dyad: x,y
is the list of two elements [x, y]
. The programs 3¶1,2,¢
and 1,2,3
print different results — this doesn’t come up often, but it may be surprising. (¢
is “the previous definition, as a nilad”.))
How does Jelly evaluate a chain? As explained before, there are three cases to consider: whether this chain was called niladically, monadically, or dyadically.
Niladic chains are the easiest of the bunch. To evaluate a niladic chain that starts with a nilad, like α F G H
, evaluate the monadic chain F G H
at that nilad α
. (Caveats: if the whole chain is empty, 0 is returned instead. If α
isn’t a nilad, evaluate the entire chain monadically at 0
instead.)
For example, 4H
is just H
evaluated at 4
, which is 2
.
Monadic chains are broken down from left to right, until there are no links left to consider. Also, we’re passed some argument ω
here. There are two questions to answer: what value do we start from, and how do we walk down the chain? The starting value is chosen as follows:
- If our chain starts with a nilad α, and is followed by zero or more monads (like H), dyad–nilad pairs (like +2), and nilad–dyad pairs (like 4*): we start by evaluating α, and then consider the rest of the chain.
- Otherwise, we start from the argument passed to this chain, ω, and consider the entire chain.
Let’s call v
the current value – initially, it’s the value described above, but it gets updated as we go through the chain – and denote
- nilads using digits,
- monads using uppercase letters,
- dyads using operator symbols +, ×, ÷.
Then the following patterns are matched against, from top to bottom:
Chain pattern | New v value |
---|---|
+ F ... | v+F(ω) |
+ 1 ... | v+1 |
1 + ... | 1+v |
+ ... | v+ω |
F ... | F(v) |
Let’s try this out on the chain +²×.
- + isn’t a nilad, so we start out at v = ω.
- Then, we chop off +², matching the first pattern, and get v = ω+ω².
- Then, we chop off ×, matching the fourth pattern, and get v = (ω+ω²)×ω.
- The chain is now empty, so (ω+ω²)×ω is our final result.
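The monadic rules can be sketched as a small Python evaluator. This is a model of the rules as stated in this tutorial, not the interpreter's code: links are represented as hypothetical `(arity, function)` pairs, the starting-value check is written as a regular expression over the arities (a nilad followed by monads, dyad–nilad pairs and nilad–dyad pairs), and a nilad that fits none of the five patterns is simply rejected.

```python
import re


def nilad(value):
    return (0, lambda: value)


def monad(fn):
    return (1, fn)


def dyad(fn):
    return (2, fn)


def arities(chain):
    return "".join(str(arity) for arity, _ in chain)


def starts_with_constant(chain):
    # A nilad followed only by monads (1), nilad-dyad (02) and dyad-nilad (20) pairs
    return re.fullmatch(r"0(1|02|20)*", arities(chain)) is not None


def eval_monadic(chain, omega):
    chain = list(chain)
    v = omega
    if starts_with_constant(chain):
        v = chain[0][1]()            # start from the leading nilad instead of ω
        chain = chain[1:]
    while chain:
        head = arities(chain[:2])
        if head == "21":             # + F ...  ->  v + F(ω)
            v = chain[0][1](v, chain[1][1](omega))
            chain = chain[2:]
        elif head == "20":           # + 1 ...  ->  v + 1
            v = chain[0][1](v, chain[1][1]())
            chain = chain[2:]
        elif head == "02":           # 1 + ...  ->  1 + v
            v = chain[1][1](chain[0][1](), v)
            chain = chain[2:]
        elif head[0] == "2":         # + ...    ->  v + ω
            v = chain[0][1](v, omega)
            chain = chain[1:]
        elif head[0] == "1":         # F ...    ->  F(v)
            v = chain[0][1](v)
            chain = chain[1:]
        else:
            raise ValueError("nilad not covered by the pattern table")
    return v


ADD = dyad(lambda a, b: a + b)
MUL = dyad(lambda a, b: a * b)
SQUARE = monad(lambda a: a * a)
HALVE = monad(lambda a: a / 2)

print(eval_monadic([ADD, SQUARE, MUL], 3))  # (3 + 3²) × 3 = 36
```

Running the same function on `[ADD, HALVE]` reproduces the n + n/2 reading of +H from the introduction.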
For certain chain-parsing patterns to match, part of the chain has to consist of a nilad, followed by any number of monads and dyad–nilad/nilad–dyad pairs. In other words, the arities of the chain segment have to match the regex ^0(1|02|20)*$. Such a chain is a leading constant chain (LCC): it starts with a nilad that can't possibly be part of a dyad–nilad pair, and, therefore, must imply a new constant value. For example: 5 is an LCC, but 5+ isn’t.
They will come up in a few more places, so keep them in mind. Moving on!
Dyadic chains are basically like monadic chains, but this time, there are two arguments, λ (left) and ρ (right). The starting value is chosen as follows:
- If the chain starts with three dyads like + × ÷, we start at v = λ+ρ, and consider the chain × ÷ ... next.
- Otherwise, if the chain is an LCC starting with a constant κ, we start from v = κ, and consider the rest of the chain.
- Otherwise, we start from v = λ, and consider the entire chain.
This time, the patterns are
Chain pattern | New v value |
---|---|
+ × 1 ... | (v+ρ)×1 * |
+ × ... | v+(λ×ρ) |
+ 1 ... | v+1 |
1 + ... | 1+v |
+ ... | v+ρ |
F ... | F(v) |

(* Only if 1 ... is an LCC; i.e., 1 is a constant.)
Let’s try this out on the chain +×÷H.
- The chain starts with three dyads, so we start at v = λ+ρ, and throw away the +.
- Then, we chop off ×÷, matching the second pattern, and get v = (λ+ρ)×(λ÷ρ).
- Then, we chop off H, matching the sixth pattern, and get v = (λ+ρ)×(λ÷ρ)÷2.
- The chain is now empty, so we’re done.
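The dyadic rules can be modelled the same way. The sketch below again follows the tutorial's wording rather than the interpreter's source: links are hypothetical `(arity, function)` pairs, and the LCC test is the regular expression over arities described earlier.

```python
import re


def nilad(value):
    return (0, lambda: value)


def monad(fn):
    return (1, fn)


def dyad(fn):
    return (2, fn)


def arities(chain):
    return "".join(str(arity) for arity, _ in chain)


def is_lcc(chain):
    return re.fullmatch(r"0(1|02|20)*", arities(chain)) is not None


def eval_dyadic(chain, lam, rho):
    chain = list(chain)
    if arities(chain[:3]) == "222":      # three leading dyads: start at λ+ρ
        v = chain[0][1](lam, rho)
        chain = chain[1:]
    elif is_lcc(chain):                  # leading constant κ
        v = chain[0][1]()
        chain = chain[1:]
    else:
        v = lam
    while chain:
        head = arities(chain[:2])
        if arities(chain[:3]) == "220" and is_lcc(chain[2:]):
            # + × 1 ...  ->  (v+ρ)×1
            v = chain[1][1](chain[0][1](v, rho), chain[2][1]())
            chain = chain[3:]
        elif head == "22":               # + × ...  ->  v+(λ×ρ)
            v = chain[0][1](v, chain[1][1](lam, rho))
            chain = chain[2:]
        elif head == "20":               # + 1 ...  ->  v+1
            v = chain[0][1](v, chain[1][1]())
            chain = chain[2:]
        elif head == "02":               # 1 + ...  ->  1+v
            v = chain[1][1](chain[0][1](), v)
            chain = chain[2:]
        elif head[0] == "2":             # + ...    ->  v+ρ
            v = chain[0][1](v, rho)
            chain = chain[1:]
        elif head[0] == "1":             # F ...    ->  F(v)
            v = chain[0][1](v)
            chain = chain[1:]
        else:
            raise ValueError("nilad not covered by the pattern table")
    return v


ADD = dyad(lambda a, b: a + b)
MUL = dyad(lambda a, b: a * b)
DIV = dyad(lambda a, b: a / b)
HALVE = monad(lambda a: a / 2)

print(eval_dyadic([ADD, MUL, DIV, HALVE], 6, 3))  # (6+3)×(6÷3)÷2 = 9.0
```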
Remember when I wrote that you define a link by making a chain of other links? I wasn’t telling the whole truth: in reality, it’s a two-layer process. A link is a chain of chains, and by default, the outer chain simply has unit length.
Consider this program:
C+H
That’s complement plus half. It takes an input value n
and calculates (1-n)+(n/2)
. Not too exciting, I know. But the structure is really like this:
The link we wrote is, itself, actually a chain containing a single chain.
Suppose that we want to calculate (1-n)+(1-n)(n/2)
instead. The dyadic chain +×
should help: by the chaining rules, it calculates λ+(λ×ρ)
, which looks a lot like what we need. However, simply replacing +
by +×
in our program won’t do: C+×H
is a 1,2,2,1-chain – complement, then add (the argument), then multiply by its half – computing ((1-n)+n)×(n/2)
.
We want Jelly to treat +×
as a unit, and make a 1,2,1-chain of the sub-chains C
, +×
, and H
. Multi-chain links let us do just that! To construct them, we use the chain separators øµðɓ
: in the image above, they would introduce a new blue rectangle, of arity 0, 1, 2 and 2*, respectively. In our case, we can group the chains the way we want by writing Cð+×µH
:
There’s no way to nest these things even further. You’ll have to define multiple links, instead.
* ɓ
is a version of ð
which also swaps the left and right arguments.
Quite often, your program will be a µ
-separated list of monadic chains. It’s useful (to the author, at least) to think of such a program as a pipeline that “refocuses” or replaces its input argument at every µ
, and uses this new argument tacitly within each sub-chain. To restore the very first input, you can use the built-in nilad ³
.
TODO: example.
Jelly link definitions are composed, at the lowest level, of tokens, and the Jelly parser pushes them to a list of chains when parsing your program. Certain tokens, however, have a very special meaning: they’re commands to the parser itself, either pushing a special value to the current chain, or operating on the links most recently pushed to the chain being parsed – somewhat like parse-time postfix operators.
As mentioned earlier, ç
is a (zero-argument) quick; it refers to the previous link definition as a dyad. It’s part of a whole family of quicks that refer to links:
Symbol | Definition |
---|---|
ß | The current link. |
¢ | The previous link (wrapping from the top link to the main link), as a nilad. |
Ç | The previous link, as a monad. |
ç | The previous link, as a dyad. |
Ñ | The next link (wrapping from the main link back to the top link), as a monad. |
ñ | The next link, as a dyad. |
Another family of quicks is ¤$¥
. Each of these three looks backwards through the list of recent links, and slices a sublist off as soon as it matches a certain pattern; then, it replaces that list with a single compound link. This table summarizes their behavior:
Symbol | Pattern | Result |
---|---|---|
¤ | A nilad followed by one or more other links. | The matched list, as a niladic chain. |
$ | At least two links, not forming an LCC. | The matched list, as a monadic chain. |
¥ | At least two links, not forming an LCC. | The matched list, as a dyadic chain. |
These serve more or less the same purpose as øµð
, but they work “locally”: they operate on the links immediately before them, instead of splitting up the entire link definition. They’re often useful if you just need to quickly combine two atoms into something that acts like a single atom. In fact, we could’ve written our Cð+×µH
example one byte shorter: C+×¥H
achieves the same thing, with ¥
wrapping +×
, and the resulting compound (+×¥
) functioning as a single dyad.
Jelly has a register! Yes, just one. The quick ©
will pop a link and replace it with one that is functionally identical, but saves its result in the register. Its current value can be retrieved using the built-in nilad ®
.
By default, many operations vectorize over lists: 4R
is the range [1, 2, 3, 4]
, and 4R²
squares each element in that range. You can modify a link to operate on just one index, or a list of indices, using the quick ¦
. It pops an index (list), and then a link, and pushes back one that works only where you want it to: ²1¦
squares the first element of its argument.
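As a rough Python analogue of ¦, the sketch below applies a function only at selected positions. It assumes 1-based indices, which is what the ²1¦ example suggests; how the real quick treats out-of-range or negative indices is not covered here, and the function name is made up.

```python
def apply_at(link, indices, values):
    """Apply `link` to the elements of `values` at the given 1-based indices."""
    if isinstance(indices, int):
        indices = [indices]  # a single index behaves like a one-element list
    wanted = set(indices)
    return [
        link(value) if position in wanted else value
        for position, value in enumerate(values, start=1)
    ]


def square(x):
    return x * x


print(apply_at(square, 1, [3, 4, 5]))  # [9, 4, 5]
```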
To apply a link multiple times, you can use ¡
. For example, H8¡
is the link “halve eight times”. ÆẠȷ¡
computes the Dottie number.
If the repeated link is a dyad, matters are more complicated:
- If a dyad is repeated 0 times, then the left argument is returned. For example, 4+0¡5 would return 4.
- If a dyad is repeated 1 time, it's the same as just calling the dyad itself. For example, 7×1¡8 would return 56.
- If a dyad is repeated n ≥ 2 times, the result is the dyad called with itself repeated n – 1 times as the left argument, and itself repeated n – 2 times as the right argument. For example, 6+2¡7 would return 19: one repetition gives 6+7 = 13, zero repetitions give 6, and 13+6 = 19.
С
is like ¡
, but returns initial and intermediate results: 8H3С
returns [8,4,2,1]
, all steps in the process of halving 8
thrice.
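The repetition rules for ¡ and С can be written out as two short Python functions. This is a sketch of the behavior described above, with made-up function names; the dyadic case keeps track of the two most recent results, which is what the "n – 1 times / n – 2 times" rule amounts to.

```python
from operator import add, mul


def repeat_monad(link, times, arg, cumulative=False):
    """Apply a monad `times` times; optionally keep every intermediate result."""
    steps = [arg]
    for _ in range(times):
        steps.append(link(steps[-1]))
    return steps if cumulative else steps[-1]


def repeat_dyad(link, times, left, right):
    """Apply a dyad `times` times, feeding back the two most recent results."""
    current, previous = left, right
    for _ in range(times):
        current, previous = link(current, previous), current
    return current


print(repeat_dyad(add, 0, 4, 5))  # 4
print(repeat_dyad(mul, 1, 7, 8))  # 56
print(repeat_dyad(add, 2, 6, 7))  # 19
print(repeat_monad(lambda x: x / 2, 3, 8, cumulative=True))  # [8, 4.0, 2.0, 1.0]
```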
A reduce accumulates some value out of an input list by applying some dyad to all of the values in it, from left to right. The quick /
pops a dyad, and pushes a monad that reduces over lists with the popped dyad. That is, if ç
is some dyad, then ç/
is a monad, and 1,2,3,4,5 ç/
is the same as 1ç2ç3ç4ç5
.
For example, »
is a dyad that returns the largest of its two arguments, and »/
is a monad that returns the largest element of an input list.
There are some useful variants of this operation:
- \ is similar, but scans instead. This means ç\ returns the result of applying ç/ to each non-empty prefix of the argument. For example, 4R+\ generates the first four triangular numbers ([1, 1+2, 1+2+3, 1+2+3+4]).
- ç3/ repeatedly breaks off chunks of length min(3, length(argument)) and applies ç/ to each of them. For example, 8R×3/ is [1×2×3, 4×5×6, 7×8] = [6, 120, 56]. You can replace 3 (the chunk size) with any nilad.
- ç3\ does something similar for all overlapping infixes of length 3: 6R×3\ is [1×2×3, 2×3×4, 3×4×5, 4×5×6] = [6, 24, 60, 120]. Again, you can replace 3 (the infix size) with any nilad.
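The four reduce variants map closely onto Python's standard library. The sketch below uses `functools.reduce` and `itertools.accumulate` for / and \, and two small helpers (hypothetical names) for the chunked and infix forms.

```python
from functools import reduce
from itertools import accumulate
from operator import add, mul


def reduce_chunks(dyad, values, size):
    """Like ç3/: reduce each consecutive chunk of at most `size` elements."""
    return [reduce(dyad, values[i:i + size]) for i in range(0, len(values), size)]


def reduce_infixes(dyad, values, size):
    """Like ç3\\: reduce every overlapping infix of exactly `size` elements."""
    return [reduce(dyad, values[i:i + size]) for i in range(len(values) - size + 1)]


print(reduce(max, [3, 1, 4, 1, 5]))                # 5, like »/
print(list(accumulate([1, 2, 3, 4], add)))         # [1, 3, 6, 10], like 4R+\
print(reduce_chunks(mul, list(range(1, 9)), 3))    # [6, 120, 56], like 8R×3/
print(reduce_infixes(mul, list(range(1, 7)), 3))   # [6, 24, 60, 120], like 6R×3\
```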
The quicks €
and Ѐ
map the preceding link over its left or right argument, respectively. This is useful for lists; instead of applying the dyad to the list itself, it will apply the dyad to each element of the list individually (against the other argument) and return a list of all the results. Essentially, you can use this quick to force vectorization on a link that does not normally vectorize. Here are some useful examples.
[1,2,3];[4,5,6]
This is concatenation, which concatenates the given lists and gives [1, 2, 3, 4, 5, 6]
.
[1,2,3];€[4,5,6]
The €
quick here causes the concatenation to map over the left argument - meaning it applies the concatenation to each element in the left argument against the given right argument. It gives [[1, 4, 5, 6], [2, 4, 5, 6], [3, 4, 5, 6]]
.
1+[1,2,3,4,5]
This is the normal addition which vectorizes. It has left depth and right depth 0, meaning that it acts on atoms of both arguments. Here, its left argument has depth 0 while its right argument has depth 1, so it is called on the left argument and each element of the right argument. In symbols, it becomes [1+1, 1+2, 1+3, 1+4, 1+5]
which evaluates to [2, 3, 4, 5, 6]
.
1+€[1,2,3,4,5]
The € makes the preceding dyad act on every element of the left argument. However, the left argument here is not iterable, so it is first converted to a range: 1 becomes [1]. (If it were 2, it would have become [1, 2].) So, we now need to evaluate [1] +€ [1, 2, 3, 4, 5]. Since € makes the dyad act on each element of the left argument independently, this becomes [1 + [1, 2, 3, 4, 5]]. By the same reasoning as in the previous paragraph, we get [[2, 3, 4, 5, 6]]. As a singleton list is displayed as just its element, the output looks one-dimensional. Appending ŒṘ to the program makes the nesting visible.
1+Ѐ[1,2,3,4,5]
The right argument is already iterable, so no conversion is needed. Then, the dyad is applied independently to each element of the right argument, giving [1+1, 1+2, 1+3, 1+4, 1+5], which evaluates to [2, 3, 4, 5, 6]; this coincides with the result of the first case.
For an example where all three are different, consider [1,3,5,7,9]+Ѐ[1,2,3,4,5]
vs. [1,3,5,7,9]+€[1,2,3,4,5]
vs. [1,3,5,7,9]+[1,2,3,4,5]
.
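To see how the three forms differ, here is a Python sketch of the mapping behavior described in this section. The helper names are made up, non-list arguments are turned into ranges as explained above, and the vectorizing addition only handles lists of equal length (it uses `zip`, which is a simplification).

```python
def vectorized_add(left, right):
    """Addition that recurses into lists until it reaches plain numbers."""
    if isinstance(left, list) and isinstance(right, list):
        return [vectorized_add(a, b) for a, b in zip(left, right)]
    if isinstance(left, list):
        return [vectorized_add(a, right) for a in left]
    if isinstance(right, list):
        return [vectorized_add(left, b) for b in right]
    return left + right


def concatenate(left, right):
    """Like ; for this sketch: wrap non-lists, then join."""
    wrap = lambda x: x if isinstance(x, list) else [x]
    return wrap(left) + wrap(right)


def to_iterable(value):
    # A number n is treated as the range [1, ..., n]
    return value if isinstance(value, list) else list(range(1, value + 1))


def each_left(dyad, left, right):
    """Like €: call the dyad once per element of the left argument."""
    return [dyad(item, right) for item in to_iterable(left)]


def each_right(dyad, left, right):
    """Like Ѐ: call the dyad once per element of the right argument."""
    return [dyad(left, item) for item in to_iterable(right)]


odds, nums = [1, 3, 5, 7, 9], [1, 2, 3, 4, 5]
print(vectorized_add(odds, nums))                # [2, 5, 8, 11, 14]
print(each_left(vectorized_add, odds, nums)[0])  # [2, 3, 4, 5, 6]
print(each_right(vectorized_add, odds, nums)[0]) # [2, 4, 6, 8, 10]
```

Plain addition pairs the elements up, € produces one row per left element, and Ѐ produces one row per right element, so the three results have different shapes or contents.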