cK is an interpreter written in K which implements the concatenative program-construction semantics of Manfred von Thun's Joy programming language. cK implements all the K datatypes and primitives, as well as the principal Joy operators and combinators.
The intended use of cK is two-fold: to explore array programming in the framework of Joy, and to explore lambda-free programming in K.
In K, explicit loops are eliminated by drawing from a small fixed set of "adverbs": simple iteration patterns which hide the details of iteration, such as indexing with loop-counters and testing for end-of-list. The payoff comes in both performance (array operations can run at better-than-c speeds) and in code-size and -complexity.
In Joy, explicit recursions are eliminated by drawing from a small fixed set of "combinators": simple recursion patterns which hide the details of recursion. The payoffs are similar to those noted for K.
The purpose of this document is to provide a general overview of the primitives of K from the point of view of the Joy programmer.
The atomic datatypes of K are integer (1), float (2), character (3), symbol (4), dictionary (5), null (6), and lambda (7). In cK, integers, floats, and symbols are written as they are in K:
12 12.345 `foo
Characters use a single quote:
'X
Dictionaries are written using parentheses:
([`a 10][`b 20][`c 30])
Null is written using the letter 'N':
N
A vector is a list all of whose elements are atoms of the same type < 5. There are four vector datatypes: integer (-1), float (-2), character (-3), and symbol (-4). Each vector has an empty, represented by a letter:
I F C S
In K, a single character is written
"X"
but in cK this is a character vector of length 1.
A list which is not a vector has type 0. In cK, a list is written using square brackets:
[] [1] [1 2 3]
The datatypes of K and Joy overlap, but aren't completely equivalent. Integers, floats, and characters exist in both languages. K has symbols, Joy does not. K has lambdas, which of course do not exist in Joy. K also has a primitive dictionary datatype, which Joy lacks. On the other hand, Joy has a 'set' datype, which is absent from K. Both languages support lists. Joy distinguishes lists of character atoms and strings, whereas K identifies a string with a list of characters. Joy has a boolean datatype - true and false - while K uses 1 and 0 (or 0 and non-zero).
Dictionaries are implemented as atoms in K. This is paradoxical, although perhaps the best way to view things is that the K implementation of dictionaries is incomplete. A list is an aggregate whose parts are data. We index a list positionally, with a non-negative integer index. Lists are inherently ordered. A dictionary is an aggregate whose parts are variables. A variable is a triple, consisting of a symbolic name, a value, and a further dictionary of attributes. We index a dictionary by name, with a symbol which is a valid K name. Dictionaries are not inherently ordered (although in K, the order of variables in a dictionary is by creation time.)
In K, dictionaries are constructed from lists of symbol-value pairs or triples (the dictionary of attributes may be null.) The dictionary constructor is '.':
d:.((`a;10;.((`x;11);(`y;12)));(`b;20);(`c;30)) d`a 10
'.' is also the dictionary deconstructor:
m:. d m 0 (`a 10 .((`x;11;) (`y;12;)))
The attribute dictionary of a variable v in a dictionary can be indexed out by appending a dot to the variable name:
d`a. .((`x;11;) (`y;12;)) d[`a.;`x] 11
All the attribute dictionaries in a dictionary can be indexed out by indexing with the '.' function:
d[.] (.((`x;11;) (`y;12;)) )
In cK we have corresponding operations:
([`a 10 ([`x 11][`y 12])] [`b 20] [`c 30]) `d set; d `a @ 10 d .: `m set; m 0 [`a 10 ([`x 11 N] [`y 12 N])] d `a. @ ([`x 11 N] [`y 12 N]) d [.] . [([`x 11 N] [`y 12 N]) N N]
K recognizes three parts of speech: nouns, verbs, and adverbs.
A noun is any piece of data, including lambdas: 10, "abc", {x+y}, 4 5 6 are all nouns.
A verb is one of the 20 ascii symbols
~ ! @ # $ % ^ & * _ - + = | : , < . > ?
Grammatical context determines whether a verb is intransitive -- it takes a single argument -- or transitive -- it takes two arguments. An intransitive verb is written to the left of its argument:
%3
A transitive verb is written between its arguments:
2%3
An adverb is one of the six symbols
' / \ ': /: \:
An adverb is written to the right of its verb or noun operand:
a +' b f/a 10 -:' 3 8 2 19 5
K expressions parse from right to left, which means that they read from left to right, Using parentheses to denote the order in which subexpressions are evaluated:
2+-x*y%+/z
2+(-(x*(y%((+/)z))))
In cK, the dyadic verbs of K are written using the standard K glyphs: + - * &c. The monadic verbs are always written with a suffixed colon: +: -: *: &. Additionally, the commuted form of the dyad is written with a suffixed period: +. -. *. &c.
Functions denoted by the verbs of K are either atomic or list.
The atomic monads of cK are
~: not $: format %: reciprocal _: floor -: negate /: integer reciprocal (for compatibility with Joy)
An atomic monad penetrates its argument to the atoms and returns a result of the same shape and depth:
[[1 2 3] 4 [5]] -: [[-1 -2 -3] -4 [-5]]
In Joy, the behavior of the atomic monads can be simulated with the treemap combinator:
[[1 2 3] 4 [5]] [0 -] treemap [[-1 -2 -3] -4 [-5]]
The dyadic monads of cK are:
% divide ^ power & min, logical and * multiply - subtract + add = equal | max, logical or / integer divide (for compatibility with Joy) < less > more $ cast
An atomic dyad also penetrates its arguments to the atoms, but the rule here is slightly more complex than that for monads.
Consider x y d, where d is an atomic dyad. If x and y are both atoms, then the result z is an atom. Otherwise + recurses: [x y][d] each. The recursion fails with a "length error" if x and y are lists of different length.
Here are a few examples:
1 2 + 3 [1 2 3][4]+ [5 6 7] [1 2 3][4 5 6]+ [5 7 9] [[1 2 3][4 5 6]][10 20]+ [[11 12 13][24 25 26]]
Notice that in the last example + recurses on x and y to operate on [1 2 3] 10 and [4 5 6] 20.
[1 2 3][4 5]+ length error
The atomic dyads can be simulated with the treemap2 combinator:
[[2 3 4] 5] [[[6 7][8 9][10 11]] [[6 7 8]]] [+] treemap2 [[8 9] [11 12] [14 15]] [[11 12 13]]
There is exactly one left-atomic dyad:
! mod
x y ! penetrates x to the atoms, holding atomic y constant:
[0 1 2 3 4 5 6 7 8 9] 4 ! [0 1 2 3 0 1 2 3 0 1]
The list monads are:
!: enumerate @: atom #: count ^: shape &: where *: first +: flip =: group |: reverse ,: enlist <: up >: down ?: unique :: identity
Here are some examples:
3 !: \ enumerate [0 1 2] 0 !: I10 @: \ atomic? 1 [10 20] @: 0[1 2 3] #: \ count 3 [10] #: 1 10 #: 1 I #: 010 ^: \ shape I [1 2 3] ^: [3] [[1 2 3][4 5 6]] ^: [2 3] [[1 2 3][4 5]] ^: [2][0 1 1] &: \ where [1 2] [3 2 1 4] &: [0 0 0 1 1 2 3 3 3 3][10 20 30] *: \ first 10 10 *: 10[1 2 3] +: \ flip [1 2 3] [[1 2][3 4][5 6]] +: [[1 3 5][2 4 6]] [10 [1 2 3]] +: [[10 1][10 2][10 3]][10 20 10 20 20 30 10 20 10] =: \ group [[0 2 6 8][1 3 4 7][5]][10 20 30] |: \ reverse [30 20 10]10 ,: \ enlist [10] [10 20 30] ,: [[10 20 30]][10 20 10 30 30 10 20 30] <: \ up [0 2 5 1 6 3 4 7][10 20 10 30 30 10 20 30] >: \ down [3 4 7 1 6 0 2 5][10 20 30] :: \ identity [10 20 30]
The list dyads are:
~ match ! rotate # take _ cut, drop , join ? find : right
Here are some examples:
[1 2 3] [1 2 3] ~ \ match 1 [1 2]~[1 2 3] 01 [1 2 3] ! \ rotate [2 3 1] -2 [1 2 3 4] ! [3 4 1 2]2 [1 2 3] # \ take [1 2] -2 [1 2 3] # [2 3] 0 [1 2 3] # I2 [1 2 3] _ \ drop [2 3] [0 2 5] [1 2 3 4 5 6 7 8 9] _ \ cut [[1 2] [3 4 5] [6 7 8 9]][1 2 3] [4 5] , \ join [1 2 3 4 5] [] 1 , [1] [] [1] , [1][10 20 30 40 50] 20 ? \ find 1 [10 20 30 40 50] 3 ? 510 20 : \ right 20
Verbs take data as input and produce data. Adverbs take data and verbs as input and return verbs.
By means of valence- and datatype-overloading K is able to compress a variety of behaviors onto six adverb glyphs:
' each / over \ scan ': prior /: each right \: each left
In the examples which follow, the adverbs are used to derive verbs, which are then immediately applied to data:
+/1 2 3 6
But this may obscure the underlying semantics, in which the derived verb +/ is a first-class object, which can be assigned or otherwise manipulated as a noun:
f:+/ f 1 2 3 63#(+/) (+/;+/;+/)
Since cK, following Joy, represents programs as lists, we can follow suit:
[[+] iterate] `f def `f [1 2 3] f `f 6[[+] iterate] ,: 3 swap # [[[+] iterate] [[+] iterate] [[+] iterate]]
Observe the necessary enlistment, which turns the 2-list [[+] iterate] into a unit list.
The simplest of the iteration patterns is 'each', which takes a function f of valence n and derives a function (also of valence n) which, applied to n conformable lists, applies f to corresponding elements of each list. In K we might write:
1 2 3 +' 4 5 6 5 7 9
which of course, given the atomicity of +, is no different from 1 2 3 + 4 5 6. In cK we write this:
[[1 2 3][4 5 6]] [+] each [5 7 9]
An important difference between K and cK, which holds generally for the adverbs, is that the data operand is a single list of size n. In particular, for the case of unary f:
[[1 2 3]] [!:] each [[0] [0 1] [0 1 2]]
A common pattern: you have a list [.. a ..] and a list of programs [.. p ..] and you want to apply each p to the corresponding a, leaving .. r .. on the stack.
[1 2 3] [-: !: [!: 1 swap !]] [i] each
or use 'apply':
[1 2 3] [-: !: [!: 1 swap !]] apply -1 [0 1] [1 2 0]
If the program-list has size 1 it is treated as though it had size-a many elements.
A verb derived by means of the 'each' adverb performs parallel recursion on conformable data. Suppose you want to apply an operation to two pieces of data, recursing on just one and holding the other fixed. For example, you want to add the vector 10 20 30 to each vector in the list (1 2 3;4 5 6):
10 20 30 +/: (1 2 3;4 5 6) (11 22 33 14 25 36)
In cK we write:
[10 20 30] [[1 2 3][4 5 6]] [+] right [[11 22 33] [14 25 36]]
and correspondingly for the K adverb \:
[[1 2 3][4 5 6]] [10 20 30] [+] left [[11 22 33] [14 25 36]]
Note the use of partial evaluation and the 'map' combinator to simulate this behavior:
[10 20 30] + \ project the "add [10 20 30]" program [[10 20 30] +] [[1 2 3][4 5 6]] swap \ push list of vectors, swap [[1 2 3] [4 5 6]] [[10 20 30] +] map \ map the projection to each vector [[11 22 33] [14 25 36]]
The 'left' and 'right' combinators will "complete" their program from the stack:
10 [1 2 3] 4 [+*] left 10 [50 60 70]
If the stack is shorter than what the program implicitly expects, it will partially evaluate:
[1 2 3] 4 [+*] left [[5 *] [6 *] [7 *]]
Given a list, we want to apply a dyadic function to successive pairs:
-': 10 1 3 7 6 2 -9 2 4 -1 -4
That is,
1 - 10 -9 3 - 1 2 7 - 3 4 6 - 7 -1 2 - 6 -4
Note that the size of the result is one less than the size of the input. We have the option of supplying an initial value:
10 -': 1 3 7 6 2 10 2 4 -1 -4
In cK:
[10 1 3 7 6 2] [-] prior [-9 2 4 -1 -4][10 [1 3 7 6 2]] [-] prior [10 2 4 -1 -4]
In K, 'over' and 'scan' can take a variety of argument and operand types. The common idea is iteration of a function over data. We begin with an initial state, and then at each iteration-step, modify that state by applying the function to the state and further data. The 'scan' variant is the more fundamental of the two: it returns a list whose first element is the initial state, and whose subsequent elements are the intermediate states. "More fundamental" since 'over' returns only the last element of that list, the final state.
In cK, the different cases of 'over' and 'scan' are split out into five different keywords.
The underlying K pattern is:
fn/[a1;..;an]
where fn is a function of valence n > 1.
Let's examine the general case. Suppose fn is {x+y*z}, a function of valence 3. fn multiples y and z, and adds the result to x. In K parlance, we refer to x as the "state", and say that, in the 'over', we will repeatedly update the state by adding to it the result of multiplying y by z.
{x+y*z}/[10;1 2 3;4 5 6] 42
Let's step through the calculation:
On the first application, x = 10, y = 1, z = 4, so the result -- the new state -- is 14 = 10 + 1 * 4. On the second application, x = 14, y = 2, z = 5, so the result is 24 = 14 + 2 * 5. On the third and final application, the result is 24 + 3 * 6 = 42.
The intermediate steps can be retrieved by using 'scan' instead of 'over':
{x+y*z}\[10;1 2 3;4 5 6] 10 14 24 42
A few examples:
[1 2 3 4] [+] iterate 10
[10 [1 2 3 4]] [+] iterate 20
[10[1 2 3][4 5 6]] [*+] iterate 42
A pattern we replicate in cK is to denote the 'scan' version of the K adverb by capitalizing the first letter of its name:
[10[1 2 3][4 5 6]] [*+] Iterate [10 14 24 42]
The 'iterate' form of 'over' derives a looping pattern controlled by the structure of the data to be looped over. The verb operand must have valence greater than 1. If the verb operand has valence 1, then the looping pattern of the derived verb is controlled by a further parameter: a constant value (do), a boolean valued function (while), or a built-in condition on the result of applying the verb (converge).
5 {x+1}/10 15
In cK:
10 5 [1+] do 15
The 'scan' versions:
5 {x+1}\10 10 11 12 13 14 1510 5 [1+] Do [10 11 12 13 14 15]
{3>#x}{x,x+1}/10 / while 3 > size x append x+1 to x 10 11 11 12{3>#x}{x,x+1}\10 (10 10 11 10 11 11 12)
and in cK:
10 [#: 3 swap >] [dup 1 + ,] while \ #: <-> size, ',' is total concat [10 11 11 12]10 [#: 3 swap >] [dup 1 + ,] While [10] [[10 11]] [[10 11 11 12]]
The 'converge' adverb also takes a monadic function operand (or a vector - see below), but the looping pattern it derives is not controlled by an external parameter. Instead, the derived function is repeatedly applied to its argument until either of the following conditions is satisfied: the result matches the initial value, or it matches the previous result.
{_ x%10}\100000 100000 10000 1000 100 10 1 0
and in cK:
100000 [10 % _:] Converge [100000 10000 1000 100 10 1 0]
There is a special case of 'converge' in which its operand is a vector - a list of atoms all of which are of type 1, 2, 3, or 4. In this case, the argument to the derived function is always an integer, viz. an index into the vector. The behavior of the derived function is to repeatedly index the vector and use the result as a further index. Again, the loop terminates when the result either matches the value of the initial index or the previous result. Convergent "pointer-chasing" can be used to trace paths in a tree or DAG. Thus:
Suppose our tree is:
0 1 2 3 4 5 6 7 8 9 10 11
We can represent the structure of the tree as a vector of parent-indices:
v:0 0 1 1 0 4 5 5 4 8 8 8
I.e., the parent of 0 is 0, the parent of 1 is 0, the parent of 2 is 1, the parent of 3 is 1, &c. Now, to trace the path from a node n, e.g. 7:
v\7 7 5 4 0
and to trace all paths:
v\'!#v (,0 1 0 2 1 0 3 1 0 4 0 5 4 0 6 5 4 0 7 5 4 0 8 4 0 9 8 4 0 10 8 4 0 11 8 4 0)
In cK, first define v:
[0 0 1 1 0 4 5 5 4 8 8 8] `v set
Then:
7 v Converge [7 5 4 0]
We can use the 'left' combinator instead of 'each':
v dup #: !: swap [0 1 2 3 4 5 6 7 8 9 10 11] [0 0 1 1 0 4 5 5 4 8 8 8] [Converge] left [[0] [1 0] [2 1 0] [3 1 0] [4 0] [5 4 0] [6 5 4 0] [7 5 4 0] [8 4 0] [9 8 4 0] [10 8 4 0] [11 8 4 0]]
The 'iterate' combinator takes a function of valence n > 1 and applies it repeatedly to successive elements of n lists. 'converge' takes either a function of valence 1 or a vector and applies it to either an argument or an index, terminating when the result either matches the initial value or the previous result. In K, the single adverb 'over' distinguishes the cases and selects the appropriate behavior. There is one further case which K can distinguish, and which in cK has been split off into its own combinator: the object to apply is a matrix or nested list (i.e., not a vector), and the arguments are index-vectors. The behavior of the derived function is to index the list multi-dimensionally using successive elements of the index-vector. The main use of 'transit' is is in defining and using transition-matrix representations of finite state automata.
Example.
Consider the finite state machine B which computes the n-adic boolean or-function. B starts in state s. If it sees a 0 it goes to state f, and if it sees a 1 it goes to state t. If B is in state f and it sees a 0 it stays in f, else it goes to state t. If B is in state t it stays in t no matter what it sees.
To implement B, we construct a list of the three states, letting s be state 0, f be state 1, and t be state 2. In other words, the rows of B are the states, and the columns correspond to possible input values 0 and 1:
B:(1 2 1 2 2 2)0 B/0 1 0 1 2 0 B/0 0 0 10 B\0 0 0 0 0 1 1 1 1
And now in cK:
[[1 2][1 2][2 2]] `B set `B 0 [0 1 0 1] B transit 2 0 [0 0 0] B transit 1 0 [0 0 0 0] B Transit [0 1 1 1 1]
Transition-matrix iteration can also be used to perform lexical analysis of a string, i.e. tokenization.
Joy contains a pair of indexing primitives:
[1 2 3] 1 at 21 [1 2 3] of 2
'of' is the commute of 'at'.
K provides two indexing primitives. '@' takes an aggregate and a tree of indices and performs one-dimensional indexing. The result has the structure of the index tree. '.' takes an aggregate and a list of index-trees and performs multi-dimensional cross-product indexing. The axes of the aggregate correspond to the index trees in the list.
[10 20 30 40 50] [[0 1][0 2][3 2 1]] @ \ complex indexing structure [[10 20] [10 30] [40 30 20]]
[[10 20 30][40 50 60]][0 [2 1]] . \ row 0, cols 2 1 [30 20]
[[10 20 30][40 50 60]] [N [2 1]] . \ N means: all on an axis [[30 20] [60 50]][10 20 30] N @ \ all [10 20 30][10 20 30] [] . \ all [10 20 30]([`a 3 !:][`b 3 !: 10 +]) \ dictionary of vectors ([`a [0 1 2] N] [`b [10 11 12] N]) [`b [1 2]] . \ path to and into variable [11 12]
Indexing takes an aggregate and an index structure and returns some or all of the aggregate. Amend takes an aggregate, an index structure, and either a program or a program and further data, and returns the aggregate after evaluating the program (or the program and corresponding parts of the further data) to the aggregate at the positions specified by the index structure. Corresponding to the two indexing primitives '@' and '.' there are four cK operators:
amend3 A I [P] -> apply P to each of A I @, return A modified amend4 A I [P] B -> apply P to each of A I @ and parts of B, return A modified
dmend3 A I [P] -> apply P to each of A I ., return A modified dmend4 A I [P] B -> apply P to each of A I . and parts of B, return A modified
A few simple examples:
[10 20 30] 1 [1 +] amend3 \ increment element at 1 [10 21 30][10 20 30] 1 [+] 1 amend4 \ again, using amend4 [10 21 30][10 20 30] N [+] [1 2 3] amend4 \ increment corresponding elements [11 22 33][10 20 30] 1 [:] 99 amend4 \ replacement [10 99 30][0 2 5] N [!:] amend3 \ enumerate each element [I [0 1] [0 1 2 3 4]] N [|:] amend3 \ then reverse each of those [I [1 0] [4 3 2 1 0]]
A more interesting example is how to sort a matrix by its first row.
In K, we may do this explicitly, by first defining a matrix and then indexing it by the sort-order of its first row:
matrix:(2 1 3 3 4;10 20 30 40 50) matrix[;<matrix 0] (1 2 3 3 4 20 10 30 40 50)
The same procedure in cK:
[[2 1 3 3 4] [10 20 30 40 50]] `matrix set pop; matrix N matrix 0 @ <: ,: cons . [[1 2 3 3 4] [20 10 30 40 50]]
Alternatively, we may amend the matrix with the function @[;<matrix 0]:
@[matrix;_n;@[;<matrix 0]] (1 2 3 3 4 20 10 30 40 50)
The same procedure in cK:
matrix N matrix 0 at <: [@] cons amend3 [[1 2 3 3 4] [20 10 30 40 50]]
The interaction of indexing and data manipulation can yield some surprisingly powerful techniques. For a final example, let's revisit the algorithm for tracing paths through a tree.
Our tree, represented as a vector of parent-indices:
v:0 0 1 1 0 4 5 5 4 8 8 8
Compute the leaves: nodes which are not the parents of any nodes:
l:(!#v)_dvl v l 2 3 6 7 9 10 11
Now suppose that for each leaf we have some quantity:
q:@[&#v;l;:;(#l)_draw 100] q 0 0 77 35 0 0 49 87 0 82 79 8
Here we are amending a vector of zeros (&#v) at the leaf positions (l) using the replacement function (:) and a corresponding vector of random ints drawn from the range 0 to 99 ((#l)_draw 100).
We want to perform a "rollup" on those quantities: accumulate the partial sums at each non-leaf node, and the total at the root. So first we compute the paths from each node to the root
p:v\'!#v p (,0 1 0 2 1 0 3 1 0 4 0 5 4 0 6 5 4 0 7 5 4 0 8 4 0 9 8 4 0 10 8 4 0 11 8 4 0)
and then accumulate using amend:
t:@[&#q;p;+;q] t 417 112 77 35 305 136 49 87 169 82 79 8
Now let's replicate this in cK.
[0 0 1 1 0 4 5 5 4 8 8 8]`v set; v #: !: v dvl `l set; v #: &: l [:] l #: 100 draw amend4 `q set; v dup #: !: swap [Converge] left `p set; q #: &: p [+] q amend4 `t set; newstack t [417 112 77 35 305 136 49 87 169 82 79 8]
q == [[dup cons] dup cons]
is a quine*. Here's how it works:
q is a program, which in Joy is a list. The first element is the program [dup cons], the second is dup, the third is cons.
The program is pushed onto the stack, then executed with the i combinator. This combinator pops the top element -- the quine program -- and unquotes it, which causes the following series of actions:
- push [dup cons] onto the stack
- apply dup, leaving the stack with two elements [dup cons][dup cons]
- apply cons, leaving the stack with one element [[dup cons]dup cons], which is the original program.
We want a simple K version of this program.
Represent the stack as a list, where the first of the list is the top of the stack.
E is a function which takes a stack as its first argument, and something else as its second. If the something else is an atom, apply it to the stack, else push it onto the stack:
E:{:[@y;y x;(,y),x]}
Then we want three functions, each of which takes a stack and returns a stack:
d duplicates the stop of the stack:
d:{x[,0],x}
c "cons's" the top two elements of the stack:
c:{(x[,0],x 1),2_ x}
i evaluates the top of the stack on the rest of the stack:
i:{(1_ x)E/*x}
The Joy quine is the list
q:(`d`c;`d;`c)
which, when stacked (enlisted) and evaluated, returns itself:
i@,q (`d `c `d `c)
Now, by changing E/ to E\ in i, the evolution of the stack can be observed:
i:{(1_ x)E\*x}
i@,q (() / initial = rest ,`d `c / push `d`c (`d `c / dup `d `c) (`d `c / cons `d `c))
In fact, it's even simpler: we can dispense altogether with the i combinator:
()E/q (`d `c `d `c)
* This section antedates cK, and uses a different implementation of the stack. The Joy quine is due to Manfred von Thun.