A spreadsheet is a visual matrix of functional dependencies. A cell may contain data or a function definition. S, S+, and S- are K implementations of essential, extended, and minimal spreadsheet functionality.
By default, the S spreadsheet has 100 rows and 26 columns, labelled a through z:
A spreadsheet instance can be dynamically configured to have n rows and m columns, labelled C[0] through C[m-1], by assigning the variables N and C in the instance directory:
.S.N:10 .S.C:`first`second`third
We want b0 to be a0+1 and c0 to be a0+b0. So, into a0, enter 12; into b0, enter =a0+1; into c0, enter =a0+b0:
The foreground of a0 is black. b0 and c0 are blue, to indicate calculated values.
A spreadsheet keeps track of which quantities need to recalculate themselves when updates occur. For example, if we update a0, we should see b0 and c0 immediately recalculate:
If we now type 13 into b0 (a formula cell), we see that c0 recalculates, but b0 presents itself in red, to indicate that its value may be inconsistent, and needs to be recalculated:
For example, if we now type 14 into a0, b0 and c0 recalculate:
To clear a cell, type just the equality symbol. For example, type = into c0 to expunge the formula a0+b0. This severs the dependence of c0 on a0 and b0:
Vector expressions can be used in formulas. For example, enter a series of values into the a column, and then, in a2, enter the formula =+/a:
The formula is interpreted to mean: a0+a1+a3+...+a99. Observe that a2 knows to exclude itself from the set of summand cells.
Each spreadsheet cell knows how to evaluate K. For example, we can directly enter a K expression into a cell and have it execute, leaving behind its result as the cell value:
In our basic spreadsheet model, a label is a string:
Labels are "inert". They are rendered using a gray background.
A formula loop is trapped and signalled. For example, if we attempt to define a0 as b0-1, a "cycle!" error will be signalled to the spreadsheet:
Press the escape key to restore the previous value and clear the error.
A formula may consist of any valid K expression, including arbitrary functions and variables. Names beginning with a dot are evaluated over the K tree.
For example,
.my.fun:{x+y*z}
Now, in cell b4, enter:
=.my.fun[a0;a1;a2]
which produces:
A spreadsheet is an instance of the variable .S containing a table T with columns C and rows !N. T[c;r] is the value of cell cr, for c in C and r in !N. The value of the empty cell is 0.
A cell can be associated with a formula which computes its value. The formula is stored apart from the value, in the formula table X. The executable representation of the formula is stored in Y. Two tables Z and ZZ contain lists of cells derived from the formulas in X. Thus, Z[c;r] is a list of cells used by X[c;r]; that is, the cells on which T[c;r] directly depends. ZZ[c;r] contains a list of cells which directly use T[c;r]; that is, the cells whose formulas contain occurrences of cr.
X, Y, Z and ZZ are modified when a formula is entered into a cell. The formula is parsed, and information derived during parsing is stored in Y, Z, and ZZ. The formula itself is stored in X.
When an update is applied to cell cr, the spreadsheet is selectively invalidated. Selective invalidation marks only those cells which depend on the updated cell. ZZ[c;r] is processed recursively until cells are reached on which no cells depend. After all cells which depend on cr have been marked invalid, all marked cells are recalculated and restored to the valid state.
cr is invalidated by appending (c;r) to V, the list of invalid cells. cr is validated in two steps: first, replace T[c;r] with the result of executing Y[c;r], and then remove (c;r) from V.
When T[c;r] is cleared (by entering a naked equality symbol), the corresponding cells of the formula and dependency tables X, Y, and Z are cleared, and (c;r) is removed from each of the lists Z ./:Z[c;r]. That is, if T[c;r] depends on ds, then (c;r) is a member of ZZ[d;s]. In order to clear T[c;r], (c;r) must be removed from ZZ[d;s], the list of cells which depend on ds.
T[c;r] is displayed in one of three colors: black, if it is not a formula-cell; blue, if it is a formula-cell and is consistent with the cells on which it depends; and red, if it is a formula-cell and is not known to be consistent.
When T[c;r] is updated from the screen, the input string either begins with = or it does not. If the string begins with =, then either a formula is being defined or a cell is being cleared; in each case, W is set to (), the empty list. If the string does not begin with =, then a value is being entered, and W is set to (c;r). Thus, W records whether the last update was a formula definition (or clear) or a value update. If cr, the cell to be displayed, has no formula, then W is irrelevant: draw cr black. If cr has a formula, then draw it red if W~(c;r), else draw it blue.
The processing cycle:
Suppose there is screen input to T[c;r]. K calls the S parser T.c..g on the input value, which is always a string. The result of T.c..g is then assigned to T[c;r]. The S trigger T..t fires. Invalidation occurs, followed by validation. When T..t is finished executing, every cell in T is valid. Those cells in T which have changed are re-drawn to the screen. Each cell-draw involves a pair of K function-calls: one to T..f, to format the cell-value as a string, and one to T..fg, to draw the cell-foreground in the appropriate color.
S consists of two parts: .s, the directory of functions and data shared by all spreadsheet instances, and .S, the prototype spreadsheet instance, or "template". If a single, stand-alone instance of S is required, then .S itself will do. If many spreadsheets are needed, or the spreadsheet variable is to be part of some larger screen structure, then copies of .S can be made using K assignment:
new:.S new.:.S.
.S.T is a K table: a directory of equal-length lists. K programs can update spreadsheet cells by assigning values to variables in T. In this, the basic model, assignment is limited to cellular, that is atomic update. This restriction is lifted in S+ by extending .S.T..t, the trigger on T.
The heart of parsing in S consists of translating expressions over cell-names -- "formulas" -- into executable K. The rules are simple:
cr -> .s.get[_d;(c;r)] c -> if ~c=*K .s.get[_d;c] else (.s.get[_d]'(..;(c;i);..)) where ~i=*|K
The complicated business in the second rule ensures that a column can contain its own aggregation. For example, if T[`b;5] is defined to be +/a, then +/ is translated by the if-clause of the second rule, since *K is `b, not `a. But if T[`a;5] is defined to be +/a, then *K is `a, and the else clause of the rule is used, since we want to exclude T[`a;5] from the sum over column a.
Unless otherwise noted, x is the path to the spreadsheet instance, and y is a cell-index. For example, functions called under an update .S..T[`xyz;17] will see `.S for x, and (`xyz;17) for y. In what follows, I will often refer to T[`xyz;17] as y.
\d .s Function context: there is only ever one copy of this A:(_ci 97+!26;_ci 48+!10) *A defines the alphabet over which column names can be defined *|A is a digit list set:{:[eq1[x;y;z," "];def[x;y;1_ z];_n~r:. z;0;r]} z: string to validate/parse if z is a formula, define y else if r, the execution of z, is nil, return 0 else return r eq1:{.[x;`W;:;:[r:"="=*z;();y]];r} set W to () if z is a formula, else to y; return 1 if z is a formula, else 0 def:{:[""~z;clr[x;y;x .`Z,y];prs[x;y;z]];.[x;`V;,;,y];0} z: string to define if z is empty, clear y, else parse z; invalidate y; return 0 Observe that the value of an empty cell is 0. This allows us to define number-valued formulas over cells without values. Also note that formula-cells are empty until they have been validated. clr:{(`X`Y`Z,\:y).[x;;:;]'(;;());{.[x;`ZZ,z;_dv;y]}[;y]/[x;z]} z: cells on which y depends empty X, Y, and Z at positions corresponding to y; for each cell z[i] on which y depends, remove y from ZZ of z[i] prs:{(`X`Y,\:y).[x;;:;]'(z;,/spl[x;y]'(&1>':z _lin"._`\"",*A)_ z)} z: string to parse save z in X; save the raze of the split of each col-partition of z in Y A col-partitioning of a string cuts the string at each initial col-name character. A col-name character is in the alphabet *A, extended by "._`" and the quotation mark "\"". For example, the col-name partitions of "abc10+.myfoo `xyz" are ("abc10+";".myfoo ";"`xyz"). Calculating which partitions begins with a cell-name is deletgated to spl. spl:{:[~(*z)_in*A;z;,/@[(0,(z _lin,/A)?0)_ z;0;nam[x;y]]]} z: col partition if z does not begin with a character in *A, return z else cut z into cell name and a remainder; apply nam[x;y] to the cell name, raze and return the result nam:{:[""~*|r:(0,(z _lin*|A)?1)_ z;vec[x;y;*r];sca[x;y;r]]} z: cell name if the row-part of r is empty then generate the expression for getting all of the column else generate the expression for getting a cell sca:{".s.get[_d;",(5:dep[x;y;z]),"]"} z: ("col";"row") return a string which, when executed, returns the value of T[col;row] call dep first, to see if the formula cycles e.g.: the formula a12-3 is parsed ".s.get[_d;(`a12)]-3" vec:{"(.s.get[_d]'",(5:dep[x;y]'(z{(x;y)}/:$!x`N)_dv$y),")"} z: "col" return a string which, when executed, returns the value of T[col]; fabricate a cell reference for every row in T; call dep on each cell to see if the formula creates cycles e.g.: the formula +/a is parsed ".s.get[_d]'P", where P is ((`a;0);...;(`a;N-1))_dv y; i.e., the get of each cell except y, the cell defined by the formula dep:{cyc[x;y;r:(`;0)$z];.[x;`Z,y;,;,r];.[x;`ZZ,r;,;,y];r} z: ("col";"row") check for cycles on r, which (`col;row); if there are none, record the dependency of y on r in Z and ZZ; return r cyc:{if[y _in(,z),don[x;x .`Z,z];'"cycle!"]} z: (`col;row) signal an error if y is x or in the list of cells on which z depends don:{:[()~y;y;y,,/x _f/:x{x .`Z,y}/:y]} z:((c;r);...;(c;r) return the list of cells on which z depends act:{inv[x;y]'x .`ZZ,y;x get/:x`V} y: cell which has been assigned invalidate every cell which depends on y; validate every invalid cell inv:{if[~z _in y,x`V;.[x;`V;,;,z];_f[x;y,,z]'x .`ZZ,z]} z: list of cells which depend on y if z is valid: invalidate z recurse, invalidating every cell which depends on z get:{:[~y _in _dv/x`V`W;x .`T,y;dat[x;y]]} if y is valid, return its value (from T) else calculate its value dat:{r::[@x .`X,y;x .`T,y;. x .`Y,y];.[x;`T,y;:;r];.[x;`V;_dv;y];r} if y is not a formula, then r is T at y else y is the result of evaluating the formula for y; update T at y with r; remove y from the list of invalid cells return r fd0:{,(`N_;!x;.+(`e`f`l;(0;-8$;"")))} x: N generate N_ column, with attributes e, f, l fds:{(y;x#0;.+(`f`g`fg`bg;`.s.f`.s.g`.s.fg`.s.bg))} x: N y: column symbol generate a data column of 0's, with attributes f, g, fg, bg f:{:[x~0;8#" ";$x]} format: if x is 0, 8 blanks, else default format g:{set[dr _d;cl[_v;_i];x]} validate: update this cell in the current instance with x fg:{:[@(d:dr _d).`X,i:cl[_v;_i];0;i~d`W;990000;99]} foreground color: if this cell in the current instance is not a formula, black else if this formula-cell has been updated manually, red else blue bg:{:[-3=4:x;808080]} background color: if this cell in the current instance is a string, gray cl:{(`$1_*|(&"."=v)_ v:$x;*y)} x: `. ... instance.table.color y: index into x return canonical cell id (`col;row). e.g., if x is `.S.T.b and y is ,3 then return (`b;3) dr:{`$(- 1+(|v)?".")_ v:$x} x: `. ...instance.table return instance `. ... instance. e.g., if x is `.S.T then return `.S \d .S .S is a typical spreadsheet instance, or template. New instances can be defined through simple assignment: new:.S new.:.S C:`$/:*.s.A the symbols `a`b`c ... to be used as column names in T N:100 the number of rows in T T..d:"..s.fd0[N],N .s.fds/:C" T depends on C and N. T has 1+#C columns, each of which has N rows. Each column of T is a list-variable. The initial column is N_, which exists for the sole purpose of enumerating the rows. This column is treated as pure decoration: input is not allowed, and the label is turned off. For each of the remaining columns, four attributes are defined: a format function f, a validation function g, a foreground color function fg, and a background color function bg. The functions fd0 and fds generate, respectively, the row enumeration column N_ and the data columns C. T..t:".s.act[_d;_i];" an update to any variable in T will cause .s.act to be called. X..d:Y..d:".+(C;(#C)#,N#_n)" Z..d:ZZ..d:".+(C;(#C)#,N#,())" The four variables X, Y, Z, and ZZ are defined as tables which are structurally identical to T. X- and Y- cells are initialized to _n, to simplify testing for formula-cellishness. Z- and ZZ- cells are initialized to (), since the cells of these tables are cell-lists. V..d:W..d:"C;N;()" The two variables V and W are initialized to (). V is a cell- list; W is a cell. \d ~ the attribute dictionary of .S c:`form .S is a form a:`T .S's arrangement is the table T x:95 y:30 .S's initial size
This section describes the S+ spreadsheet, which extends the core model S by implementing the following ten features:
- Defined display: the user can specify the format function of a cell
- Defined foreground: the user can specify the foreground color of a cell
- Defined background: the user can specify the background color of a cell
- Defined edit: the user can specify whether a cell is editable
- Defined conversion: the user can specify the conversion function of a cell
- Attributes can be replicated and cleared
- Row- and column-ranges can be specified in a formula
- Formulas can be viewed and edited through an alternate input field
- Recalculation of cells can be immediate or postponed
- All forms of assignment to T are supported
In S, there is one command letter, =, which defines (or clears) a formula. In S+, the command letters are = (or C), D, E, F, B, and G. A command is either a command letter or a command letter followed by either a column name or a row number. An empty command is the instruction to clear the corresponding attribute. For example,
= Clear the current cell formula D9 Clear the formats in all cells from here to row 9
D 7.3$ display width 7, 3 digits of precision G {:[0<r:. x;r;'"must be >0"]} convert if >0 else signal error E.1 editable F 9900 foreground color = green B {:[x<0;.red;.blue]} background = red if negative, blue if not
Attributes (D,E,F,B,G) are absolute. A replicated attribute has the same meaning in every cell in which it appears: green is the same green everywhere. The reference of a formula can be absolute or relative. For example:
=c a1+2 replicate a1+2 from here to column c =c $1 +2 replicate a1+2, b1+2, c1+2 from here to column c
If the current column is a and the subsequent columns are b and c, then the second formula -- a formula-scheme -- will produce three distinct formulas: a1+2 in the current cell, b1+2 in the b cell, and c1+2 in the c cell. The meaning of a formula-scheme is location-dependent. The dollar sign is schematic for the name of the current column, or in the case of row-wise replication, for the current row:
=12 a$+3 replicate a$+3 from here to row 12
Attributes can also be replicated:
F17 99 blue foreground from here to row
Finally, values can be replicated (as formulas without cell-dependencies):
=d 17 replicate 17 from here to column d
Vector operations in the basic model apply only over whole columns. For example,
= +/a sum of a
To define the current cell as the sum of a0 through a5 times the sum of a6 through a12:
= (+/a0:a5)*+/a6:a12 product of sums of row-ranges
: means "to". Alternatively,
= (+/a0:d0)*+/a0:k0 products of sums of column ranges
Ranges can be used within formula schemes. For example, if the current cell is a0, then:
=5 +/b$:k$ +/b0:k0, +/b1:k1, ... ,+/b5:k5
The translations are extended to handle ranges:
cr:cs -> (.s.get[_d]'((c;r);...;(c;s))) cr:dr -> (.s.get[_d]'(c;r);...;(d;r)))
An alternate-input field I has been provided for several reasons. First, the formula-definition of a cell can be displayed and edited. Second, the canonical form of a cell-value can be displayed using 5:. Third, label-valued cells can be edited (since they are always output-only in the extended version). Updates to the alternate input field are applied to the current cell, the index of which stored in the variable K.
In the basic model, changing the value of a cell on which other cells depend causes the immediate recalculation of the dependent cells. There are circumstances in which immediate recalculation is expensive (if the calculations are costly) or misleading (if the intermediate states of the calculation are meaningless). Immediate recalculation also means that, if a spreadsheet is updated from a program source (e.g. a real-time feed), then it cannot be used interactively. A recalculation-checkbutton has therefore been added. With recalculation on, invalid cells are immediately validated. With recalculation off, the list of invalid cells accumulates until the switch is turned back on, at which point all invalid cells are recalculated. The state of being invalid-and-displayed (which cannot occur if calculation is on) is signalled by graying-out the background of the cell. Thus, for example, if c0=a0+b0, and recalculation is on, then an update to a0 will cause c0 to recalculate, as will an update to b0. With recalculation off, c0 will be added (once) to the invalid list, and updates to a0 and b0 will have no effect on c0's value until recalculation is turned back on.
In the basic model, the only form of programmatic assignment to T is cellular (atomic); for example, .T.a[3]:101. In the extended model, the trigger function has been designed to detect and respond to any form of indexed assignment.
The extended model consists of a form .S with four components: the Calculate button, which controls recalculation; Q, the current contents of the selected cell; I, the alternate input field; and T, the spreadsheet itself.
Enter 1 to 20 into columns a0 to d4 programmatically:
.S.T[`a`b`c`d;!5]:+1+5 4#!20
Define column e as the row-sums, row 5 as the column-sums:
in e0: =4 +/a$:d$ in a5: =e +/$0:$4
Turn off Calculate, update a0 and c3; a5, c5, e0, e3, and e5 are invalid and pending.
Turn Calculate back on; pending invalidities are validated:
The S- spreadsheet may be the minimal K implementation (Arthur Whitney, 1995).
The code verges on the trivial:
S..t:".[`D;(;);{. y};S[]];S[.;`f]:9$D[]" S:D:.+(`a`b`c`d;4 6#,"")
(Foreground, label, and click attributes have been added to the version distributed with this paper.)
Here's how it works:
Formulas are entered without a command letter, but in K notation. For example, the formula a0+b22 is defined: a[0]+b[22].
Labels are entered as in S/S+, as strings: "abc".
Input is not processed; there is no parsing in S-.
S is a table of strings on the screen. D is a table of values.
For example, if a0 is 12 and b0 is the formula a[0]+1, then S.a[0] is "12" and S.b[0] is "a[0]+1", and D.a[0] is 12 and D.b[0] is 13.
and the value 2 is entered in S[`a;0]. There is no g function, so "2" is accepted and assigned to S[`a;0]. The trigger S..t fires and performs two pieces of work. First, S is re-evaluated, element-by-element, in the directory D, and the result of each evaluation is stored in the corresponding position of D. The value of D[`a;0] is now 2, and that of D[`b;0] is 3. Next, the f attribute of each variable in S is assigned the formatted values in D. After the trigger completes, the contents of f are written to the screen.
First, download K.
Copy the scripts for the three spreadsheet programs to a directory s.
In a command shell, type
k s/s
or
k s/s_plus
depending on whether you want to use the basic K spreadsheet S or the extended version S+.
Once in the K session, turn error-trapping on:
\e 0
Now make an instance of .S, the spreadsheet template:
mysheet:.S mysheet.:.S.
Then display the spreadsheet instance:
`show$`mysheet