Computer Science Technical Report 95-10

LIBRA: A Lazy Interpreter of Binary Relational Algebra

Barry Dwyer
Department of Computer Science
University of Adelaide
G.P.O. Box 498, Adelaide
South Australia 5005
October 30, 1995.


LIBRA is a general-purpose programming language based on the algebra of binary relations. It is an attempt to unify functional and logic programming, retaining the advantages of both, and avoiding some of the problems. It has all the features needed of a programming language, including a straightforward semantic interpretation. Since program specifications are easily expressed as relations, it offers a simple path from a specification to a program and from the program to its proof of correctness. The algebra of binary relations has several operators whose effects are like those of familiar procedural language constructs, for example, relational composition is analogous to sequential execution.

The syntax and semantics of the Libra language are given in detail. The Prolog implementation of a lazy evaluator for Libra programs is described. Libra is illustrated by its application to some simple programming exercises.


PROGRAMMING Languages, Binary Relations, Specification Languages, Prolog.

1. Introduction

1.1 What is a Binary Relation?

IN MATHEMATICS, the expression `X r Y' is true if X and Y satisfy the relation `r'. For example `X<Y' is true if X and Y satisfy the relation `<'. There are three ways the `<' relation can be considered: as the (infinite) set of all (X,Y) pairs for which X<Y; a predicate that can be applied to (X,Y) pairs; or as a `relator' [11, 12], that given X, will yield all Y values greater than X.

In Libra, a (finite) relation to describe changes in temperature could be defined as follows:

let transition->{'Cold','Warm';'Warm','Hot';'Hot','Warm';'Warm','Cold'}.
The braces enclose a set of 4 elements, separated by semicolons. Each element is an ordered pair of terms, separated by a comma. The relation is given the name `transition'. The two means used in this example--set formation and pair formation--are the only ways of building data structures in Libra. The members of a set are unordered, but pairs are ordered.

There are three ways this relation could be used:

  1. It could generate the four pairs of values.
  2. It could be used to test whether a pair of values satisfy the relation.
  3. Given the first term of a pair or pairs, it could give the corresponding second terms as a result. For example, given 'Warm', it could yield both 'Cold' and 'Hot'.
The first two ways of looking at relations are shared by sets in general. We may enumerate their members, or test elements for membership of them. The third view puts the relation in a more active role, and is described as `applying' the relation to an argument to yield one or more values as a result. This is analogous to the view we take of applying functions in a functional programming language.

Libra distinguishes these three roles syntactically. To generate the members of the set, we would write:

find @transition.
(The `find' command enumerates the values of an expression.) Here, the `@' operator generates each pair in the set, although not necessarily in the order shown. It may be read as `all' or `for all', depending on the context. In the existing Prolog-based interpreter, these pairs are enumerated by back-tracking, but in principle, Libra allows them to be generated in parallel. In general, we may imagine that the computation splits into several independent threads--in this case, four of them.

To test set membership, we might write:

find ('Warm','Hot')?transition.
find ('Cold','Hot')?transition.
The `?' operator may be read as `is a member of' and corresponds to epsilon, the usual mathematical symbol for set membership.

To apply a relation to an argument, we write:

find 'Warm'!transition.
where `!' is read as `apply'. (It has no connection with Prolog's `cut' operator.) There is an important distinction to make here. Applying a relation to an argument `enumerates' a set of result values, splitting this computation into two threads. It does not yield the values as the single set, {'Hot'; 'Cold'}--which can actually be generated by Libra's `image' operator:
find {'Warm'} image transition.
{'Cold'; 'Hot'}
This means that applying relations that yield several values is computation intensive rather than storage intensive, although, as the `image' operator illustrates, the programmer has the means to trade space for time if desired. The distinction between the members of a set and the set itself is a subtle one, and the reader should take careful note of it. Finally, it is worth mentioning that Libra does not support a fourth mode of using relations: given the second term of a pair, to determine its corresponding first terms--at least not directly. Accordingly, the first term of a pair is called its argument, and the second term is called its value. It is possible to go from argument to value, but, in general, not from value to argument--any more that existing programming languages allow inputs to be derived from outputs. This is stressed by the alternative notation for pairs: using `->':
let transition ->
from which it will be seen that the name `transition' and the relation it defines are also an ordered pair.

1.2 Why Binary Relations?

WHY SHOULD a programming language be based on the algebra of binary relations?

One view is that it is an attempt to combine the advantages of both functional programming and logic programming. An often stated advantage of functional programming is `referential transparency': the idea that a program is an algebraic expression, which when simplified, yields the value of the program, ie., its output. An aspect of this idea is that functions can be given names, and that any occurrence of the name can be replaced by the corresponding function. A difficulty that besets functional programming languages is that a function always has exactly one value. This makes it hard to deal with exceptions: dividing a number by zero yields no result, yet finding the square root of a number yields two results. This difficulty is avoided in a logic language such as Prolog, which can produce no result by `failing', or produce two results by back-tracking. On the other hand, Prolog has its own difficulties. Although a subset of Prolog can be understood in terms of predicate calculus, most useful Prolog programs need to use extra-logical predicates, which can only be understood with reference to a specific model of program execution. A language based on the algebra of binary relations can combine the best aspects of both approaches, while avoiding some of their problems.

A second view is that relations are to general-purpose programming languages what matrices are to scientific languages. They are large scale data structures, which can be manipulated as wholes. In the scientific field, matrices have led to the evolution of specialised parallel processing computer architectures such as vector processors, but there has been no similar well-established concept that has formed the basis of parallel architectures in more diverse fields--unless one counts the n-ary relations used in databases. Perhaps binary relations will provide such a concept. (One might consider that the original Connection Machine [4] was based on similar ideas, but failed to exploit the full power of binary relational algebra.) It is probable that the ideas in this paper will need additional refinement before this goal can be achieved, perhaps by amalgamating them with the related notion of `conceptual graphs' [12].

A third view justifying using binary relational algebra is that it contains a rich supply of useful operators that closely match the things that programmers do. (This sets it apart from the relational algebra of tuples used in database query languages, which has but a few general-purpose operators: `select', `project', and `join'.) For example, an exercise often given to student programmers is, given a list of words, to display for each word a list of all its positions in the list. In the algebra binary relations, if `W' is the list of words, the result is essentially given by the expression `W-1'.

A fourth justification is that relations are very general mathematical objects, which include functions as special cases. They are like multi-valued functions, and can do anything that functions can. They are also known to be a universal model for data representation, being the building blocks of relational databases. In Libra, there is no distinction between a relation as data or a relation as an operator, except how it is used. The syntax of Libra is such that, the limitations of the ASCII character set aside, any legal program is a valid expression in the algebra of binary relations, which in turn specifies the program's behaviour. This gives Libra a flavour similar to the specification language, `Z'. Although Libra is not an attempt to imitate `Z', it does mean that it is easy to prove the correctness of a Libra program derived from a `Z' specification, using only the tools of discrete mathematics.

Finally, lest the reader assume that Libra is only suitable for solving abstruse problems in discrete mathematics, before reading on, it may help first to skim Section 8, which explains some perfectly ordinary example programs.

The proof of any pudding is in the eating, so Libra is best tested by personal experience. The implementation of Libra described here is portable, in that it uses core Prolog almost exclusively, and should prove easy to adapt to different Prolog environments. (It was originally developed using Open Prolog [1].) The interpreter allows readers the chance to experiment. It is acknowledged that the interpreter is slow, but this is a property of the implementation, not an inherent property of relational languages.

1.3 Some Previous Work

The Libra language owes its inspiration to Sanderson's Relator Calculus [10, 11], which first demonstrated to the author the power of relational languages. However, there is a fundamental shift of viewpoint from that work. Sanderson regarded programming constructs such as `if...else' and `' as having proved their worth (Wirth?), and based the relator calculus on similar constructs. The disadvantage of this is that their mathematical properties are not as clean as those of the roughly similar concepts of relational union and closure, except when the program is deterministic (ie., functional). (Even so, there are situations where something closer to `if...else' or `' is needed, which is why Libra offers the `else' and limit (^^) operators.) The Relator Calculus [10, 11] is an algebra rather than a programming language, although an experimental interpreter for it has been implemented.

Another source of inspiration is the work of MacLennan [5, 6], who demonstrated several examples of relational programs, using the language RPL. RPL was really a functional language that could operate on relational data structures. Its control structure did not exploit backtracking or parallelism. It also had the blemish that different operators had to be written to denote the same operation depending on whether it was applied to relations expressed as data (extensional relations) or as program structures (intensional relations). For example, the union of two sets and the union of two programs were formed by different operators. It was also difficult to mix operations on intensional and extensional structures. This can be defended on the grounds that a programmer should be aware of the distinction between data and executable instructions, but it makes an RPL program less abstract, and harder to read.

The defects of RPL were addressed by Drusilla [2], which allowed the same operator to be used uniformly on intensional or extensional relations, or a mixture of both. This is really a radical point. It would be one thing to devise a language that supported operations on matrices; it would be quite another to devise a language where the programs themselves were matrices that could be combined using matrix operations. Likewise, programs that are structured using relational operations are not like programs in other languages.

Drusilla also distinguishes the three modes of using relations identified earlier: as sets of pairs, as predicates that can be applied to pairs, and as means of constructing results from an argument. The Drusilla language system includes a compiler that uses an extension of Milner type checking [7] to deduce the correct ways of combining relations from their properties. This process also performs conventional type checking, detecting potential programming errors. In this respect, Drusilla is superior to Libra, which leaves what little type checking it does until execution time. On the other hand, Libra allows operators to be over-loaded, so that, for example, the programmer can extend the built-in `+' operator, which applies only to integers, to apply to vectors and matrices. This kind of overloading is not possible with Milner type checking, because the types of operands and results are deduced from the types of operators, which therefore have to be fixed.

The `Z' specification language [8] has been another influence on Libra. Libra includes many constructs found in `Z', although it is certainly not an attempt to turn `Z' into a programming language. Rather, it is a language into which it should be easy to convert a `Z' specification. Alternatively, Libra is itself sufficiently close to mathematical notation that it can serve as its own specification language. A specification can be written in Libra. Although it might not be an executable program, it should be possible to turn it into one by using the ordinary theorems of discrete mathematics.

The immediate precursors to Libra are a language proposal [3] written by the author--while in ignorance of Drusilla's existence--and Hydra [9], a partial implementation of the proposal. Like Drusilla, the language proposal addressed the defects of MacLennan's RPL, but adopted different solutions from Drusilla in almost every case. This is probably because Drusilla has a functional programming background, whereas Libra is based on logic programming. Drusilla is implemented using Miranda; Libra is implemented using Prolog. Both Drusilla and Libra attempt to meld the functional and logic styles, but are towards opposite ends of a spectrum. Hydra was influenced by Drusilla, incorporating a similar type checking system, and revealed several problems inherent in the original proposal. However, only a small subset of the proposed language was implemented, and Hydra is not currently useable.

There are several differences between Libra and Drusilla. Some result from the means of implementation. For example, Drusilla explores the alternative results of relations by lazy evaluation of lists, whereas Libra uses a mixture of lazy evaluation and backtracking. Other differences are experiments--if Drusilla does it one way, Libra usually does it another. For example, as already mentioned, Libra uses dynamic type checking.

An important difference between Drusilla and Libra is in the treatment of iteration. Drusilla simulates iteration by recursion, which is to be expected of a language based on functional programming; whereas Libra uses relational closure. (The fact that closure is implemented internally by recursion is irrelevant; what matters is the effect on the source program.) A typical Libra program is free of recursive definitions, making its structure much more transparent. (Recursion can simulate `go to' statements and, badly used, can make a program just as hard to understand as they do.) Some consideration was given to banning recursion in Libra altogether. It remains available--in case of emergency.

Another difference is that Libra uses sets and pairs as its basic structuring operations, whereas Drusilla uses lists and tuples. The author considers that Libra's approach is much more appropriate to a relational language. Compared with lists, sets have advantages and disadvantages. The elements of sets have no particular order, which means that they can be dealt with in parallel, at least in principle. (The current implementation uses back-tracking to simulate parallelism.) A consequence is that operations on different set elements cannot possibly interact with one another, simplifying understanding of the program. A disadvantage of sets is that they are not as easily implemented as lists are--although they have several efficient implementations, such as trees and hash tables, that allow set membership to be tested with lower computational complexity than is possible for testing membership of lists. (Unfortunately, the current implementation of Libra is badly deficient in this respect, because it represents sets as lists. Correcting this is a top priority in a later version of Libra.) Lists exist in Libra, where they are called sequences, but they are implemented as functions from the integers 1 to N onto their terms. Given a tree or hash table implementation of sets, efficient access would be possible to any term of a sequence.

A Libra feature worthy of note is reduction, which enables the elements of a set or sequence to be combined under a suitable operation. For example, reduction under `+' forms the total of a set of numbers. A special case of reduction is to choose an arbitrary member of a set. This is useful in problems that admit of many solutions, but where only one solution is needed. (It also subsumes one of the functions of Prolog's `cut' operator.)

Libra is intended to encourage the development of relational programming languages by allowing further experiment. To this end, the current interpreter is designed for simplicity and ease of extension and modification. Even so, it incorporates much useful knowledge, being essentially an expert system for evaluating relational expressions.

2. The Libra Language

2.1 Basic Syntax

THE SYNTAX of Libra is based on that of Prolog. Given suitable operator precedences and associativities, a Libra program is simply a Prolog term. Its lowest level building blocks are variables, integers, atoms, delimiters and operators. Variables and integers are defined exactly as in Prolog. Variables must begin with a capital letter or underscore character, and may comprise letters, underscores or digits.

There are two kinds of atoms. Those that begin with capital letters, such as "'Warm'" are called `literals', and simply stand for themselves. (They must be enclosed in quotes to distinguish them from variables.) They are used to give meaningful names to values. The only pre-defined literals are the boolean values 'True' and 'False', but the programmer can invent new ones, such as 'Cold', 'Warm' and 'Hot'.

All other atoms are called `names'. There are three classes of names: strings beginning with a small letter and comprising letters, underscores or digits, strings of the symbols +-*/\^<>=:.?@#$&, and any string enclosed in quotes that is not a literal. A name always stands for an expression defined by a `let' statement. For example, `transition' was defined earlier as a name that stands for a set of four pairs. The basic rule is that where a name appears, it can always be replaced by the thing it stands for.

Because Libra is based on relations rather than functions, it seems reasonable to let a name map to more than one value if desired. This lets the meanings of names be overloaded; for example, the built-in `+' relation applies only to integers, but the programmer may easily define it to apply to vectors and matrices as well.

Operators are a notational convenience. Almost any name can be defined as an operator. The arrangements for doing this are similar to those of Prolog. The same name may be both a unary and a binary operator. The expression `1+2' is indistinguishable from the expression `+(1,2)'. Both are interpreted as relational application, ie., as `(1,2)!(+)'. It may seem strange to treat `+' as a relation rather than a function, but a function is merely a special case of a relation.

2.2 User Commands

Libra is activated within a Prolog environment by consulting the file `libra'. Libra declares many operators that are not found in Prolog itself, so that although the programmer is really using Prolog, the Libra programming environment seems quite different.

The `find' command evaluates an expression:

find 1+2.
If there are several results, `find' displays them all:
find @{0;2}+ @{0;1}.
If the expression has no value because it contains an inapplicable operator, nothing is displayed:
find "abc"+2.
The `?' command is an alternative to `find':
Usually, an expression will use definitions that have been created by an earlier `let' statement:
let x->1.
let x->"abc".
let y->2.
let y->4.
The word `let' is optional:
A `let' statement maps a name onto an expression. Often the expression is a relation, but this is not a requirement. The same name may map onto several expressions. If so, all possible substitutions are made. It is possible to overload the name of any Libra relation, command or operator, with one exception, `drop'. (This ensures that if overloading a name causes confusion, the definition can be deleted.)

However, there is a penalty. Libra makes checks on the operands of built-in relations, so that for example, if the operands of `+' are not integers, a warning message results. However, if the programmer overloads the `+' operator, Libra must turn off the warning, otherwise it would complain about the calls on the programmer's own definition. (One might argue that it should first test to see whether the new definition proved to be inapplicable--but in some cases the new definition might be intended to be inapplicable!) As a result, overloading Libra's built-in names is better left until after a program has been debugged. (The converse of this is that the programmer can suppress Libra's warning messages if desired.)

The `show' command displays the current definitions associated with a name:

show x.
x -> 1 .
x -> [97,98,99] .
If no name is specified, `show' displays all the current definitions:
x -> 1 .
x -> [97,98,99] .
y -> 2 .
y -> 4 .
The format displayed by `show' may not agree exactly with what was typed. Among other things, variables are replaced by `A', `B', `C', and so on, and parentheses are introduced to make operator precedences explicit:
show add1.
add1 -> {(A,(A + 1))} .
Definitions remain in force until they are dropped:
drop x.
find x+y.
produces no output. The form:
drops all definitions and wipes the slate clean. The definitions associated with a name can be dropped selectively using the `edit' command:
edit x.
x -> 1 .
     Keep it (+) or drop it (-)? +
x -> [97,98,99] .
     Keep it (+) or drop it (-)? -
show x.
x -> 1 .
The programmer may add new operators to the language. There are seven commands for this purpose, depending on the properties of the operator required:
  • fx non-associative unary prefix
  • fy associative unary prefix
  • xf non-associative unary postfix
  • yf associative unary postfix
  • xfx non-associative binary infix
  • xfy right-associative binary infix
  • yfx left-associative binary infix
(These conventions are the same as in Prolog.) The same name may be declared as an operator twice: once as a unary operator, and once as a binary operator. Attempting to declare a name as a unary or a binary operator for the second time drops the existing definition. (This restriction is imposed to make the parsing of Libra programs unambiguous, and is inherited from Prolog. It does not prevent the same operator having several overloaded meanings.

An operator's precedence can be specified by any expression:

<+ yf (binary_prec(*)+unary_prec(+))/2.
which declares `<+' as an associative postfix operator with precedence midway between that of the binary `*' and unary `+' operators. (The built-in functions `unary_prec' and `binary_prec' are very special because their operands are not interpreted. In any other context, an attempt is always made to interpret names by replacing them with their definitions.) For portability reasons, the programmer should use expressions that give operators relative precedences, rather than absolute ones. The `show' command displays operator definitions too:
<+ yf binary_prec(*)+unary_prec(+))/2.
show <+ .
<+ yf 450.
Programs may be typed interactively, but it is usually easier to edit them as separate files (e.g., using the Prolog editor), then load them with the `use' command:
use farmer.
Reading farmer ...
A file used in this way is treated exactly as if the same commands had been entered from the keyboard, and it may contain any sequence of commands whatsoever. Because Libra is a relational language, it allows names to have multiple definitions. If a `use' file is corrected and `used' a second time, its old (erroneous?) definitions will remain active until they are dropped. The `reuse' command is equivalent to `drop' followed by `use'.
reuse farmer.
Reading farmer ...
(This drops all definitions, not only those in the file `farmer'. In general, it would be unwise to use the `reuse' command within a program file.) The `dump' command is the converse of `use'; it writes all existing definitions to a file. The text of the file is the same as would be displayed by `show'. Since the variable names in the saved file are meaningless, this command is of limited use.

Finally, `commands.' lists all Libra's commands, and `help X.' describes the purpose of the command `X' in more detail.

2.3 Structures

LIBRA'S basic structuring operations are set formation and pair formation.

Sets are written as lists of elements between braces `{}', separated by semicolons. The order in which the members are written is unimportant, and duplicate elements are ignored, e.g, `{1;2;2;3}' and `{2;3;1}' both denote the set containing the integers 1, 2, and 3. The members of sets may be structured, e.g., `{(1,2);{1,2}}'. A set may contain members of different kinds, e.g., `{1;{};'True'}'.

A special shorthand is provided for ranges of integers: {M..N} denotes the set containing the range of integers M to N inclusive:

In contrast to `extensional' sets whose members are listed explicitly, it is also possible to define an `intensional' set by a means of a pattern and a predicate. For example:
denotes the set of all (X,Y) pairs such that X is less than Y, ie., it is the same as the relation `<' itself. (In this context, `:' should be read as `such that'.) There is an important restriction on a set defined in this way. It is possible to test an element for membership of it, but it is not possible to enumerate its members. Apart from the task of generating all (X,Y) pairs such that X<Y being never-ending, Libra does not have the knowledge to deduce what set is implied by a pattern and a predicate.

Such sets are called `filters' to distinguish them from those whose members are listed explicitly, which are called `generators'. Enumerating the members of a set (using the `@' operator) is described as `using the set as a generator'. Testing an element for membership of a set is described as `using the set as a filter'. A generator can be used as a filter, but a filter can't be used as a generator.

(These two modes of use sometimes have subtle consequences. In Libra, `AB', the intersection of sets `A' and `B', denoted by `A meet B', is found by generating the members of `A', then testing to see which are also members of `B'. Thus `A' must be a generator, but `B' may be either a generator or a filter. Although `AB' and `BA' denote the same mathematical object, `A meet B' and `B meet A' denote different Libra programs. Consequently, if a specification called for the value of `BA', it might be necessary to first transform it to `AB' to make the program workable.

Filters are often used in a `generate and test' strategy where it is desired to test given values for set membership. Thus, by defining the set of odd numbers:

odd -> {X:X mod 2=1}.
the command:
? {1..100} meet odd.
yields the set of all odd numbers between 1 and 100. Ordered pairs are defined by an argument and a value. The pair:
maps the input `Arg' onto the output `Value'. Pairs are formed by the infix operators `,' and `->', which differ only in that `->' binds its operands more loosely than `,'. Having the choice of two operators often saves using parentheses. Within Libra however, `->' and `,' are treated identically.

It is sometimes possible to treat pairs as tuples, written as a list of terms within parentheses separated by commas. Because `,' is a right-associative binary operator, the tuple `(1,2,3)' is indistinguishable from `(1,(2,3))', `(1->2,3)' or `(1->2->3)'. However, all four are distinct from `(1,2->3)', ie., `((1,2),3)'.

There is no such thing as the empty tuple; `()' is a syntax error. A 1-tuple is indistinguishable from its only element, ie., `(1)' is the same as `1'. However, when operators are used as operands, it is sometimes necessary to enclose them in parentheses, e.g. `(+)', to ensure correct parsing.

2.4 Binary Relations

Binary relations are merely sets of pairs. For example:
Defines a relation that swaps a pair of values. It is possible to use this relation as a filter to test if one pair is the reverse of another, for example:
? ((1,2),(2,1))?swap.
It is not possible to use `swap' as a generator. However, it allows a third mode of use: as a relator [10, 11] or `constructor'. `Swap' can be applied to the pair (1,2) to construct the pair (2,1) as follows:
? (1,2)!swap.
It is possible to write what appears to be a ternary relation, e.g:
But the associativity of the comma is such that this is identical to:
which is really a binary relation. This means that ternary, quaternary relations and so on do not exist in Libra, although it is usually harmless to pretend that they do. Since the second term in `ternary' introduces new variables, it is a filter. A relation may be non-deterministic, in that an argument in its domain may map onto more than one value in its codomain, e.g.,
transition ->
         {'Cold','Warm'; 'Warm','Hot'; 'Hot','Warm'; 'Warm','Cold'}.
? 'Warm'!transition.
The pairs comprising a relation may have different types, e.g.:
        1001->{'Name',"John Citizen";'Age',25};
        1002->{'Name',"Jane Doe";'Age',40}
is a set of two records.
? 1001!staff.
{'Name',"John Citizen";'Age',25}
? 'Name'!(1001!staff).
"John Citizen"
However, it is debatable whether records have a useful place in a relational language. An alternative is to define two simpler relations, as follows:
name->{1001->"John Citizen"; 1002->"Jane Doe"}.
age ->{1001->25; 1002->40}.

2.5 The Ranks of Relations

There is a hierarchy of relations in Libra: `generators', `constructors', and `filters'. As we have just seen, a constructor may be used as a constructor or used as a filter. A generator may be used as a generator, a constructor or a filter. A filter may be used only as a filter. The distinction is made as follows. A generator contains no variables, but constructors and filters do. In a constructor, only the first term of a pair may introduce new variable names, which may be reused in the second term. A filter contains a pair whose second term introduces new variables. If a relation contains several pairs, its lowest ranking pair determines the rank of the relation.

(This classification is a little simplistic. The terms of relations may themselves be sets or relations defined in terms of variables. For the purposes of classification, variables defined within these sets or relations may be ignored, and the sets or relations treated as constants. Thus:

let switch ->{1->{X:X>0}; 2->0}.
find 1!switch.
{(A : (A > 0))}
find 2!switch.
defines a relation that pairs 1 with a filter but pairs 2 with 0. It is a generator. However, a relation may not yield an unbound variable except as part of a set definition. For example:
switch ->{1->X; 2->0}.
is dangerous. The expression `switch(1)' would yield `X', which is considered an error. A frequent way to define relations is by a series of cases. For example:
max->{X,Y->X:X>Y; X,Y->Y:Y>X}.
defines a relation that finds the larger of a pair of values, if there is one. In such a context, `:' is best read as `if and only if'. There are no parameters associated with the name of a relation; each pair in the definition carries its own argument pattern. Thus the previous definition of `max' could also be written:
max -> {X,Y->X:X>Y; A,B->B:B>A}
In addition to the means of definition given so far, which are simply those for sets, constructors also allow the second term of a pair to be defined by an expression. For example:
defines `succ' as a relation that adds one to its argument. It may be used as a constructor or a filter. For example:
? 3!succ.
? (5,6)?succ.
Variables in constructors and filters are unified with input arguments (in the Prolog sense). To match a pattern of the form `(X,Y)', the argument must also be a pair. However, the triple `(1,2,3)' will unify with (X,Y) successfully, giving X=1, Y=(2,3), because `(1,2,3)' is really `(1,(2,3))'.

A constructor may not introduce new variables on its right hand side. For example:

defines `confused' as a relation that, given two pairs, applies if the first term of the second pair equals the sum of the terms in the first pair. This is not a constructor, because there is no way to derive Z from X and Y. It is a filter, and must be written as one:
(This restriction results from the way Libra does pattern matching, but seems to be harmless.) Although `->' and `,' have the same mathematical meaning, there is an important difference between them. Each term of a Libra generator must be written in the form:
pattern -> expression
or the form:
pattern -> expression : predicate 
where the `->' cannot, in general, be replaced by a comma. The pattern may consist only of variables, perhaps structured as pairs or tuples. An expression to the right of `->' or a predicate may be general expressions--but should not introduce new variables. Thus, although:
has the same meaning as the preceding definition of `confused', its definition will cause two warnings, because `Sum' and `Z' first appear to the right of `->'. Conversely, the definition:
will cause a warning, because `X+Y' is not to the right of `->'.

2.6 Sequences

`Sequences' are special relations whose arguments occupy the range 1..n, for some n>=0. Sequences are useful enough to deserve a special shorthand. They may be written as lists of terms separated by commas, and enclosed in square brackets; for example `[97,98,99]' denotes the relation {1,97; 2,98; 3,99}. `Strings' are sequences whose values are characters, ie., small integers. A special notation is provided for strings of characters; `[97, 98, 99]' may be written more conveniently as `"abc"'.

The empty sequence is denoted by `[]'. It is the same object as the empty set, denoted by `{}'.

There is a special sequence notation similar to the range notation for sets. The expression `[M..N]' denotes the sequence whose first term is the integer M and whose last term is the integer N. So that:

denotes the relation:
Such a sequence is called a `shifter', as it is often used to shift or select the terms of another sequence. For example:
? [3..5] o "abcdefg".
Sequences, sets, and tuples may be nested as required, e.g., `{(1,2);{3;4};[5,(6,7)]}' denotes the set containing the pair `(1,2)', the set `{3;4}', and the sequence `[5,(6,7)]' of length 2.

It is important to realise that relations are not recurrence formulae. Asking for the values of the relation:

? @{0->1; X->X+1}.
does not enumerate the natural numbers, but yields two values; `(0,1)' and a structure representing `X->X+1'. (Such a structure does not mean anything outside set braces, and causes a warning.)

2.7 Application

A FUNDAMENTAL difference between an algebra and a programming language is the idea of `application'. Applying a relation to an input enumerates outputs:
? 3!{X->X-1;X->X+1}.
If a relation is applied to an argument that fails to unify with any of its argument patterns, it generates no output, and is said to be `inapplicable'. Even if unification occurs, a relation may be inapplicable because of a predicate introduced by `:'. For example, the relation `max' given earlier is inapplicable to (1,1), or any other pair of equal values, and therefore yields no output. A more useful definition of `max' is:
max -> {X,Y->X:X>=Y; X,Y->Y:X=<Y}.
which, if presented with two equal values would yield the same output twice. In practice, this would almost certainly be harmless, although it would usually waste time. A better definition of `max' is:
max -> {X,Y->X:X>=Y; X,Y->Y:X<Y}.
which is how the built-in `max' operator is defined. There are three ways of writing a relational application. We may use the apply operator (`!'):
? (1,3)!max.
or it may be written as a call:
? max(1,3).
or, if the relation is declared as an infix operator (as `max' actually is), we may write:
? 1 max 3.
These three ways of writing relational application have exactly the same effect, and are merely syntactic conveniences. Depending on the context, one may seem more appropriate than the others. There are several other operators closely related to relational application. The first is called `functional application' (`~'), and is written thus:
? max~(1,3).
Functional application differs from relational application only when relational application yields several values; functional application yields only one of them, chosen arbitrarily. Among other things, it is useful when a problem admits several solutions but any solution will do.

A set may be considered as a degenerate relation whose argument-value pairs have values, but no arguments. In this way, the `for all' operator, `@', may be considered as a special form of relational application, applied to an empty argument. Thus, generating the elements of a set is analogous to applying a relation. (The reader may find this analogy either helpful or confusing. It is not important.)

The `@' operator has a `functional' form, called `i', (which can be read as `one of'). So that:

? i{1;2;3}.
will choose 1, 2 or 3, although it is not possible to say which in advance. It is useful when it is desired to force elements of a set to be considered sequentially rather than independently. This can be done by choosing an element from a set, removing it, choosing an element from what remains, and so on, until the empty set is reached. (However, this style of programming may indicate that the programmer is thinking too procedurally.) An alternative way to look at a set is as a relation whose argument-value pairs have arguments, but empty values. Thus, given an argument, an empty output results. Libra does not deal in empty results. A similar effect is provided by the membership operator, `?', which returns a boolean result.

Membership testing never yields more than one result, even if there is more than one way an argument can be counted as a member of the set. For example:

? 10?{X:X mod 2=0;X:X mod 5=0}.
(The membership operator is infix; the `find' operator is prefix.) A membership test yields 'False' only if there is no way to count the element as a member of the set. (Prolog typically handles membership testing by distinguishing success from failure. A membership test in Libra results in one of three outcomes: 'True' or 'False', or it may be inapplicable, e.g., when the second operand does not define a set.)

There is an asymmetry between the evaluation of the two operands of the application operators. The argument is always fully evaluated, but the relation is evaluated lazily--only those values matching the argument are evaluated. For example, given the definition:

factorial -> {0->1; N->N*factorial(N-1):N>0}.
and the query:
? factorial(2*3-6).
the expression `2*3-6' is fully evaluated to yield `0', which then unifies with `0' in the `factorial' relation. The argument also unifies with `N' in the second term, but fails the predicate `N>0'. Thus the expression `N*factorial(N-1)' is not evaluated. This illustrates an important point. A program typically defines a complex relation--usually infinite, in that it can be applied to an unbounded set of different arguments. If the whole program had to be evaluated (ie., its output were pre-computed for every possible input) before it could be applied to an argument, Libra would not be a practical programming language. Finally, what happens if a relation cannot be applied to its argument? It simply yields no result. (In Prolog terms, it fails.) This is considered a normal event; there is no run-time diagnostic. On the other hand, the relation from names to the expressions they define is special; reference to an undefined name causes an error.

3. Operators

EXPRESSIONS may be used to specify the results of constructors, and in predicates following the `:' (such that) operator. Both uses of expressions may use only the variables in the pattern that matches the argument.

A name may be defined to yield any form of expression, not just a relation. An occurrence of a name is exactly equivalent to an occurrence of any expression it defines. If a name is overloaded, each expression is substituted in turn, and all the applicable expressions are used. For example, the two definitions:

are exactly equivalent to the single definition:
neighbour->{X->X+1; X->X-1}.
Complex expressions are made more readable by using operators. Appendix A lists all Libra's built-in operators.

Apart from syntax, operators are simply relations. Unary operators take a single argument; binary operators have pairs as arguments, e.g.,

(+)->{X,Y->(X,Y)\\(+):X?relations & Y?relations}.
overloads the definition of binary `+' so that it also signifies vector and matrix addition. It is possible for the programmer to define new operators, using the `fx', `fy', `xf', `yf', `xfx', `xfy', and `yfx' commands. (See Section 2.2.)

The semantics of operator use are the same as relational application, not functional application. That is, the expression `X+Y' is equivalent to `+(X,Y)' or `(X,Y)!(+)'. It is not equivalent to (+)~(X,Y). This means that:

find @{0;2}+ @{0;1}.
yields all four results (in some order), and not just one of them as:
find (+)~(@{0;2},@{0;1}).
would do. The precedence of operators in Libra follows the following general plan:
  1. Relations and sets
  2. Arithmetic
  3. Comparison
  4. Booleans
  5. Structures
  6. Commands
where the most tightly binding operators are listed first, and the most loosely binding operators are listed last. (Confusingly, as in Prolog, tightly binding operators have low precedence values, and loosely binding ones have high precedence values.) The reason why relational operators were chosen to bind so tightly is that it is often the case that a relational expression will yield a numerical value (e.g., `X!r+X!s'). On the other hand, for a numerical expression to yield a relational value (e.g. `{1,2;3,4}', it would always have to appear in brackets or braces, making precedences irrelevant.

3.1 Set Operators

THE PREFIX `#' operator returns the number of elements of a set, and therefore the number of pairs in a relation, and therefore, in turn, the length of a sequence, e.g:
? #"abcd".
The operator `?' tests for set membership, as in:
which yields 'True' for all values of X in the range 1-8 and 'False' otherwise. The complementary operator `\?' tests for non-membership:
yields 'False' for all values of X in the range 1-8 and 'True' otherwise.. Testing membership of a sequence must be done with care, because sequences are sets of pairs. For example,
? "a"?"abc".
? (1,97)?"abc".
? "a" subset "abc".
(In the first example, the expression "a" is the relation `{(1,97)}', not the pair `(1,97)'.) Care is also needed with:
? @[1..3].
which does not generate the outputs 1-3 in sequence, but the outputs (1,1), (2,2) and (3,3) in an arbitrary order. The operator `join' forms the union of two sets; `A join B' denotes the set whose members are either members of `A' or members of `B' (AB). If an element is a member of both A and B, it appears only once in their union. For example:
? {1;2}join{2;3}.
The intersection of two sets is formed by `meet'; `A meet B' denotes the set of the elements of `A' that are also elements of `B'. For example:
? {1;2}meet{2;3}.
The operator `omit' forms the asymmetric difference of two sets; `A omit B' denotes all the elements of `A' except those that are also elements of `B' (A\B). For example:
? {1;2}omit{2;3}.
The cartesian product of two sets is formed by the operator (the letter) `x'; `A x B' denotes the relation comprising all pairs whose first term is chosen from `A' and whose second term is chosen from `B' (AB). For example:
? {1;2}x{2;3}.
{1,2; 1,3; 2,2; 2,3}
Finally, `sets_of A' denotes the powerset of `A', the set whose elements are all the possible subsets of the elements of `A', including the empty set and `A' itself (2A). For example:
? sets_of {1;2;3}.
{{}; {1}; {2}; {3}; {1;2}; {1;3}; {2;3}; {1;2;3}}  
A set of N members has a powerset containing 2N elements, so that explicitly forming a powerset can be costly. On the other hand:
? {2;3}?sets_of {1;2;3}.
is efficient, because Libra simply tests whether {2;3} is a subset of {1;2;3}.

3.2 Relational Operators

SINCE RELATIONS are sets, they may be combined using any of the set operators. For example:
? 1!({1->2}join{1->3}).
? 1!({1->2}meet{1->3}).
yields nothing.

(The effect is as if the relation were evaluated, then applied, although the reality is that the relation is applied lazily. For example, to evaluate `1!({1->2} meet {1->3}', Libra applies the first relation to `1' to construct `2', then rejects it using the second relation as a filter.) In addition to the set operators, there are several operators that apply only to relations.

The composition `AB' of relations `A' and `B' is denoted by `A o B' (the letter `o'). It consists of all pairs (X,Z) such that there is some Y such that (X,Y) is in A, and (Y,Z) is in B. For example:

? {1,3; 2,3; 3,4} o {1,2; 3,4; 3,5}.
{1,4; 1,5; 2,4; 2,5}
(Although the first relation can yield both 3 and 4, the second relation is not applicable to 4.) Relational composition is an analogue of functional composition. In Libra, `X!(g o f)' has the same effect as `f(g(X))' or `(X!g)!f'.

(Functional composition is evaluated from right to left, but relational composition is evaluated from left to right. Relational composition is also similar to sequential execution in a procedural language. It is guaranteed that the second relation is applied to the result of the first, ie., in left-to-right order. This is important when input or output has to be considered.) Two operators, `else' and `but', provide the less familiar operations of `relational extension' (similar to functional extension) and `over-ride' (AB). If `A' is applicable to an input argument, `A else B' maps it to the outputs of `A', and `B' is ignored. However, if `A' is inapplicable to the argument, `A else B' maps it to the outputs of `B'. For example:

maps (X,Y) to Y only when it cannot map it to X. The expression `A but B' is equivalent to `B else A'. It is often used when it is desired to `update' part of a relation, ie., to derive a relation that is like an existing one, except in some particular, e.g:
{X->X but[2!X,1!X]}
is a relation that `updates' sequence `X' by swapping its first two elements. `X' is a sequence, and therefore a relation. The expression `1!X' applies X to 1, yielding the first element of `X'. Similarly, `2!X' applies X to 2, yielding the second element of `X'. Since the sequence `[2!X,1!X]' is applicable to 1 or 2, the first two terms of X are over-ridden. Since `[2!X,1!X]' is not applicable to 3, 4, and so on, the remaining terms of the sequence are the same as those of X. The inverse `A-1' of relation `A' (denoted in Libra by `A^-1') is such that (X,Y) is a pair in `A-1' if and only if (Y,X) is a pair in `A'. For example:
? "abc"^-1.
{(97,1); (98,2); (99,3)}
(The `^-' operator reverses the pairs of a relation, not a string; the result isn't "cba".) Asking for the inverse of a relation is to ask what inputs can produce a given output. For example, the relation:
has as its inverse the relation finds the positive and negative square roots of the perfect squares. Since Libra cannot solve equations, only generator relations can be reversed. The domain of relation R (dom R) is the set of all X such that (X,Y) is a member of R. The codomain of relation R (codom R) is the set of all Y such that (X,Y) is a member of R. For example:
? dom[5,6,7,8].
{1; 2; 3; 4}
? codom[5,6,7,8].
{5; 6; 7; 8}
There are four `restriction' operators. Each modifies a relation by restricting its arguments or values according to the members of a set.

The left restriction operator `<?' is such that `s<?r' denotes `r' restricted to arguments in `s'. It comprises those pairs in `r' whose first terms are members of `s'. For example:

? {1;3;5}<?[5,6,7,8].
{(1,5); (3,7)}
The left anti-restriction operator `<\?' is such that `s<\?r' denotes `r' restricted to arguments not in `s'. For example:
? {1;3;5}<\?[5,6,7,8].
{(2,6); (4,8)}
Left restriction is often used to apply different relations to different classes of input. For example, given the following definitions:
negative ->{X:X<0}.
positive ->{X:X>0}.
then the relation defined by:
seek_zero -> negative<?increment join positive<?decrement.
has exactly the same effect as:
seek_zero ->{X->X+1:X<0; X->X-1:X>0}.
(The restriction operators are more useful when the relations and sets involved are more complex than in this example.) A non-obvious but important use of left restriction occurs when it desired to convert a constructor into a generator. Consider the relation:
It is not possible to express this relation as a generator in Libra because it is infinite--or too big to store, at any rate. But if it is known that it will be used over a finite range of arguments, it can be evaluated, e.g:
? {1..5}<?add1.
{(1,2); (2,3); (3,4); (4,5); (5,6)}
The right restriction operator `?>' is such that `r?>s' denotes `r' restricted to values in `s'. It comprises those pairs in `r' whose second terms are members of `s'. For example:
? [5,6,7,8]?>{1;3;5}.
The right anti-restriction operator `\?>' is such that `r\?>s' denotes `r' restricted to values not in `s'. For example:
? [5,6,7,8]\?>{1;3;5}.
{(2,6); (3,7); (4,8)}
The right restriction operators are typically used in `generate and test' situations. For example:
? {1..100}?>{X:X mod 2=1}.
{1; 3; 5;... ;99}
Actually, the left and right restriction operators can be simulated by using the built-in `id' relation, which yields an identity relation on a set. `S!id' or `id(S)' yields a relation that, when applied to a member of `S', yields the same member of `S' as its argument, but yields nothing when applied to a non-member of `S'. For example:
? 5!{1;2;3}
? 1!id{1;2;3}
The left restriction `S<?R' could therefore be written as `id(S) o R', and the right restriction `R?>S' could be written as `R o id(S)'.

3.3 Sequence Operators

SEQUENCES are relations, so most set and relational operators can be applied to them. However, they have a few operators of their own.

Sequences may be concatenated, using the operator `&&', so that "ab"&&"cd" is the string "abcd".

It is simple to find the n-th element of a sequence, e.g., 2!"abcd" and "abcd"~2 both yield 98. If a string has a name, it is also possible to use functional notation:

str -> "abcd".
? str(2).
The expression `[2..3]o"abcd"' simplifies to `"bc"'--illustrating how a substring can be selected from a string. It is also straightforward to convert a single character to and from its ASCII equivalent; 1!"a"=97, and [97]="a".

Given the argument `[5,6,7,8]', the built-in function `head' returns the term `5', the function `tail' returns the sequence `[6,7,8]', the function `last' returns the term `8', and the function `front' returns the sequence `[5,6,7]'.

The built-in `sort' function takes a set as argument and yields a sequence whose terms are the elements of the set in Prolog's standard order. For example:

? {2;3;1}!sort.
The `sort' function always sorts into ascending order. However, the `<-' operation reverses a sequence, so that:
? ({2;3;1}!sort)<-.
is a combination that sorts a set into descending order. There is sometimes a need to invent identifiers for objects. The function `unique' appends a unique serial number to a string, so that for example, successive calls of `unique("Vertex_")' will return the strings "Vertex_1", "Vertex_2", and so on.

3.4 Homogeneous Relational Operators

RELATIONS that have the same argument and value types are called homogeneous. They can be applied to their own outputs, and allow the concept of closure. Closure is a very important concept in Libra, analogous to iteration or search.

The infix operator `^+' is used to apply a relation to its own output a fixed number of times; `R^+2' is equivalent to `R o R', `R^+3' is equivalent to `R o R o R', and so on (R2, R3, etc. in mathematical notation). `R^+1' (R1) is simply `R' itself, and `R^+0' (R0) yields its argument as a result, provided `R' can be applied to it.

The postfix operator `^+' forms the transitive closure of a homogeneous relation (R+ in mathematical notation). It is defined by the infinite union:

R+ = R U (R o R) U (R o R o R) ... For example:

?{1->2;2->3;3->4}^+ .
Transitive closure is best understood by thinking of the graph of the relation:

where each edge represents one of its pairs. The members of the transitive closure are the paths in the graph. For example, since there is a path in the graph from 1 to 4, the pair (1->4) is a member of the transitive closure. There is an important restriction on the relations whose closure Libra can compute: they must have acyclic graphs. Consider the cyclic graph:

Finding the transitive closure of the corresponding relation:

?{1->2;2->3;3->4;4->1}^+ .
will result in an infinite computation because the graph contains an infinite number of paths of unbounded length. Libra's implementation of transitive closure is not sophisticated enough to recognise that in exploring longer and longer paths, no new terms are added to the closure. It is entirely possible to devise an algorithm that finds the transitive closure of a cyclic relation. One way is, when exploring a path, to test whether the next vertex to be added to the path is already on it. Another way is to test whether a newly generated pair is already part of the result. Why aren't these methods built into Libra?

There are three reasons. The first is that transitive closure is often used in a simple way, where cycles obviously cannot occur:

(the Fibonnaci series). Testing whether a cycle has occurred would be an unnecessary overhead in such a situation. The second reason is that the transitive closure of a relation can be extremely large. Complex search problems can be modelled by closures. A typical search has the form:
?start!(improve^+ ?>solution).
which finds all solutions by repeatedly using `improve' to transform the initial state `start' until it satisfies `solution'. Keeping track of the states generated by `improve^+' might use up all available storage. An important part of solving a search problem is often to find a clever way of keeping track of previous states. The third reason also relates to search problems: it is usually necessary to record how a solution has been reached. This typically means storing the sequence of moves chosen by `improve'. This sequence becomes part of the input and output of `improve', so, unfortunately, it would defeat an automatic cycle detector. On each iteration around a cycle, the sequence of moves would grow longer. However, a built-in cycle detector could not distinguish the move sequence from the rest of the state, and would consider each iteration to generate a new state.

Despite these remarks, there is nothing to stop the programmer constructing more sophisticated tools for finding transitive closures, and in fact Libra uses one itself when it must evaluate the transitive closure of a relation in extensional form (rather than applying the closure of an intensional form.) Thus:

{1,1; 1,2; 2,1; 2,2}
works correctly, without being trapped. Since Libra must store the whole result anyway, it uses the partial result to check if progress is being made. The `^+' operator always applies its relation at least once. The similar `^*' operator forms the reflexive transitive closure of a relation:

R* = id U R U (R o R) U (R o R o R) ...

where `id' is the identity function on the domain of R. In terms of the graph of the relation, the reflexive transitive closure consists of all pairs of vertices linked by paths, including those of zero length. For example:

? {1->2;2->3;3->4}^* .
{(1,1); (1,2); (1,3); (1,4); (2,2); (2,3); (2,4); (3,3); (3,4); (4,4)}
(The difference between `^+' and `^*' can be likened to that between the `repeat' loop and the `while' loop in a procedural language such as Pascal.) The closure of a function can act like a simple loop:
This defines a factorial relation that uses iteration rather that recursion. It is the composition of three relations. The first maps its argument N onto the pair (N,1). The second is the reflexive transitive closure of a function that forms a product. The third relation suppresses its first argument. Some care was needed in this definition. Omitting the predicate in the second component relation:
would compute the correct result, but then continue to seek further solutions indefinitely. Omitting the zero from the pattern in the third component:
Would deliver the correct solution, but also output a trace of the steps taken to find it. Because a closure operator often needs the termination condition to be written twice, a limit (^^) operator is provided for greater convenience. Mathematically, the limit R^ of relation R yields the pairs that are in the reflexive transitive closure of R, but to whose second term R is applicable. In terms of graphs, the limit operator finds all the paths that cannot be extended further. For example:
? {1->2;2->3;3->4}^^ .
{(1,4); (2,4); (3,4); (4,4)}
The factorial function may also be defined using the postfix limit operator `^^':
It is no longer necessary to restrict the output in the third term because the second term is inapplicable to a pair of the form `(0,F)'. The limit operator applied to a function is the closest thing in Libra to a `while' loop in a procedural language. As mentioned earlier, the inverse `A-1' of relation `A' (denoted in Libra by `A^-1') is such that (X,Y) is a pair in `A-1' if and only if (Y,X) is a pair in `A'. The `^-' operator can take any non-negative exponent. The expression `A^-N' is directly equivalent to `(A^-1)^+N'; ie., it yields all paths of length N in the reversed graph of the relation. For example:
? {1->2;2->3;3->4}^-2.
{(3,1); (4,2)}
(There is no `^-' postfix closure operator analogous to `^+'.)

3.5 Higher-Order Relations

A HIGHER-ORDER relation is one whose values are other relations. The following example defines a class of functions that add a number to their arguments:
When `add' is given the argument `1':
{(A,(1 + A)}
it yields a function that adds 1 to its argument. A similar kind of higher-order relation may be used to yield a set:
Which when given an argument:
{(A:(A > 0))}
yields a filter that contains the positive integers: Among other things, such relations can be combined using restriction operators, e.g:
Higher-order relations of this form allow `currying', commonly used in functional programming. `Set reduce' (>>->) is an important operator that applies its second operand to combine all the values of its first operand. For example:
? @{1;2;3;4}>>->(+).
adds together 1, 2, 3 and 4. Each different second operand of `>>->' yields a different relation. For example:
? @{1;2;3;4}>>->(*).
multiplies together 1, 2, 3 and 4. `Set reduce' offers the neatest way to define a factorial function:
factorial->{N->@{1..N}>>-> *}.
The function maps its argument onto the set of integers from 1 to N, whose members are then generated and multiplied together. If the set being reduced contains one element, the `>>->' operator returns the element itself:
(Its first operand has only one value--a set.) If an attempt is made to reduce the empty set, the `>>->' operator is inapplicable. The definition of `factorial' just given does not deal correctly with an argument of zero, although this is easily corrected:
factorial ->{0->1; N->@{1..N}>>-> *}.
Reduction operators have an important property: they first spawn, then gather together, several threads into a single result. Reduction provides the only way in which parallel threads can interact. In the preceding example, `@(1..N)' generates N threads, which are reduced to one by `*'. The symbol for set reduction emphasises this many-to-one aspect. The values `>>->' gathers together are precisely those of its first operand.

The set reduction operator can be applied to a relation rather than a set. For example:

unit_image ->{X,R->X!R o{Y->{Y}}>>->join}.
defines the `unit image' function, which given an argument and a relation, finds the set of all values yielded by applying the relation to the argument. For example:
? unit_image(1,{1->2;1->3}).
{2; 3}
In this example, the set of values to be reduced is generated by the `!' (apply) operator. Each result obtained by applying R to X is made into a singleton set {Y}, and the resulting sets are joined to give a single result. In practice, reduction using the join of singleton sets is so common that there is a unary form of the `>>->' for precisely this purpose. Using it, the definition of `unit_image' can be given more concisely:
unit_image -> {X,R->X!R>>->}.
Libra has a built-in `image' operator, such that `S image R' yields the set of all the values that can be obtained by applying R to the members of S. `Image' could be defined as follows:
image -> {(R,S)-> @S!R>>->}.
? {1;2}image{1,3;1,4;2,5;3,6}.
The 2nd operand of `reduce' should be a homogeneous, binary, commutative operator. That is, it should be a relation of the form {(X,Y)->Z}, where X, Y and Z have the same type, and also:
op(X,Y) = op(Y,X), 
op(op(X,Y),Z) = op(X,op(Y,Z)).  
For example, the expression:
would give unpredictable results, depending on the order in which pairs of terms were subtracted. The sequence reduction operator, `>>=>', operates on sequences in a similar way. However:
? @[1,2,3,4]>>=>(-).
is guaranteed to yield `1-(2-(3-4))'. In the case of sequence reduction, the operator does not need to be commutative. One use of `>>=>' is to flatten a sequence:
ie., "abcdefxyz". Another potential use of a higher-order relation is as a `package', as follows:
stack -> {
     'Push'-> {X,S->[X]&&S};
     'Pop' -> {S->head(S),tail(S) };
     'Top' -> {S->head(S)}
where 'Push', 'Pop' and 'Top' name relations within the `stack' package. The relations can be referred to as `stack('Push')', `stack('Pop')' and `stack('Top'). Given two relations with the same domain and codomain, any binary operator can be `amplified' using the `zip' operator, `\\'.
 ? ([1,2,3],[4,5,6])\\(+).
The effect of this is to add the terms pairwise, similar to vector addition. Here is a second example:
which yields:
The `zip' operator may be defined as follows:
'\\'-> {(R,S),Op->@(dom R meet dom S)!{X->X,(X!R,X!S)!Op}}.
As a result, if an argument yields multiple values, all combinations are enumerated.
The `\\' operator may be used to define new operators, or to overload existing ones:
defines an operator that is capable of vector addition--among other things. Because of the way `\\' and the built-in `+' operator are defined, there is no ambiguity about which `+' applies in a given case; `\\' can only apply to relations, and the built-in `+' operator only applies to integers. Another nice feature is that vector `+' applies to matrices represented in either of two ways. If a matrix is represented as a function from number pairs to values, the existing definition works straightforwardly, because `\\' makes no assumptions about the domains of the relations involved. If a matrix is represented as a vector of vectors, `+' works by calling itself recursively. The precedences and associativities of all the set and relational operators are as follows:
        50      yf      ^*,^+,<-,^^
        50      fy      sets_of,seqs_of,dom,codom
        50      yfx     ^+,^-
       100      yfx     meet,o,x,<?,?>,<\?,\?>
       125      yfx     &&,join,omit,else,but
       150      yfx     !,~,\\
       150      fy      @,i
       175      yfx     >>->,>>=>,image
       175      yf      >>->
       175      fx       #

3.6 Arithmetic Operators

CURRENTLY, only integer arithmetic is supported. The arithmetic operators are `+' (addition), `-' (subtraction), `*' (multiplication), `/' (division), `^' (exponentiation), `>>' (right shift), `<<' (left shift), `/\' (bitwise and) `\/' (bitwise or), and `mod' (remainder after division), all with their usual Prolog meanings. In addition, `X min Y' yields the smaller of X and Y; and `X max Y' yields the greater of X and Y.

Arithmetic expressions are formed in the usual way. The following relation defines the successor of an integer:

Arithmetic expressions may include functional applications, of the form `fn~(Args)'. The following is yet another definition of a factorial function:
factorial-> {0->1; N->N*factorial~(N-1):N>0}.
Since all functions are relations, relational application is an alternative to functional application:
factorial-> {0->1; N->(N-1)!factorial*N:N>0}.
factorial-> {0->1; N->N*factorial(N-1):N>0}.
The precedences and associativities of the arithmetic operators are as follows:
      200       xfy      ^
      300       yfx      mod
      400       yfx      *,/,<<,>>
      500       fx       -,+
      500       yfx      +,-,/\,\/
      600       yfx      max,min

3.7 Comparison Operators

THE comparison operators `=', `<', `>', `=<', `>=', `\=' may be used to compare arithmetic expressions in the usual way. In addition, `\<' and `\>' are synonyms for `>=' and `=<' respectively.

The same comparison operators may also be used to compare strings in lexicographical order, ie., if A and B are strings, `A<B' is true if A would precede B in a dictionary. Strings are special cases of vectors, so vectors (and vectors of vectors) may be compared similarly. When these operators are applied to other sets or relations, their effects are unpredictable.

Sets may be compared using the following operators:

       Operator    Meaning
      A equal B     A and B have the same members.
      A unequal B   A and B don't have the same members.
      A subset B    Every member of A is a member of B.
      A inside B    Every member of A is a member of B,
                    but some member of B is not a member of A.
      A encloses B  Equivalent to `B inside A'.
      A includes B  Equivalent to `B subset A'.
      A disjoint B  No member of A is a member of B.
(Since sequences are relations and therefore sets, it is important to distinguish `=<' from `subset', for example.)

`A equal B' is true if and only if `A' and `B' represent the same set. `A unequal B' is true if and only if `A' and `B' represent different sets. `A subset B' is true if and only if every member of `A' is also a member of `B'. `A inside B' is true if and only if every member of `A' is also a member of `B' and `B' has at least one member that is not a member of `A'. `A includes B' is a syntactic variant of `B subset A'. `A encloses B' is a syntactic variant of `B inside A'. Finally, `A disjoint B' is true if and only if `A' and `B' share no common members.

All the comparison operators yield the boolean values 'True' and 'False'.

Sets or relations defined using variables are considered equal only if they are isomorphic; that is, if there is a one-one correspondence between their variable names. For example `{X->X+2}' and `{Y->Y+2}' are considered equal, but neither is equal to `{X->X+1+1'}. Although all three clearly define the same function in this case, a computable test for equivalence is impossible in general, and Libra does not attempt one.

The set membership operators `?' and `\?' discussed earlier are also classed as comparison operators.

The precedences and the associativities of the comparison operators are as follows:

      700    xfy    =,\=,<,>,>=,=<,\>,\<,
      700    xfx    ?,\?,..
(The range operator `..' is not a comparison operator, but has the same precedence.)

3.8 Boolean Operators

The boolean operators are `&' (and), `v' (or), and `\' (not), and `=>' (implies), and `<=>' (logically equivalent).

The expression `P&Q' yields 'True' only when both `P' and `Q' are 'True', otherwise it yields 'False'. The expression `P v Q' yields 'True' if either `P' or `Q' are 'True'; it yields 'False' when both `P' and `Q' are 'False'. The expression `\P' yields 'False' when `P' is 'True' and 'True' when `P' is 'False'.

The expression `P=>Q' yields 'False' when `P' is 'True' and `Q' is 'False', and yields 'True' otherwise. The expression `P<=>Q' yields 'True' if and only if `P' and `Q' have the same truth value.

The boolean variables 'True' and 'False' are distinct from the concepts `applicable' and `inapplicable'. A relation may be applicable to 'False', or yield the output 'False'. A relation may be inapplicable to 'True'. An inapplicable relation yields no result, not 'False'.

In the construction:

the predicate `X<8' is inapplicable if the argument `X' is not an integer. If the predicate of a relation is inapplicable, the whole relation is inapplicable. (In this context, the logical complement of `X<8' is not simply `X\>8', but actually `X\?integers v X\>8'.)

Libra allows a shorthand often used in mathematical notation: the expression `X<Y<Z' is interpreted as `X<Y&Y<Z'. This example can be generalised to any number of terms and any choice of comparison operators.

Boolean expressions are evaluated from left to right, until the result is known.

The precedences and associativities of the boolean operators are as follows:

     900        fy      \
     925        xfy     &
     950        xfy     v
     975        xfx     =>,<=>
The `succeeds' relation is a `hook' to the underlying Prolog system, which may allow access to special operating system features. Its argument is a Prolog goal, and its value is `true' is the goal succeeds, and `false' otherwise. It may be used in the following way:
unifies -> {X,Y:succeeds(X=Y)}.
? unifies(1) 1 ? unifies(1,2) false

3.9 Structural Operators and Commands

The precedences and associativities of the structure operators are as follows:
     1000       xfy     ,
     1050       xfy     ->
     1075       xfx     :
     1100       xfy     ;
The commands bind very loosely. In particular, `unary_prec' and `binary_prec' bind looser than any potential operand. Their precedences and associativities are as follows:
     1150       fx      let, find, ?, edit, use, reuse, show, dump, drop
     1150       xfx     fx, fy, xf, yf, xfx, xfy, yfx
     1175       fy      help
     1200       fx      unary_prec, binary_prec

4. Type Checking

BECAUSE operators may be overloaded, it is sometimes necessary to test the type of the argument of a relation. Types are simply sets, and any set expression can be used as a type. Types may be checked using the set membership operator, e.g., `X?integers' is true if `X' is an integer--or a character.

Some useful sets are built-in:

  • integers
  • naturals
  • positives
  • characters
  • booleans
  • literals
  • sets
  • relations
  • sequences
  • strings
  • any
A few of these sets are defined as generators, although their main use is as filters. For example, `X?integers' yields 'True' if and only if `X' is an integer. On the other hand `@integers' will (start to) generate all the integers.

`Naturals' is the set of non-negative integers, and `positives' is the set of positive integers. `Characters' is the set of integers from 0 to 255. These are all generators, as is `booleans', which generates the set `{'True'; 'False'}'.

The remaining sets are filters. The set `literals' includes all atoms beginning with a capital letter--including 'True' and 'False'. The set `sets' contains sets of all kinds, including relations, sequences and strings. `Relations' contains only those sets that consist of ordered pairs. `Sequences' are functions whose arguments range from 1 to N. `Strings' are sequences whose values are characters. `Any' denotes the universal set. `X?any' is always 'True'.

Because any valid set expression may be used to define a type, some of the built-in sets are related, for example:

sets -> sets_of any.
relations -> sets_of (any x any).
It is also possible to define new types:
integer_pairs -> integers x integers.
Finally, the set `grounded' comprises all expressions that are free of variables. This is useful because some operators can only be applied to grounded expressions, for example, it is only possible to evaluate the inverse of a relation if it is in extensional form. The set `symbolic' is the complement of `grounded'.

5. Program Execution

IT IS IMPORTANT to understand how a Libra program is executed, for two reasons: to know what ranks of relation (generator, constructor or filter) allow execution to be possible, and to be able to make some estimate of program complexity.

A Libra program is best thought of as a collection of threads of control. A thread of control carries an argument with it. When a relation is applied to the argument, each result generates a new thread of control. In the current interpreter each result thread is explored in turn by back-tracking, but it is better to consider that they are all executed in parallel; there is no way for the program to communicate between the threads, and the order in which they will be executed is unpredictable anyway. A second way in which several threads can arise is by a thread invoking the `@' (for all) operator, which generates a thread for each member of a set. Closure operators apply a relation to its own results, and can multiply the number of threads exponentially.

What mechanisms remove threads? The simplest is when a relation is inapplicable to its argument. A thread reaches the relation, but no result emerges. This is typical of a `generate and test' strategy, or pruning during a tree search. A similar effect is achieved by the restriction operators, which kill off threads according to whether they are members of a given set or not.

Reduction first multiplies threads and then reduces their number to one, or even zero. When a thread of control reaches a reduction operator, it generates a thread for each value of its first operand. The operand can either simply generate a set, or it can be a multi-valued relation applied to an argument. These threads are generated within an envelope that allows them to be collected together again and combined using the second operand of the closure operator. (The current implementation relies on Prolog's `findall' predicate.) Thus, one thread emerges to match the one that reached the construct--or, if the collection proves empty, none emerges at all.

Functional application (f~X) is a special case of reduction, where the combining operation has the form {(X,Y)->X}, ie., one result is chosen arbitrarily. In the back-tracking interpreter this is implemented by making a `cut' as soon as any thread succeeds. In a parallel implementation it could perhaps be done by selecting the first thread to complete, then killing off the slower threads.

An important aspect of control is ensuring that a program is not trapped in a loop. This can happen in one of two ways: through the use of recursion, or through using a closure operator. Recursion is discouraged in Libra (but not forbidden). (The use of recursion may imply that the programmer is thinking sequentially rather than relationally.) We have already pointed out some of the pitfalls of the closure operator; the programmer must ensure that a closure terminates without being trapped by a cycle. There is little that can be assumed about the order of execution. Libra does not guarantee that alternatives will be chosen in any particular order. For example, the expression:

? 0!{X->X+1;X->X-1}^* .
is capable of generating all the integers, but only in principle. Depth-first search might generate 0,1,2,3... or 0,-1,-2,-3... . A better strategy would be breadth-first search, generating the series 0,1,-1,2,-2,3,-3... . However, Libra does not promise to do either of these things, but might alternately add and subtract one, cycling between the values 0 and 1 indefinitely. What does Libra promise about closure? Only this: a child result cannot be generated until after its parent result has been generated. In the above example, 3 is either the child of 2 or of 4, so 3 cannot be generated until after 2 or 4 has been generated. (Since 4 cannot be generated until after 3 or 5 has been generated, it is easy to prove that 2 must be generated before 3.) This sequential property derives from the composition operator. In the expression `X!r o s', first `r' is applied to `X', then `s' is applied to the results. This is guaranteed in any implementation of Libra.

Another programming consideration is the asymmetry between the treatment of the operands of `apply' (!). The first operand is always evaluated as fully as possible, but the second is always evaluated as little as possible. Normally, the first operand will contain no variables, and is said to be `grounded'. If it does contain variables, and cannot be reduced to a grounded form, it will passed in `symbolic' form, ie., still containing variables, although it may become partially simplified. This allows a constructor or filter to be passed as an argument to a higher order relation. However, if a first operand can be reduced to a grounded form, it always will be. This means that passing an expression as a first operand to some other relation is one way to force its evaluation.

The programmer should be aware that each time any relation is applied to an argument its results are calculated afresh. If a complex relation is likely to be applied to the same argument several times, it may be worthwhile evaluating the relation first, and storing it as a set of argument-value pairs--provided it is not too large. For example:

r->{0,0; 1,1; 2,0; 3,1; 4,0; 5,1}.
? @{0..5}!(r o add1).
causes `add1' to be applied to 0 three times, and to 1 three times. This could be avoided as follows:
? {0;1}<?add1!{X->@{0..5}!(r o X)}.
which evaluates `{0;1}<?add1' to give {0,1;1,2}, which is then used as argument `X' in the second relation. Although the results are the same, the expressions `0+1' and `1+1' are only evaluated once each, then remembered. Although this is not worthwhile in this example, it can become an efficient technique when a complex relation is substituted for `+'. There are several other ways of forcing evaluation. For example, finding the inverse of a relation (R^-1), first forces the relation to be evaluated.

It is difficult to summarise the rules governing the ranks of relations needed by different operators in different modes of use. For example, if `A meet B' is used as a generator, `A' must be a generator, but `B' may be a filter. However, if `A join B' is used as a generator, both `A' and `B' must be generators. As a general rule, the first operand will need to have at least the same rank as the mode of use, so that a generator is needed to generate a set, a constructor is needed to construct a result, but only a filter is needed to test membership. The second operand may need the same or a lower rank, depending on the operator. If a relation has insufficient rank, an error will be detected during program execution. As a rule of thumb, the programmer is advised always to place the higher-ranking operator first, and hope for the best. Currently, the only alternative is to consult the source program of the interpreter.

6. Scope Rules

A PROGRAM is a set of named definitions of relations. Such a set is itself a relation--from names to expressions. In the current implementation of Libra, all definitions are global in scope. Variable names are local to the elements of a relation or set. For example, in the definition:
let change->{X->X+1;X->X-1}.
the two occurrences of X are independent. They are also independent in the following construct:
find 1!(X->X+1}o{X->X+1}.
If they were not, the argument `1' would not only bind the first relation to the pair (1,1+1), but the second relation to (1,1+1) as well, so the second relation would not be applicable to the intermediate result (ie., `2'). A relation or set that has another set as an element may cause a potential ambiguity. For example, what should the following definition do?
let neighbours -> {X->{X-1;X+1}}.
The intention seems clear. We would expect:
find neighbours(1).
{0; 2}
The rule is adopted that a variable name has scope throughout the set element in which it occurs, including any sets occurring within the element.

This means caution is needed in defining higher-order relations. For example, the intention of the following example is to define a function that will evaluate relation `R' over the domain `X', thus producing a generator:

However, in the construction `{X->{X,X!R}}' `X' is bound to the first argument of `eval', but the programmer's intention was for `X' to be bound to an element of the first argument. Libra will detect the problem and issue a warning. The problem is easily corrected:

7. Input and Output

The normal action of Libra is to `find' the values of an expression, displaying them in Libra's input format. This is adequate for ad hoc problem solving and program development, but it is not always user-friendly enough. One problem is that character strings are displayed as sequences of numbers. Although control is needed over input and output operations, they are a problem for Libra, because they have no meaning within the binary relational algebra. Consequently, input and output are mostly treated as side-effects of identity functions, ie., functions that transparently copy their arguments to their results but produce an action outside the program.

A good solution would be to treat files as relations [3]. An indexed file maps an argument--which becomes its key--onto a tuple containing one or more attributes. A sequential file is a sequence: e.g., a text file is a mapping from integers to lines, ie., strings. Alternatively a file could contain a sequence of Libra expressions in input format. The user of Libra can also be treated as a relation, one that maps a question to a response. Unfortunately, there is no easy way--at least that I can see--to implement these ideas portably in Prolog. So for the present, Libra input-output is, frankly, a kludge.

A second problem with input-output in Libra is that, in principle at least, different threads of execution can proceed in parallel. This means that input and output from different threads could become hopelessly inter-twined. For example, two threads could attempt to ask the user a question, but both could complete their output before either got the user's reply. How will the user know which question to answer first? Even in a back-tracking implementation, the order of execution of different threads is unpredictable.

The way this matter is currently resolved is that Libra currently allows only one input and one output file to be in use at one time. They must be assigned using `see' and `tell', in a similar way to Prolog--even to converse with the user. If no output file has been assigned by `tell' no output is possible. Likewise, if no input file has been assigned by `see' no input is possible. Files are released using `seen' and `told'.

In a parallel implementation of Libra, an attempt to `see' an input file or `tell' an output file when one is already is use would cause the requesting thread to wait until the file was released. In the current back-tracking implementation, such an attempt could only happen because another thread has failed to release the file--and never will. This is a programming error. The error is reported, and the attempt fails.

Input-output operations within a thread will occur in a predictable order if they are linked by composition operators. The limit (^^) of a function (which is equivalent to a series of compositions) will also generate an ordered sequence of operations, similar to a while loop. Likewise, the construct `A else B' guarantees to test the applicability of `A' before attempting to apply `B'. Composition, limit and `else' are the only constructs that can will safely guarantee the sequence of input-output operations.

If a thread that has acquired a file branches non-deterministically, Libra does not forbid its branches to read input or write output. Although the order in which branches will transfer data is unpredictable, this might not matter a great deal, and may even be useful during debugging. However, it is not recommended. It is usually better for each thread to acquire the file and release it as needed, or to work within a single thread.

The `see' function takes a string as its argument, and yields an identity function as its result. It has the side-effect of opening for input the file whose name is given by the string. This odd-seeming specification has the effect of making `see(X)' transparent. For example;

find 1!see("user").
makes input from the user's keyboard possible. (The expression `see("user")' yields the identity function, which is applied to `1'.) Similarly, the `tell' function also takes a string as its argument, and yields an identity function as its result. It has the side-effect of opening for output the file whose name is given by the string. For example;
? 1!tell("user").
makes output to the user's display possible. The `speak' function is an identity function that has the two side-effects of opening the user's keyboard for input and opening the user's display for output.

The `seen' relation is an identity function that releases the current input file. The `told' relation is an identity function that releases the current output file. The `spoken' relation is an identity function that releases user's keyboard and display.

The `write' function writes its argument (to the current output file) in input format (the same format as `find'), and yields its argument as a result. For example:

? "abc"!(speak o write o spoken).
(There is no line-feed; `find' displays the result again.) The `debug' function is similar to `write' but always sends its output to `user', preceding it with `debug: ', and following it with a new line.

The `put' function expects a string for its argument, which it writes (to the current output file) as a series of characters. It yields an identity function as a result. For example:

? "abc"!(speak o put("The answer is ") o put o spoken).
The answer is abc{(A,A)}
(Since `speak' is an identity function, and the first `put' yields an identity function, the second `put' operates on the string "abc", yielding an identity function that is copied by `spoken'.) The `nl' function is an identity function that writes (to the current output file) a new-line as a side-effect. For example:
find "abc"!(speak o put o nl o spoken).
Because the output of `find' can be a nuisance, the `null' function may be used to suppress it. The `null' function yields a special result that cannot be generated in any other way, which is recognised by `find', and ignored. For example:
? "Type a number: "!(speak o put o nl o spoken o null).
Type a number:
yields no visible result. (However, `null' is not inapplicable. This matters; e.g., if `null' is used within the first operand of an `else' operator.) The relation `read' reads (from the current input file) a Libra expression terminated by a period (`.'), possibly extending over several lines. Its argument is ignored, and it yields the expression read. At the end of file, `read' is inapplicable.

The relation `get' reads one line from the current input file. Its argument is ignored, and it yields the line read as a string of characters. The last line of the file may be terminated by either an end of line or an end of file. There is no way for Libra to tell which. At the end of file, `get' is inapplicable.

Input and output operations involving the user may be combined to form a dialogue:

? "Type a number: "!(speak o put o get o spoken).
Type a number: -1
If the argument of the conversion function `str_to_int' is a string representing an integer, it returns the integer, otherwise it is inapplicable. For example:
This could obviously be used as follows:
? "Type a number: "!(speak o put o get o spoken o str_to_int).
Type a number: -1
The function `int_to_str' is the inverse of `str_to_int', which makes it useful for output:
? -1!(int_to_str o speak o put o nl o spoken).

8. Some Example Programs

This section explains the development of three example programs. Accordingly, the programs are presented piecemeal. The full texts of the second and third examples are given in Appendix B.

8.1 Copying a File

The first example program is just a `one-liner', which, like most one-liners, is very tricky. Here it is:
copy_file -> {In,Out->[] ! see(In)
                         o tell(Out)
                         o (get o put o nl)^^
                         o told
                         o seen}.
The intention of the definition is that a command of the form:
? copy_file("input","output").
will copy the text file "input" to the file "output". (Setting the output to "user" will display the file.) Basically, the definition is a composition. Because it is a composition, everything happens in sequence from left to right. First, `see(In)' assigns the input channel to `In'. Second, `tell(Out)' assigns the output channel to `Out', Third, a limit construct (^^) does the actual copying. Fourth, `told' releases the output channel. Fifth and last, `seen' releases the input channel.

What about the limit construct? This comprises three steps: `get' reads the next line of input as a string, `put' writes it, and `nl' writes a new line. Because of the limit operator, these three steps are repeated until they cannot be applied any more. This happens when `get' attempts to read past the end of file (`get' is inapplicable). So much for the definition as a procedure. What about its behaviour as a relation? Why does `find' display `{(A,A)}'? The expression `see(In)' yields an identity function, as does `tell(Out)'. So the result of `[]!see(In) o tell(Out)' is simply `[]'. The `get' relation yields a string irrespective of its argument. Its result becomes the argument of the `put' function, which yields an identity function as a result. Since `nl' is transparent, the `get o put o nl' composition yields an identity function. The limit construct as a whole yields the value of the last successful iteration, which is an identity function. The `told' and `seen' relations are also transparent, so that is the result of `copy_file' as a whole. An exception arises if the file is empty, when the original argument to `see' is copied transparently. It really doesn't matter what the argument of `see' is, as long as it has one. To see why, consider the sub-expression `see(In) o tell(Out)'. This is equivalent to `{X->X}o{X->X}', because each term yields an identity function. Libra can apply this composition to any given argument. For example, the first relation may be applied to `1', because `1' matches the pattern `X'. The intermediate result is `1', and in a similar way, the second relation also yields `1'. However, Libra is not smart enough to realise that the composition of two identity functions is also an identity function--or even that `{X->X}' is an identity function. It can only evaluate a composition explicitly if the first relation is a generator. It therefore attempts to treat `{X->X}' as a generator, which yields the term `(X,X)'. Because `(X,X)' contains variables, but is not a set or relation, Libra recognises that it is in trouble, and issues a warning message.

8.2 Corn, Chicken, Dog

As the second example of program development we consider a simple planning problem, of the kind that is easily solved in Prolog.

A farmer has with him a sack of corn, a chicken, and a rather vicious dog. He reaches a river, which he must cross in a small boat. The boat has only space enough for the farmer and one more thing. He must therefore ferry the corn, chicken and dog from the left bank to the right bank of the river one at a time. The problem is that he cannot leave the dog alone with the chicken, for it will certainly eat it, nor can he trust the chicken alone with the corn. How can he ferry them all across safely?

We may sketch a solution immediately. Starting with an initial state in which the farmer, corn, chicken and dog are all on the left bank, the program must choose a sequence of moves that result in all four being moved to the right bank. This suggests the following:

solution -> initial_state ! safe_move^+ = final_state.
We assume that `safe_move' is a relation that transforms a state into a new state, such that the new state is `safe', ie., nothing gets eaten. `Safe_move' is a relation rather than a function, because we expect that several moves are possible from any given state. Notice how the closure operator will compose all possible sequences of choices to implement a search--provided we are careful to make sure it doesn't get trapped in a loop. This solution has a fatal flaw: it will yield `True' if the problem has a solution, or it will be inapplicable if it doesn't. To be useful, the program should yield a plan showing how the job should be done. The plan could be either a sequence of moves, or a sequence of states, or perhaps both. In the solution given here, the plan will be a sequence of states, from which it is easy to deduce the moves.

A second problem is that the closure operator allows the program to be trapped in a cycle. For example, the farmer could ferry the chicken back and forth across the river indefinitely. The way to avoid such cycles is to make sure that each new state added to the plan is not already part of it. This is why making the plan a sequence of states rather than a sequence of moves is the better choice. A good solution should not pass through the same state twice, but it might need to make the same move several times.

The solution should therefore have the form:

solution -> [initial_state] ! add_to_plan^+ ?>solved.
Here, the plan initially consists of a single term: the initial state. We include the initial state in the plan because we want the program to check that it doesn't return to it. Since the test for completion is no longer a simple equality, we use right restriction to make sure the finished plan is in the set `solved'. We may now elaborate `initial_state':
initial_state -> (everything,{}).
everything -> {'Farmer';'Dog';'Corn';'Chicken'}.
We represent a state as a pair. The first member of the pair is the set of things on the left bank of the river, and its second member is the set of things on the right bank. We need to keep to track of the position of the farmer, but it is not necessary to worry about the boat; where the farmer goes, the boat goes too. The problem is solved when the last term of the plan is the desired final state:
solved -> {Plan:last(Plan)=final_state}.
final_state -> ({},everything).
States may be added to the plan using a generate and test strategy:
add_to_plan -> suggest o verify.
where `suggest' first generates a possible new state, then `verify' ensures that it is not already in the plan. The argument of `suggest' is obviously the existing plan, and its output should include the new state, but in addition it needs to copy the existing plan--otherwise `verify' could not check the new state or add it to the plan.
suggest -> {Plan->Plan,last(Plan)!cross_river}.
This definition extracts the existing latest state from the plan, and uses `cross_river' (explained shortly) to generate new states. `Verify' is straightforward, provided that we remember that the members of the plan are not states, but (integer, state) pairs. The set of states already visited is the codomain of the plan:
verify->{Plan,State->Plan&&[State]:State\?codom Plan}.
This appends the sequence consisting of the new state to the existing plan. (The `&&' operator joins two sequences, not a sequence and a term.) We now must make a short digression to explain why we didn't write:
add_to_plan -> {Plan -> Plan && [last(Plan)!cross_river]
                       :last(Plan)!cross_river\?codom Plan}.
The expression `last(Plan)!cross_river' generates a new state. We add it to the plan, provided that it is a new one. Unfortunately, the expression appears twice, and `cross_river' being a relation, there is no guarantee that it will yield the same value in each place. (In fact, it typically won't.) Therefore the state being added to the plan is not necessarily the one that proved to be a new one. The correct two step approach given earlier ensures that the same new state is used in both places. The `cross_river' relation operates on states rather than on plans. Crossing can occur from left to right or from right to left, which need different treatment:
cross_river -> left_to_right join right_to_left.
(Since the two cases are mutually exclusive, we could have equally well written `else' in place of `join'.) Crossing from left to right occurs only when the farmer is on the left bank, which is why we check the precondition of `ferry_object' to ensure it is in the set `farmer_on_left':
left_to_right -> farmer_on_left<?ferry_object\?>unsafe.
Similarly, we use right anti-restriction to ensure the post-condition is not unsafe, e.g., leaving the dog alone with the chicken. (An alternative would have been a right-restriction to `safe'.) `Farmer_on_left' is a set, which may be defined as a filter, because it is always given a known state:
farmer_on_left -> {Left,Right:'Farmer'?Left}.
(A state is a (left bank, right bank) pair. We want to check that the farmer is a member of the set of things on the left bank.) Ferrying an object from left to right is a two step process consisting of first choosing each object on the left bank in turn, then moving it from the left bank to the right. As with `add_to_plan', a two step approach guarantees that the object removed from the left bank is the same object added to the right bank. Whatever object is chosen, the farmer must go with it:
ferry_object -> {Left,Right->Left,Right,@Left}
               o {Left,Right,Choice-> Left omit{Choice}omit{'Farmer'},
                                      Right join{Choice}join{'Farmer'}}.
A significant point here is that since the farmer is on the left bank, the farmer can be chosen to move. If this happens, the farmer is removed from the left bank twice, and added to the right bank twice, but since we are dealing with sets, this won't matter. When the farmer is the chosen object, this models his crossing the river alone. We now need to define the set of unsafe states. Assuming the right bank was left in a safe state by the previous move and will certainly be safe once the farmer reaches it, we only need to concern ourselves with the safety of the left bank. There are three unsafe situations, when dog is left with the chicken, when the chicken is left with the corn, or when all three are left together. An interesting way to express this is as follows:
unsafe -> {Left,Right: Left includes{'Chicken';'Corn'}
                     v Left includes{'Dog';'Chicken'}}.
That is to say, the unsafe states are those where the set of things on the left bank includes the dog and the chicken or the chicken and the corn. This completes the strategy for moving things from left to right. We now have a choice, write an analogous `left_to_right' relation with left and right interchanged, or get the program to exchange left and right:
swap -> {Left,Right->Right,Left}.
Armed with which, the `right_to_left' relation is easy to write:
right_to_left -> swap o left_to_right o swap.
That is the problem solved. However, the output generated by:
? solution.
does not look especially elegant, and it would be far easier to read if each state were displayed on a new line. We therefore modify `solution' as follows:
solution -> [initial_state] ! add_to_plan^+ ?>solved
                            o display_seq o null.
where `null' is used to suppress the normal output of `find'. The `display_seq' relation has to be written with care, since we have to be sure of the order of execution. The only really safe operators are composition, which guarantees left to right execution, and limit (^^) of a function, which is similar to a loop in a procedural language:
display_seq -> speak
               o ({X->head(X)!write o nl,tail(X)} o {H,T->T})^^
               o nl o spoken.
`Display_seq' begins by opening the "user" file. (`tell("user")' would have been just as good as `speak' here), and finishes by displaying a new line and releasing the "user" file. The interesting part is the limit construct. Its argument is the sequence to be displayed. Its body is the composition of two relations. The first relation generates a pair of values, the head and tail of the sequence. `Write' and `nl' display the head of the sequence and a new line as a side-effect. The second relation discards the head of the sequence, so the overall effect of the composition is simply to find the tail of the sequence. Because of the limit operator, the composition is applied until the sequence becomes empty, whereupon the `head' and `tail' functions become inapplicable, so the limit (the empty sequence) has been found. Two warnings are in order here. It would not do at all to replace the `^^' (limit) operator by the `^*' (closure) operator, because although the same sequence of `write' operations would occur, the construct would yield the resulting sequence as an output after every step. The result would be that the "user" file would be released after the first step, and the second `write' would fail. The second warning is that `display_seq' cannot be simplified as follows:
display_seq -> speak o{X->head(X)!write o nl o tail(X)}^^ o nl o spoken.
This looks plausible because it seems that the limit construct should yield `tail(X)' as before. To see why this won't work, remember that `write' and `nl' are transparent, so that the program (less output) is equivalent to:
display_seq -> {X->head(X)!tail(X)}^^ .
The expression `tail(X)' yields a sequence--a relation from integers to pairs of states. The program tries to apply it to `head(X)', which is a pair of states, and finds that it is inapplicable.

8.3 Word Frequencies

The third example uses relations as data. The problem is, given a character string, to display a list of the words it contains, and the number of times each occurs in the text.

There are several subproblems here: to find the words in the text, to count their frequencies, and to display the results prettily. It is easiest to approach the solution by concentrating on the second sub-problem first. Suppose that we have discovered the list of words in the text; e.g., given the string `"to be or not to be"', we have already derived the sequence `["to","be","or","not","to","be"]'. (It is important to form a sequence, not a set--a set would contain each word only once.)

The inverse of the sequence looks like this:

{"be",2; "be",6; "not",4; "or",3; "to",1; "to",5}
The frequency of a word, e.g., `be', is the size of its image, e.g., `{2;6}'. It would therefore be useful to have a function that could find the size of the image of an argument in a relation. This is easily constructed from Libra's built-in `#' (size) and `image' functions:
image_size->{X,R-> #({X}image R)}.
To create a frequency table, it is merely necessary to apply `image_size' to the arguments in the domain of the inverted word list, and collect the results into a set:
frequencies -> {WordList ->
                WordList^-1 !{Inv-> @dom Inv !{X->{X,image_size(X,Inv)}}>>->}
The expression `WordList^-1' finds the inverse of `WordList', to which it applies a reduction construct. The operand of `>>->' generates each member of the domain of `WordList^-1'--which is the set of words--and then maps each word to a set containing a pair of values: the word itself, and the size of its image. The postfix `>>->' operator collects all these unit sets into a single set.
? ["to","be","or","not","to","be"]!frequencies.
{"be",2; "not",1; "or",1; "to",2}
(Actually, the output of `find' is not so pretty as this; "be" is displayed as `[98,101]' for example. But for the reader's benefit, this problem will be overlooked.) Having solved the nub of the problem, let us consider how to convert a string into a list of words. Although we could scan the string from left to right using a simple two-state machine to detect the boundaries between words, this is an inherently sequential method. It is better--though more challenging--to devise a parallel method.

Recalling that the composition:

 ? [10..12]o"to be or not to be".
will extract a substring, to find a list of words, it is first necessary to find a list of their positions. We are aiming to define `letter_positions' and `word_positions' so that they work together as follows:
?"to be or not to be"!letter_positions.
The method we adopt is to find a list of the positions of first letters of word, and of the last letters of words, then thread them together.

How do we discover the initial and final positions of the letters? An initial letter is one that does not have a letter on its left. A final letter is one that does not have a letter on its right:

initial_positions -> {Posns->(@Posns!{X->X:X-1\?Posns}>>->)!sort}.
final_positions -> {Posns->(@Posns!{X->X:X+1\?Posns}>>->)!sort}.
where `Posns' is the set of positions of all letters in the original text:
? {1;2;4;5;7;8;10;11;12;14;15;17;18}!initial_positions.
? {1;2;4;5;7;8;10;11;12;14;15;17;18}!final_positions.
How do we thread the two sequences together?
word_positions -> {LetterPositions ->
which uses the `\\' (zip) operator to combine corresponding initial and final positions with the `[M..N]' construct.
? {1;2;4;5;7;8;10;11;12;14;15;17;18}!word_positions.
The next thing to do is to define which characters are part of words, and which are spaces between words. The decision is somewhat arbitrary:
letters -> {1!"A"..1!"Z"}join{1!"a"..1!"z"}join{1!"0"..1!"9"}.
(Libra syntax does not allow:
letters -> {1!"A"..1!"Z";1!"a"..1!"z";!"0"..1!"9"}.
There is no generalised `M..N' construct, only `{M..N}' or `[M..N]').

Finally, we need to find the letter positions. This can be done by taking the original text--which is a function from positions to characters, and right-restricting it to the set of letters. The domain of this restricted function is the set of letter positions:

letter_positions -> {Text->dom(Text?>letters)}.
?"to be or not to be"!letter_positions.
which operates as promised earlier. Composing this with `word_positions' gives the following:
 ? "to be or not to be"!letter_positions!word_positions.
The final step is to map the sequences of positions onto sequences of letters. It is tempting to believe that it is merely necessary to apply the original text to the result--after all, the text is a mapping from positions to letters. However, this is not quite right, because the codomain of the sequence above is the set of sequences of positions, but the domain of the text is individual positions. What we need to do is to compose the elements of its codomain with the text. This is an example of a general problem, which we can call `compose_codom':
compose_codom->{R1,R2->(@R1)!{X,Y->X,Y o R2}>>->}.
This higher-order relation takes two other relations as arguments. It generates all (X,Y) pairs of the relation `R1', mapping each pair onto a singleton set containing the same value of X, but with Y composed with the relation `R2'. These singleton sets are then joined into a new relation. In terms of this particular problem, `R1' will be the sequence of position sequences above, and `R2' will be the text. Each position sequence will therefore be mapped onto a letter sequence. Putting it all together, we get:
word_list-> {Text->
             compose_codom(Text!letter_positions!word_positions, Text)}.
?"to be or not to be"!word_list.
a surprisingly painful process. (The author is by no means convinced that this is the best way to solve this problem.) In practice, such a program would read the text string from a file. A simple adaptation of the file copying program in Section 8.1 will do the job:
file_to_str -> {F->[]!see(F)o ({Str->Str&&" "&&get(0)})^^ o seen}.
? file_to_str("hamlet").
" to be or not to be"
Each iteration of the limit construct appends a space and a line of text to `Str'. The value of `Str' before the first iteration is the empty sequence, which is passed transparently through `see(F)'. The argument of `get' doesn't matter; the program could just as well say `get(Str)', `get([])', or whatever. It is nice to be able supply the file name interactively. The `ask' relation asks the user a question, and yields the answer:
ask -> speak o put o get o spoken.
This can then be tailored to ask for a file name:
ask_file_name -> ask("Please type the file name: ").
? file_to_str(ask_file_name)!word_list.
Please type the file name: hamlet
We have already defined the `frequencies' relation, which will analyse the word list. However, for better presentation, we want to display the words in decreasing order of frequency. To do this, we need to sort the frequency table by frequency, then reverse the resulting sequence. However, since the frequency table contains (word, frequency) pairs, `sort' would put into alphabetical order. Before sorting it, each pair must be reversed, which is easily done by finding the inverse of the table. We may therefore sketch the analysis program as follows:
analysis -> ((file_to_str(ask_file_name)!word_list!frequencies<)^-1!sort)<- .
? analysis.
Please type the file name: hamlet
[2,"be"; 2,"to";1,"or";1,"not"]
Bearing in mind that `find' actually displays the words as lists of integers, the output should be cleaned up, and the output from `find' should be suppressed:
analysis -> ((file_to_str(ask_file_name)!word_list!frequencies)^-1!sort)<-
            ! display_list o null.
The `display_list' relation, which displays the word list, is similar in structure to `display_seq' in Section 8.2.
display_list-> {WordList->
               WordList ! speak
                        o ({L->display_line(head(L)),tail(L)}o{H,T->T})^^
                        o spoken}.
which reduces the problem to how to display each line. Given that the terms of the inverted frequency table are (frequency, word) pairs, the following relation will do the job:
display_line-> {Freq,Word->
                [] ! put(justify(int_to_str(Freq),6))
                   o put(" ") o put(Word) o nl}.
where `justify' will right justify a string:
justify-> {String,Width->String!{Str->" "&&Str: #Str<Width}^^}.
This relation is a good example of using a global variable. The `justify' relation finds the limit of an inner relation that operates on `Str' by adding leading spaces until it has the width given by `Width'. `Width' is a constant global variable as far as the inner relation is concerned. If `Width' had to be passed as a parameter, the inner relation would have to operate on (string,integer) pairs, and its final output would need be simplified by another relation.

It is important to distinguish `Str' from `String'. Suppose the relation had read:

justify-> {Str,Width->Str!{Str->" "&&Str: #Str<Width}^^}.
Then a call of the form:
? justify("12",6).
would bind `justify' as follows:
justify->{"12",6->"12"!{"12"->" "&&"12": #"12"<6}^^}.
making the inner relation quite useless. Such a definition would trigger Libra to issue a warning, because `Str' appears on the left of `->'. On the other hand, `Width' is merely used in a predicate, so it would not cause a warning.

9. Implementation

LIBRA is implemented as a Prolog program. Because it was developed using Open Prolog [1], and the Open Prolog editor does not support files exceeding 32K bytes--and because the Libra Source program has a lot of comments--it consists of four large files, and a small one to link them together. The four large files are called `parser', `commands', `built_in' and `apply', and the small one is called `libra'. Consulting `libra' causes the other four files to be consulted, and activates Libra's command loop. When the command loop is active, the user may issue any of the commands explained in Section 2.2.

Actually, it doesn't matter much whether the command loop is active or not, because Libra commands are also valid Prolog goals. They are read in by `read', which means that they are automatically parsed, whether the command loop is active or not.

When the command loop is inactive the user may type any Prolog goal, which proved useful when debugging Libra. The only drawback is that the word `let' has to be typed before a definition, otherwise Prolog will try to interpret `->' as `if ... then'--without much success as a rule. If it is necessary to abort execution of a Libra command, the command loop will become inactive, and must be restarted by typing `libra.'

Definitions created by `let' and by operator declarations (`fx', etc.) are stored as Prolog facts. However, before storage, they are first preprocessed. Preprocessing serves three main functions, it converts sets to internal format, it enforces scope rules--including warning the programmer about suspected abuses, and it gets rid of syntactic variants; for example `X<Y<Z' becomes `X<Y & Y<Z'. The same preprocessing is used by `find' to convert its goal to internal form. Currently, the weakest part of this is that the internal form of a set is an ordered list, which is totally inefficient but was the easiest route to a prototype. In particular, testing for membership or applying a relation to an argument implies scanning the lists from left to right. There is no fast access to individual elements. As a result, the complexity of one relational composition is N2, the complexity of two compositions is N3, and so on, so Libra programs soon become computationally complex.

The `parser' file contains the operator declarations that define Libra's syntax, and the preprocessor. The rest of the command interpreter is in the `commands' file, including a pretty printer used by `show' that converts internal form back to external form. It cannot restore the original text because Prolog's `read' predicate loses the original names of variables, and ignores comments totally. However, it does display definitions in canonical form, making it clear exactly how they were parsed. Theoretically, saving a program with `dump', which uses the same format, then reading it in again with `use', should completely restore it. The commands file also contains the `simplify' predicate, which is responsible for simplifying expressions, ie., executing programs. It is pretty simple however, because all the real work is done by `built_in' or `apply'.

The `built_in' file implements all Libra's built-in definitions. To make execution as lazy as possible, each operator is responsible for deciding when to evaluate its operands. However, `built_in' only implements one mode of using relational operators: full evaluation. When a relational expression is applied to an argument, it is handled by one of three predicates in the `apply' file: `generate', `construct', or `filter'. In contrast to `built_in', these predicates expect their arguments to have been fully evaluated. Which one is called depends on how the argument is used. Generating a set or relation (using `@', etc.) calls `generate', applying a relation to an argument (using `!', etc.) calls `construct', and testing set membership (using `?', etc.) calls `filter'. For any given operator, these three predicates are closely related, so their rules are grouped in threes, by operator. There is a lot of mutual recursion between them, to ensure relations are evaluated as lazily as possible. The Prolog rules for these operators are mostly straightforward, so it is easy to check the ranks of relations involved. For example, the rules for composition are as follows:

generate(R1 o R2,(X,Z)) :- !,generate(R1,(X,Y)),construct(Y,R2,Z).
filter((X,Z),(R1 o R2)) :- construct(X,R1,Y),filter((Y,Z),R2),!.
filter((_,_),(_ o _)) :- !,fail.
construct(X,(R1 o R2),Z) :- !,construct(X,R1,Y),construct(Y,R2,Z).
The rule for `generate' requires the first relation to be a generator, and the second to be at least a constructor. The rules for `filter' require the first relation to be at least a constructor, but the second needs only to be a filter. The rules for `construct' require both relations to be at least constructors. When `built_in' has to fully evaluate a relational expression, it usually calls `generate' to do the work, collecting its results using Prolog's `findall', and finally sorting them to comply with the rule that sets are represented by ordered lists. The exceptions occur where there is a simple method that will do the job faster. The most important of these exceptions is finding the transitive closure of an explicit relation, which is guaranteed to terminate even if the relation is cyclic.

10. Discussion

WHAT, if anything, has been learnt from implementing an interpreter for Libra?

10.1 Type-Checking and Inapplicability

The first lesson--which came as a surprise to the author--is that although the relational programs developed as examples are quite short, they were surprisingly tricky to get right. Some of the difficulties arose because the interpreter was under development, some were due to the author's unfamiliarity with relational programming, and some due to the design of Libra itself. The main problem in the design is the notion of inapplicability. If a relation is presented with an argument to which it does not apply, it simply produces no result. It is fundamental to relational programming that this should occur, but it means that if a programming error leads to the wrong type of argument being passed, then the program as a whole typically produces no result. It is not easy to deduce exactly where the problem lies from an empty output.

There would seem to be several solutions to this problem. They all involve some kind of type-checking. Some checking is already done by Libra's built-in relations. If the wrong kind of argument is passed to one of them, a warning results. However, if the relation is overloaded by the programmer's own definition, Libra turns off its warning, because it cannot tell which definition is meant to be effective in any given case. Even if the programmer's definition proves inapplicable, Libra can't know whether it amounts to an error.

Although the definition of a relation can test the type of its arguments, this only serves to restrict it further. Arguments that do not pass the type test are inapplicable, so the situation is unchanged. It would be possible for a programmer to define a relation so that if its argument had the wrong type, it wrote an error message. However, this is not a modular solution, because overloading the relation to accept a different type of argument (perhaps by a separate package) would require the error condition in the first definition to be modified.

One solution, that adopted by Drusilla, is to infer data types statically from the types of operators, and in turn, to deduce the types of programmer defined relations. In its basic form, this means that the built-in operators cannot be overloaded. However, it is possible to imagine an extension of this idea to allow for overloading. Indeed, Drusilla does precisely this in treating generators, constructors and filters as different types. It seems to the author that allowing overloading is the only logical decision in a language that allows relations to have multiple results.

The author's personal view is that static type-checking is unduly restrictive. Although it can detect many programming errors, it is just one of many checks that could be made. Programs can fail because of division by zero or other arithmetic errors, or by entering an infinite loop or infinite recursion. However, nobody suggests that we should insist that all such errors should be detectable statically. Indeed, if we devise a language in which all programs are guaranteed to terminate, we know it is not Turing-complete, and less than universal. I believe the same kind of objection applies, in a less obvious way, to languages that are statically type-checkable. What the exact problem is, I don't know, but I believe it is the reason why many artificial intelligence languages avoid static type-checking, and why systems programming languages need to have type loop-holes.

One way to solve these problems would be to make a distinction between `inapplicability' and `no result'. If no definition is applicable to an argument, this could be treated as a programming error; or, the definition could remain applicable, but deliver no result. An easy way to implement this would be to extend the use of Libra's `null' relation to deliver no result as a general facility, not merely in connection with `find'.

10.2 The Domain Problem

This leads naturally to a related problem: the question of what defines the domain of a relation. A relation is a mapping from its domain to its codomain. In mathematics, this presents no particular difficulty because a relation is a static object. In a programming language, there are three possibilities for the domain or codomain: their types, their potential values, or their actual values. The distinctions are that the type of a domain or codomain might be the integers, its potential values might be the prime numbers, and its actual values for some given state of execution might be 2, 3 and 5. The `Z' specification language [8] distinguishes two of these cases: the values of a domain are chosen from a `source' type--and the values of a codomain from a `target' type. It might be argued that the set of potential values and the type of an object are the same thing. This is true if the language has a flexible enough way of specifying types. Libra treats types as ordinary sets, so--depending on one's point of view--it either offers a very flexible means of type definition, or none at all. However, in most statically typed languages the means for defining types isn't flexible enough to allow `prime numbers' to be declared as a type.

We have already discussed one situation where these distinctions cause a problem: Libra can't tell when a relation is meant to apply to a given argument. If Libra could determine that an argument was in the domain of the relation but that the relation did not apply, it could consider this a normal result; but if it could determine that the argument wasn't in the domain of the relation--or in the case of overloading, in the domain of any definition of a name--it could consider it an error.

A second aspect of the domain problem arises in finding the reflexive transitive closure (^*) of a relation. Here, it is best to visualise the graph of the relation. The graph contains a number of edges, and the reflexive transitive closure consists of all paths of length zero or more. In addition to the paths of length one or more found by transitive closure (^+), Libra also adds to the closure a loop linking each vertex to itself. The set of vertices is found from the edges of the graph: each edge must leave and enter a vertex of the graph. However, it is possible for a graph to contain isolated vertices. Libra has no way of knowing what these are. The programmer may overcome this shortcoming by defining a graph as a triple: a domain and codomain representing its vertices, and a relation representing its edges. (The domain and codomain would be the same set for a homogeneous graph.) The programmer would be able to construct the loops on its vertices using Libra's built-in `id' function, thus finding its correct reflexive transitive closure.

A related problem occurs in using the limit operator (^^). The limit operator repeatedly applies a relation to an argument until it is no longer applicable. The question is: if the relation is inapplicable to the original argument, is the original argument the limit, or has an error occurred? The answer depends on whether the argument is in the domain of the relation, but Libra cannot tell.

A solution is to force each relation to specify its domain and codomain. Perhaps this might be written as follows:

(+)->{(X,Y) -> (X,Y)\\(+)}::relations x relations -> relations.
The point however is that any set of values could be specified in place of `relations', so that the domain and codomain could be arbitrary sets. One aspect of this is that the existing classification of relations as generators, constructors and filters would change, for example, a constructor whose domain was a generator would become a generator, because all the pairs in the relation could be generated by applying the constructor to the values generated by the domain. Naturally, any solution to the domain problem would also solve the applicability problem of Section 10.1.

10.3 Bags or Sets?

When Libra evaluates a relational expression or applies a complex relation to an argument, it is possible for it to generate the same result several times. When these results are collected or considered as a whole, they form a `bag' rather than a set. They can only be made into a set by removing duplicates. The time-complexities of Libra algorithms are often determined by the sizes of their results--therefore understanding results as bags is needed in order to estimate complexity. Why does Libra support set operations but not bag operations?

This matter received much thought in the design of Hydra [9]. One argument against using bags as a basic data structure is that storing duplicate results isn't usually a good idea. Forming a set of results using reduction is a useful way of saving space and reducing complexity. On the other hand, bags can already be represented within Libra as functions from elements to natural numbers. Such a representation would be no less efficient than any used for sets.

A corollary to using bags is to replace relations by multi-relations, in which the same pair can appear as often as desired. The problem with this is that multi-relations are equivalent to matrices of natural numbers, so why not matrices of integers or reals? In other words, the resulting language would be based on matrices. There is no real problem with this, but developing a language based on bags is really a separate issue, and hasn't been seriously considered here.

The argument that bags are needed to understand aspects of Libra is specious anyway. Many computer languages are interpreted using a run-time stack. That doesn't mean that stacks have to be a basic data structure within the languages.

10.4 Referential Transparency

In the implementation of Libra, a decision was made to evaluate the arguments of relations--using `simplify'--before applying them, except for the built-in relations, which try to achieve the same effect more lazily. Originally, this was motivated by an analogy with procedural languages, which typically require their input parameters to be fully evaluated, but which evaluate conditional constructs lazily. On reflection, it is seen to be an arbitrary decision. For example:
defines an `and' function exactly like the built-in and relation, except that it always evaluates both operands. It would clearly be an advantage if no operand was ever evaluated until its value was needed. It would seem to follow that arguments should be passed by reference rather than by value, or, in terms of the implementation, `simplify' should not be used until necessary. Such a change would have an important consequence. Consider the definition:
which is intended to double the value of its argument. Under the existing implementation of Libra, this operates as follows:
? @{1;2}!double.
However, if `simplify' wasn't called until X was evaluated, the effect would be to evaluate:
? @{1;2}+@{1;2}.
This is certainly referential transparency, but is it sensible? In Section 8.2, we explained exactly why two instances of the same variable should be given the same value in connection with the relation `add_to_plan'. In that example, we wanted the program to make sure that the move it added to its plan was the same move that it had just tested for safety. But the converse is that the same rule destroys the notion of referential transparency, at least as it is usually understood. It is possible to argue--as some functional programming languages do--that variables are an unnecessary concept, and should be eliminated from the language. But to the author, this would be simply sweeping the problem under the carpet. To achieve the effect of `double', it would be necessary to have some way of duplicating an argument. Suppose that there were a built-in relation called `replicate' as follows:
Then we could define `double' as:
double->replicate o (+).
without using variables. But how are we to know that the two values generated by `replicate' are the same? Surely,
? @{1;2}!replicate.
should be equivalent to:
? (@{1;2},@{1;2}).
which generates:
Looking on the positive side, the language of conceptual graphs [12], one of whose aims is to present a variable-free version of the predicate calculus, introduces variables into its linear notation for precisely the same purpose, to indicate when two occurrences of an expression refer to the same object. (In its graphical notation, this is done simply by pointing at the same vertex twice.)

Although variable names are potentially an evil no better than the infamous `goto', the author believes that in the restricted context of a relational definition they are harmless, even beneficial. After all, the alternative is to use built-in relations such as `left' and `right', effectively defined as:

to dissect structured arguments. Libra variables are merely local names for compositions of such relations.

10.5 Input and Output

The current treatment of files within Libra is very unsatisfactory, and should be improved as soon as possible. Part of a remedy would be to treat files as relations. A possible set of file operators was suggested in the language proposal on which Libra is based [3].

In a way, a relational language should have less of a problem integrating with files than a functional language has. One of the properties of files is that they can be modified. This means that the same argument may retrieve a different result depending on when it is applied. Considered as a relation, an argument can be expected to have multiple results. Reading a file could be likened to using the `~' operator, which chooses an arbitrary result. But this is only part of the story. Updating a file has a side-effect on how it will be read, which is not expressed by the Libra programming language. (It isn't expressed by any other programming language either, but that is beside the point.) The core problem is that Libra programs--like functional programs--are expressions, not procedures taking place in time. It is not easy to see how to integrate Libra with the idea of a time-dependent state.

The problems of file input-output are still more manifest when dealing with the user of a Libra program. We have previously discussed how the dialogue generated by different threads can become confusingly intertwined: question and answer have to be treated atomically. But what happens if several threads decide to ask the user the same question? Should each question be answered separately, or should Libra ask the question once and remember the answer? In a relational language, only the first choice is logical, but it may not always prove convenient.

10.6 Nested Scopes and Modules

In the language proposal on which Libra was based [3], it was proposed that names might given limited scope. The reason for this is that in defining a module or library of relations for general use, it might be useful to define subsidiary relations, which would not be of general interest. A programmer using the module should not have to worry about these subsidiary relations. In particular, there is a danger that the programmer will define a new relation with the same name as one in a module, and on invoking it, find that spurious results are produced.

One possible way of avoiding this problem would be to introduce a `where' operator, so that:

{add2->add1 o add1} where {add1->{X->X+1}}.
would have the effect of making `add2' global, but `add1' would be private to the declaration. In general, a set of global definitions would be able to share access to a set of private ones. Simple as this idea is, it is not currently implemented. For one thing, it is hard to see how to integrate it with the one-definition-at-a-time approach of the current command interpreter--whose serial view of the program text is already inappropriate to a relational language.

Another aspect to consider is the object-oriented approach to programming. In the context of Libra, where objects are unimportant, many of the problems that object-oriented languages solve aren't present. On the other hand, it would be useful to have some concept similar to inheritance. This means that if a general type of object has been defined, and a more specialised type of object has been based on it, the specialised object should be able to inherit the relations from the more general object, or over-ride them with its own. For example, if the `number_of_legs' relation applied to members of the set `mammal' yields `4', we would want to over-ride this for the set `humans' to yield `2'. Libra does not provide any means of over-riding one relational definition by another; if two definitions apply, both take effect. Nor does Libra have any way at present of using the knowledge that `humans' are `mammals'.

A partial solution to this problem would be to use `else' rather than `join' to link definitions of relations having the same name. If a name defines several relations, they could be tried in turn. As soon as a definition is found that applies to the argument, no further definitions would be tried. Some means of specifying the search order is needed. Letting the search order be defined by the sequence of the program text seems a poor approach, but Libra does not currently offer any other language feature that could be exploited. A better suggestion, which fits well with the nature of relations, is to allow the program text to contain packages, which are relations from names to definitions, and link them together by an arbitrary relational expression.

10.7 Efficiency

Treating data and program text in a uniform way causes a major problem in the implementation of Libra. Consider an operation such a composition. The expression `R o S' should compile to a Prolog rule something like:
 'R o S'(X,Z) :- R(X,Y),S(Y,Z).
if R and S are static program objects, but to:
'o'(R,S,'R o S').
if they are data objects (where `o' is a built-in predicate). The present implementation represents everything as data, which deals with static objects very poorly. The converse, of expressing data objects as facts means that an implementation would spend much of its time asserting and retracting facts. A better solution would be to have different approaches for static (facts and rules) and dynamic (data structure) sets and relations. One approach is to follow Drusilla, and choose suitable representations hin a compiler. This preserves a uniform syntactic treatment for static and dynamic objects--although there are problems when they are mixed, and in compiling higher-order relations. However, having programmed a few examples has convinced the author that the programmer needs to keep the distinction between dynamic and static objects clear in order to make sensible space-time trade-offs. Perhaps they could be distinguished syntactically, without sacrificing uniformity of notation. For example:
compose->{R,S->{X->X!(R o S)}}.
could define a relation that could be applied to an argument (X), leading to the first translation above; whereas:
compose->{R,S->R o S}.
might apply to data objects. However, this approach raises the new problem of converting between static and dynamic representations.

11. References

1. M. Brady, Open Prolog Version 1.0.3d22, Dept. of Computer Science, Trinity College Dublin, URL:, 1995.

2. D.M. Cattrall, The Design and Implementation of a Relational Programming System, PhD Thesis, Dept. of Computer Science, University of York, 1992.

3. B. Dwyer, Programming Using Binary Relations: a proposed programming language, Technical Report 94-04, Dept. of Computer Science, University of Adelaide, 1994.

4 W.D. Hillis, The Connection Machine, MIT Press, 1985.

5. B. J. MacLennan, Relational Programming, Technical Report NPS52-83-012, Naval Postgraduate School, Monterey, CA, 1983.

6. B. J. MacLennan, Four Relational Programs, Technical Report NPS52-86-023, Naval Postgraduate School, Monterey, CA, 1983.

7 R. Milner, "A theory of type polymorphism in programming", Journal of Computer and System Sciences, 17(3) pp348-375, 1978.

8 B. Potter, J. Sinclair, and D. Till, An Introduction to Formal Specification and Z, Prentice-Hall 1991.

9. J.D. Prosser, A Programming Language Based on the Algebra of Binary Relations, Honours Report, Dept. of Computer Science, University of Adelaide, 1994.

10. J.G. Sanderson, "A Relational Theory of Computing", Lecture Notes in Computer Science 82, Springer-Verlag Berlin 1980.

11. J.G. Sanderson, Relator Calculus, Technical Report 84-02, Dept. of Computer Science, University of Adelaide, 1984.

12. W.M. Tepfenhart, J.P. Dick, J.F. Sowa (Eds.), "Conceptual Structures: Current Practices", Lecture Notes in Computer Science 835, Springer-Verlag Berlin 1994.

Appendix A: Libra Operators

A.1 Relations and Sets

              Operator        Type        Precedence
                ^*              yf               50
                ^+              yf,yfx           50
                ^-              yfx              50
                ^^              yf               50
                <-              yf               50
                sets_of         fy               50
                seqs_of         fy               50
                dom             fy               50
                codom           fy               50
                ?>              yfx             100
                \?>             yfx             100
                <?              xfy             100
                <\?             xfy             100
                meet            yfx             100
                o               yfx             100
                x               yfx             100
                &&              yfx             125
                join            yfx             125
                omit            yfx             125
                else            yfx             125
                but             yfx             125
                !               yfx             150
                ~               yfx             150
                \\              yfx             150
                @               fx              150
                i               fx              150
                #               fy              175
                >>->            yf,yfx          175
                >>=>            yfx             175
                image           yfx             175

A.2 Arithmetic

              Operator        Type        Precedence
                ^               xfy             200
                mod             yfx             300
                *               yfx             400
                /               yfx             400
                <<              yfx             400
                >>              yfx             400
                -               fx,yfx          500
                +               fx,yfx          500
                /\              yfx             500
                \/              yfx             500
                max             yfx             600
                min             yfx             600

A.3 Comparison

              Operator        Type        Precedence
                =               xfy             700
                \=              xfy             700
                <               xfy             700
                >               xfy             700
                =<              xfy             700
                >=              xfy             700
                \>              xfy             700
                \<              xfy             700
                equal           xfy             700
                unequal         xfy             700
                subset          xfy             700
                includes        xfy             700
                inside          xfy             700
                encloses        xfy             700
                disjoint        xfy             700
                ..              xfx             700
                ?               xfx             700
                \?              xfx             700

A.4 Boolean

              Operator        Type        Precedence
                \               fy              900
                &               yfx             925
                v               yfx             950
                =>              xfx             975
                <=>             xfx             975

A.5 Structures

              Operator        Type        Precedence
                ,               xfy             1000
                ->              xfy             1050
                :               xfx             1070
                ;               xfx             1100

A.6 Commands

              Operator        Type        Precedence
                let             fx              1080
                edit            fx              1080
                find            fx              1080
                ?               fx              1080
                use             fx              1080
                reuse           fx              1080
                show            fx              1080
                dump            fx              1080
                drop            fx              1080
                fx              xfx             1080
                fy              xfx             1080
                xf              xfx             1080
                yf              xfx             1080
                xfx             xfx             1080
                xfy             xfx             1080
                yfx             xfx             1080
                help            fx              1090
                unary_prec      fx              1200
                binary_prec     fx              1200

Appendix B: Example Programs

solution -> [initial_state]!add_to_plan^+ ?>solved o display_seq o null.
initial_state -> (everything,{}).
final_state -> ({},everything).
everything -> {'Farmer';'Dog';'Corn';'Chicken'}.
solved -> {Plan:last(Plan)=final_state}.
add_to_plan -> suggest o verify.
suggest -> {Plan->Plan,last(Plan)!cross_river}.
verify -> {Plan,State->Plan&&[State]:State\?codom Plan}.
cross_river -> left_to_right join right_to_left.
left_to_right -> farmer_on_left <? ferry_object \?> unsafe.
farmer_on_left -> {Left,Right:'Farmer'?Left}.
ferry_object -> {Left,Right->Left,Right,@Left}
              o {Left,Right,Choice -> Left omit{Choice}omit{'Farmer'},
                                      Right join{Choice}join{'Farmer'}}.
unsafe -> {Left,Right:Left includes{'Chicken';'Corn'}
                    v Left includes{'Dog';'Chicken'}}.
right_to_left -> farmer_on_left <\? swap o left_to_right o swap.
swap -> {Left,Right->Right,Left}.
display_seq -> speak
               o ({X->head(X)!write o nl,tail(X)}o{H,T->T})^^o nl
               o spoken.

Fig. B.1: Chicken, Corn, Dog.

analysis -> ((file_to_str(ask_file_name)!word_list!frequencies)^-1!sort)<-
             ! display_list o null.
frequencies -> {WordList->
                WordList^-1!{Inv-> @dom Inv!{X->X,image_size(X,Inv)}>>->}}.
word_list -> {Text ->
compose_codom->{R1,R2->(@R1)!{X,Y->X,Y o R2}>>->}.
letter_positions -> {Str->dom(Str?>letters)}.
letters -> {1!"A"..1!"Z"}join{1!"a"..1!"z"}join{1!"0"..1!"9"}.
word_positions -> {Letters->
                    Letters!final_positions) \\ {L,R->[L..R]}}.
initial_positions -> {Letters-> (@Letters!{X->X:X-1\?Letters}>>->)!sort}.
final_positions ->   {Letters-> (@Letters!{X->X:X+1\?Letters}>>->)!sort}.
image_size->{X,R-> #({X}image R)}.
ask -> speak o put o get o spoken.
ask_file_name -> ask("Please type the name of the input file: ").
                   o ({X->X&&" "&&get(0)})^^
                   o seen}.
display_list -> {WordList ->
                 WordList ! speak
                 o ({Seq->display_line(head(Seq)),tail(Seq)}o{H,T->T})^^
                 o spoken}.
display_line -> {Freq,Word -> []!put(justify(int_to_str(Freq),6))
                                 o put(" ") o put(Word) o nl}.
justify->{String,Width->String!{Str->" "&&Str: #Str<Width}^^}.

Fig. B.2: Frequency Analysis

This page has been accessed times since 12th June 1997.

Up to Barry Dwyer's Home Page