Science of Computer Programming

Functional algorithm design



Richard S. Bird
Programming Research Group, Oxford University, Wolfson Building, Parks Road, Oxford, OX1 3QD, UK

Abstract

For an adequate account of a functional approach to the principles of algorithm design we need to find new translations of classical algorithms and data structures, translations that do not compromise efficiency. For an adequate formal account of a functional approach to the specification and design of programs we need to include relations in the underlying theory. These and other points are illustrated in the context of sorting algorithms.

1. Introduction

As a subject taught in the core curriculum of most undergraduate computing degrees, Algorithm Design is concerned with explaining basic strategies for efficient computation, strategies such as greedy algorithms, dynamic programming, and divide and conquer, together with the use of appropriate data structures for representing information. These strategies are illustrated with descriptions of famous algorithms from the literature of computing science. A comprehensive treatment is given in the excellent text [5].

Normally the subject is studied using imperative dictions, but one can also attempt a functional approach. Trying to express standard algorithms in functional form is both an exhilarating and challenging experience. It is exhilarating because of the amount of ground that can be covered in a short course, and challenging because many traditional algorithms need to be completely rethought in a functional setting. New algorithms for old are beginning to emerge. For example, one can cite David King’s and John Launchbury’s elegant treatment [13] of various graph algorithms, de Moor’s characterisation of dynamic programming [21], a functional approach to pattern matching [2, 10, 11], and Chris Okasaki’s recent work [22, 23] on purely functional queues. But much remains to be done; for instance, as far as we are aware, there is no effective treatment of the Union-Find problem in a functional setting, a point returned to below. It may be the case, as is suggested in [24, 14], that some classes of algorithm are inherently inefficient in any formalism for programming that lacks updatable state, but this point has not yet been settled in a satisfactory manner.

Unlike other formalisms, functional programming offers a unique opportunity to exploit a compositional approach to Algorithm Design, and to demonstrate the effectiveness of the mathematics of program construction in the presentation of many algorithms. However, for an adequate formal account of programming with functions we need to include relations in the underlying theory. With relations our powers of description are increased, and calculations can be unified. But, like the embedding of the real line in the complex plane, the extension to relations should be as seamless as possible, and preserve the shape and simplicity of its functional subset as much as possible. In recent years there has been a growing appreciation of the need for relations in formal program development ([1, 8, 12, 18, 19, 21, 25], to cite just a few references), though not everyone takes the view that relational programming is a generalisation of functional programming.

In the rest of the paper these remarks are amplified, using problems in sorting and searching as illustrations. Our aim is to indicate something of the unique perspective on programming that a functional approach can reveal.

Modified version of an invited talk at MPC, 1995.
0167-6423/96/$15.00 © 1996 Elsevier Science B.V. All rights reserved. SSDI 0167-6423(95)00033-X
R.S. Bird / Science of Computer Programming 26 (1996) 15-31

2. On composition

Consider the following well-known functional version of sort:

    sort x = [],                      if x = []
           = sort y ++ [a] ++ sort z, otherwise
             where (y, a, z) = split x

    split (a : x) = (filter (< a) x, a, filter (≥ a) x).

It is an academic point (i.e. interesting, even intriguing, but probably not of practical consequence) whether this program can legitimately be called quicksort. After all, the heart of quicksort - the partition phase that burns the candle at both ends - is missing, and there is no notion of an in situ algorithm in functional programming. What is more, quicksort is a terrible algorithm in functional form: its expected running time is easily beaten by mergesort, among others, and it contains a space leak. However, that is not the point at issue here.

In most texts on Algorithm Design, sorting is quickly followed, in the same chapter or the following one, with a discussion of selection. The standard expected linear time selection algorithm is introduced by phrases such as “can be modelled on quicksort”, or “follows the structure of quicksort”. But suppose we define

    select x k = index (sort x) k,

where index x k is the kth element of x (counting from 0):

    index (a : x) 0 = a
    index (a : x) (k + 1) = index x k.
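The specification of select can be run as it stands, and unfolding sort inside it suggests a version that recurses into only one of the two sublists. The following Haskell sketch places the two side by side. The names qsort (standing for the sort of the text) and selectSpec are ours, and we assume split sends elements equal to the pivot to the right-hand list.

```haskell
-- split chooses the head as pivot; elements equal to it go right (an assumption).
split :: Ord a => [a] -> ([a], a, [a])
split (a : x) = (filter (< a) x, a, filter (>= a) x)
split []      = error "split: empty list"

-- The functional "quicksort" of the text.
qsort :: Ord a => [a] -> [a]
qsort [] = []
qsort x  = qsort y ++ [a] ++ qsort z
  where (y, a, z) = split x

-- index x k is the kth element of x, counting from 0.
index :: [a] -> Int -> a
index (a : _) 0 = a
index (_ : x) k = index x (k - 1)
index []      _ = error "index: out of range"

-- The specification: sort everything, then pick.
selectSpec :: Ord a => [a] -> Int -> a
selectSpec x k = index (qsort x) k

-- The version obtained by unfolding qsort: only one sublist is explored.
select :: Ord a => [a] -> Int -> a
select x k
  | k < n     = select y k
  | k == n    = a
  | otherwise = select z (k - n - 1)
  where
    (y, a, z) = split x
    n         = length y
```

Both functions are partial: they expect a nonempty list and an index within range.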

Then we can calculate, for nonempty x:

      select x k
    =   {definition of select}
      index (sort x) k
    =   {definition of sort, with (y, a, z) = split x}
      index (sort y ++ [a] ++ sort z) k
    =   {since index (u ++ v) k = (k < #u → index u k, index v (k − #u))}
      (k < #sort y → index (sort y) k, index ([a] ++ sort z) (k − #sort y))
    =   {since #sort y = #y = n (say)}
      (k < n → index (sort y) k, index ([a] ++ sort z) (k − n))
    =   {last but one step again, and definition of index}
      (k < n → index (sort y) k, k = n → a, index (sort z) (k − n − 1))
    =   {definition of select}

      (k < n → select y k, k = n → a, select z (k − n − 1)).

[...]

... (x, a, y)) = a.

The value min (a, b) is the smaller of a and b. It remains to implement the function mktree.

The standard algorithm takes a list [a0, a1, ...] and builds a tree with a0 at the top, a1, a2 at the next level, a3, a4, a5, a6 at the next level, and so on until the list is exhausted. The length of the list at the bottom level will not in general be a power of two. Of course, in the array based algorithm no building actually takes place; the array is just viewed as forming such a tree and everything is done by juggling subscripts.

We can build the tree from bottom to top by constructing a sequence of trees at each level; the trees at the next level higher up are formed by combining trees in pairs with appropriate elements of the list. At the end of this process we are left with a list containing a single tree. To implement the idea we define

    mktree = head . mktrees . levels.

The function levels : list (list A) ← list A applied to [a0, a1, ...] produces the list

    [[a0], [a1, a2], [a3, a4, a5, a6], ...]

and is defined by an unfold:

    levels        = [( isnil, level )] . start
    start x       = (1, x)
    isnil (k, x)  = (x = nil)
    level (k, x)  = (take k x, (2 × k, drop k x)).

The curried functions take k and drop k, respectively, take and drop the first k elements of a list. The function mktrees : list (tree A) ← list (list A) is defined by

    mktrees = (| [null], layer |),

where layer : list (tree A) ← (list A × list (tree A)) is defined by an unfold; applied to the pair ([a0, a1, ...], [u0, u1, ...]) this function produces the list

    [fork (u0, a0, u1), fork (u2, a1, u3), ...].

If the list [u0, u1, ...] of trees is not long enough, it is filled with a sufficient number of empty trees. The definition of layer is:

    layer                                       = [( isnil, step )]
    isnil (x, ts)                               = (x = nil)
    step (cons (a, x), nil)                     = (fork (null, a, null), (x, nil))
    step (cons (a, x), cons (u, nil))           = (fork (u, a, null), (x, nil))
    step (cons (a, x), cons (u, cons (v, ts)))  = (fork (u, a, v), (x, ts)).

This completes the new definition of mkheap. It is an instructive exercise, left to the reader, to show that mkheap takes linear time.
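The bottom-up construction of mktree transcribes almost directly into Haskell. In the sketch below the two unfolds are written with unfoldr from Data.List, so the test isnil is absorbed into the Nothing case, and the fold mktrees becomes a foldr. The constructor names Null and Fork are our own choice, and the heap-ordering step of mkheap is not shown.

```haskell
import Data.List (unfoldr)

data Tree a = Null | Fork (Tree a) a (Tree a)
  deriving (Eq, Show)

-- levels [a0, a1, ..] = [[a0], [a1, a2], [a3, a4, a5, a6], ..]
levels :: [a] -> [[a]]
levels = unfoldr level . start
  where
    start x      = (1 :: Int, x)
    level (_, []) = Nothing
    level (k, x)  = Just (take k x, (2 * k, drop k x))

-- layer pairs up the trees of the level below under the elements of this level,
-- padding with empty trees when the list of trees runs out.
layer :: ([a], [Tree a]) -> [Tree a]
layer = unfoldr step
  where
    step ([], _)             = Nothing
    step (a : x, [])         = Just (Fork Null a Null, (x, []))
    step (a : x, [u])        = Just (Fork u a Null, (x, []))
    step (a : x, u : v : ts) = Just (Fork u a v, (x, ts))

-- The fold over the list of levels, starting from the singleton list [Null].
mktrees :: [[a]] -> [Tree a]
mktrees = foldr (curry layer) [Null]

mktree :: [a] -> Tree a
mktree = head . mktrees . levels
```

For instance, mktree applied to a seven-element list yields a perfectly balanced tree whose root is the first element and whose leaves are the last four.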

The new version of heapsort shows that some standard algorithms can be translated to functional form while preserving the spirit of the original. But there are other algorithms whose functional translations are not obvious. In particular, Kruskal’s algorithm for minimum cost spanning trees uses, in addition to a heap, an algorithm for the Union-Find problem. The Union-Find problem concerns the efficient implementation of three operations on disjoint sets, specified as follows:

    units : set A → set (set A)
    units x = {{a} | a ∈ x}

    find : set (set A) → A → set A
    find xs a = “the (unique) set x in xs that contains a”

    union : set (set A) → set A → set A → set (set A)
    union xs x y = (xs − {x} − {y}) ∪ {x ∪ y}.

Various schemes for maintaining partitions are known [26, 27], but currently it is not clear how to achieve comparable efficiency in a purely functional setting.
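The three operations can be written down as an executable specification. The Haskell sketch below uses Data.Set from the containers library and follows the set-theoretic definitions literally; it is meant only to pin down the intended behaviour and makes no attempt at the efficiency of the imperative schemes.

```haskell
import qualified Data.Set as Set
import Data.Set (Set)

-- units x = {{a} | a ∈ x}
units :: Ord a => Set a -> Set (Set a)
units = Set.map Set.singleton

-- find xs a = the (unique) set x in xs that contains a
find :: Ord a => Set (Set a) -> a -> Set a
find xs a =
  case filter (Set.member a) (Set.toList xs) of
    x : _ -> x
    []    -> error "find: element not in any set"

-- union xs x y = (xs − {x} − {y}) ∪ {x ∪ y}
union :: Ord a => Set (Set a) -> Set a -> Set a -> Set (Set a)
union xs x y = Set.insert (x `Set.union` y) (Set.delete y (Set.delete x xs))
```

Here find takes time proportional to the number of sets in the partition, which is exactly the cost the classical algorithms avoid.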

5. On relations

Relations have been knocking at the door, demanding entry for some time now, and it is time to let them in. One reason concerns the nature of the relationship between the fold and unfold operations, and another concerns program specification in general. In a purely functional framework one can model relations by set-valued functions, but the mathematics becomes fussy. It becomes even fussier if we have to model set-valued functions by list-valued ones. With relations things are significantly simpler. Moreover, unlike functions every relation has a converse, and the use of converse operations is important both in specification and program development.

Consider for instance the following purely functional specification of sorting a list of elements under a preorder ⊴:

    sort = head . filter uplist . perms
    uplist x = and [a ⊴ b | (a, b) ← zip (x, tail x)].

The function perms returns a list of all permutations of a sequence, and the boolean-valued function uplist determines whether a sequence is ascending under ⊴. While this is an acceptable specification of sort, a better one is to specify sort to be a function satisfying the inclusion

    sort ⊆ uplist? . perm,    (1)

where perm and uplist? are now relations rather than functions. (If ⊴ is a linear order, then uplist? . perm is itself a function, but the expression is not capable of being implemented directly in a standard functional language, so there is still work to do.)

Since we want to preserve compatibility with functions, we think of relations as taking arguments on the right and delivering results on the left, so our relational composition takes the same form as functional composition (we want an ordered permutation, not a permutation of an ordered list). As a relation, uplist? ⊆ id, where id is the identity relation on lists, and holds for x just when uplist x is true. A relation R such that R ⊆ id is called a coreflexive (because a relation R satisfying id ⊆ R is a reflexive relation). More generally, p? is the coreflexive that holds for x just when the predicate p x is true. One can define coreflexives by translating the corresponding predicate, but it is usually more satisfactory to define them directly. In particular, one can define uplist? as a relational catamorphism

    uplist? = (| nil, cons . ok? |),

    ok (a, x) = (∀b : b inlist x : a ⊴ b).

The relation inlist : A ← list A is the membership relation for lists. It is not immediately clear how to define the membership relation for an arbitrary datatype, but the matter was finally settled by Hoogendijk and de Moor in [9]. It would take too long to explain how to define inlist in the relational calculus, so we will just accept it. For the same reason, the following formal definition of ok? is given without explanation:

    ok? = id ∩ (outl° . (⊴ / inlist°) . outr).

This “point-free” style is typical in a number of presentations of the relational calculus (see [7]); at first sight it seems arcane, but one soon gets used to it and calculations without variables are significantly simpler.

To define the relation perm we need the fundamental operation of taking the converse R° of a relation R, defined by x R° y ≡ y R x. Then we can define

    perm = bagify° . bagify,

where bagify turns a list into a bag of its elements. Thus perm is defined using bags as an intermediate type: turning a list into a bag and then turning it back into a list gives a permutation of the original. The function bagify can be defined as a catamorphism

    bagify = (| nilbag, consbag |),

where nilbag is the empty bag and consbag adds an element to a bag.
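The purely functional specification of sorting, and the reading of perm through bags, can both be tried out directly. In the Haskell sketch below the preorder is taken to be the standard <= of the Ord class, a bag is represented (our assumption) by a finite map from elements to multiplicities, and the relation perm is modelled by a boolean-valued function on pairs of lists. The name specSort is ours.

```haskell
import Data.List (permutations)
import qualified Data.Map as Map

-- uplist x holds when x is ascending.
uplist :: Ord a => [a] -> Bool
uplist x = and [a <= b | (a, b) <- zip x (drop 1 x)]

-- sort = head . filter uplist . perms; exponential, a specification only.
specSort :: Ord a => [a] -> [a]
specSort = head . filter uplist . permutations

-- bagify as a fold: start from the empty bag and add one element at a time.
bagify :: Ord a => [a] -> Map.Map a Int
bagify = foldr consbag Map.empty
  where consbag a = Map.insertWith (+) a 1

-- y perm x holds when x and y have the same bag of elements.
perm :: Ord a => [a] -> [a] -> Bool
perm y x = bagify y == bagify x
```

Every list that specSort can return is ascending and stands in the relation perm to the argument, which is the content of inclusion (1).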


6. On fold and unfold

Now let us turn to the formal definitions of fold and unfold in a relational setting. What follows will be rather brief and incomplete in various ways, but it is hoped enough of the general idea comes across to stimulate further reading. Grant Malcolm’s thesis [16] is a good starting point, as is [17] which was written for functional programmers. And, if you can wait long enough, a complete account will appear in the forthcoming text [4] to be published later this year.

As we have seen in the case of lists and trees, whenever one declares a datatype a number of functions are brought into play. In part, declaring a datatype as an equation asserts the existence of an isomorphism between the types on the left and right. In the case of lists this takes the form

    list A ≅ 1 + (A × list A).

The type 1 consists of just one member and serves as the source type for constants. The type constructor × is Cartesian product, and + is disjoint sum. The right-hand side can be rephrased as

    list A ≅ F(A, list A),

where F(A, B) = 1 + (A × B) is a mapping from types to types. We can also use F as a mapping from functions to functions by defining

    F(f, g) = id1 + (f × g),

where id1 is the identity function on 1. A function having a dual role both as a mapping between types and a mapping between functions is, provided certain properties are satisfied, called a functor. The functor F defined above takes a pair of types or functions as argument and so is sometimes called a bifunctor. One property we require of a functor F is that if f : A ← B, then Ff : FA ← FB. The other properties are the identity and composition rules:

    F id = id
    F(f . g) = Ff . Fg.

In the case of bifunctors the rules are, firstly, that if f : A ← C and g : B ← D, then F(f, g) : F(A, B) ← F(C, D); and, secondly, that

    F(id, id) = id
    F(f . g, h . k) = F(f, h) . F(g, k).

The Cartesian product constructor × can also be defined as a mapping between functions: if f : A ← C and g : B ← D, then f × g : A × B ← C × D is defined by

    (f × g) (c, d) = (f c, g d).

This mapping satisfies the identity and composition rules for bifunctors, so × is a bifunctor. Similarly, the coproduct constructor + can be defined on functions: applied to a left component c, the function f + g : A + B ← C + D returns f c as a left component of the result; dually, applied to a right component d, the value of (f + g) d is the right component g d. Again, the identity and composition rules are satisfied, so + is a bifunctor.

The declaration of list A also introduces two functions

    nil : list A ← 1    and    cons : list A ← A × list A

that serve to construct lists. We can parcel these functions together as one function

    [nil, cons] : list A ← F(A, list A).

In general, if f : A ← B and g : A ← C, then [f, g] : A ← B + C applies f to left components and g to right components. The function [nil, cons] has a special property, which captures the fact that we can define functions on lists by pattern-matching: given any function [c, f] : B ← F(A, B) there is a unique function h : B ← list A such that

    h . [nil, cons] = [c, f] . F(id, h).

Unwrapping this compact equation, we get two equations

    h . nil = c
    h . cons = f . (id × h).

Thus, h = (| c, f |). In a general datatype declaration, which we can write in the form

    α : data A ← F(A, data A),

the catamorphism (| f |) : B ← data A, taking an argument f : B ← F(A, B), is the unique function h satisfying

    h . α = f . F(id, h).

As a consequence of the defining property of α we get that (| α |) = id. For example, (| nil, cons |) (which we should have written as (| [nil, cons] |) but will not) is the identity function on lists. Less obviously, it also follows from its defining property that α is an isomorphism, meaning

    α . α° = id    and    α° . α = id,

where α°, more usually written α⁻¹, denotes the inverse function of α. The first id is the identity relation on data A, and the second is the identity relation on F(A, data A). Since α is an isomorphism, we can move it to the other side of the defining equation for (| f |). Thus, h = (| f |) is the unique solution of the equation

    h = f . F(id, h) . α°.

We will abbreviate this by writing

    (| f |) = (νh : h = f . F(id, h) . α°).
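For lists, the ingredients of this definition can be spelled out in Haskell. The sketch below declares the functor F(A, B) = 1 + (A × B) as a datatype, gives its action on functions, writes α = [nil, cons] together with its inverse, and defines the catamorphism literally as the solution of h = f . F(id, h) . α°. The names ListF, bimapF, alpha, alphaInv, cata and total are ours.

```haskell
-- F(A, B) = 1 + (A × B)
data ListF a b = NilF | ConsF a b

-- F(f, g) = id1 + (f × g)
bimapF :: (a -> c) -> (b -> d) -> ListF a b -> ListF c d
bimapF _ _ NilF        = NilF
bimapF f g (ConsF a b) = ConsF (f a) (g b)

-- alpha = [nil, cons] : list A ← F(A, list A)
alpha :: ListF a [a] -> [a]
alpha NilF         = []
alpha (ConsF a as) = a : as

-- The inverse of alpha, written α° in the text.
alphaInv :: [a] -> ListF a [a]
alphaInv []       = NilF
alphaInv (a : as) = ConsF a as

-- (| f |) is the solution of h = f . F(id, h) . α°
cata :: (ListF a b -> b) -> [a] -> b
cata f = h
  where h = f . bimapF id h . alphaInv

-- An example argument: summing a list of integers.
total :: ListF Int Int -> Int
total NilF        = 0
total (ConsF a s) = a + s
```

With these definitions cata alpha behaves as the identity on (finite) lists, in line with (| α |) = id, and cata total sums a list.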

Finally, we also obtain the extremely useful fusion rule for combining two functions into one:

    f . (| g |) = (| h |)  ⇐  f . g = h . Ff.
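A small instance of fusion on lists may help. In Haskell, with foldr playing the role of the catamorphism, the rule reads: f . foldr g c = foldr h (f c) provided f (g a s) = h a (f s) for all a and s. The sketch below takes f to be doubling and g to be addition; the names are ours, and the example is only an illustration of the rule, not a proof of it.

```haskell
-- f in the fusion rule
double :: Int -> Int
double = (2 *)

-- Two passes in spirit: sum the list, then double the result.
sumThenDouble :: [Int] -> Int
sumThenDouble = double . foldr (+) 0

-- The fused fold. The side condition holds because
--   double (a + s) = 2 * a + double s = h a (double s).
fusedDoubleSum :: [Int] -> Int
fusedDoubleSum = foldr h (double 0)
  where h a t = 2 * a + t
```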

Now, let us extend all this stuff to relations. Everything we have said above goes through when functions are extended to relations, provided only that we restrict attention to functors that are monotonic; that is, if R ⊆ S then FR ⊆ FS. It can be shown that monotonic functors preserve relational converse, that is, (FR)° = F(R°). It follows that the expression FR° is not ambiguous. In particular, since (R . S)° = S° . R°, we get for a relation R : data A ← F(A, data A) that

    (| R |)  = (νX : X = R . F(id, X) . α°)
    (| R |)° = (νX : X = α . F(id, X) . R°).

Given a function f : F(A, B) ← B the unfold operator [( f )] is defined by

    [( f )] = (| f° |)°.

The (now) non-standard form [( p, f )] that we have used previously for unfold on lists stands more properly for [( p → !, f )], where ! : 1 ← B and f : (A × B) ← B. With relations we also get two variants of the fusion rule:

    R . (| S |) ⊆ (| T |)  ⇐  R . S ⊆ T . F(id, R)
    R . (| S |) ⊇ (| T |)  ⇐  R . S ⊇ T . F(id, R).

Finally, writing (μX : X = φX) for the least fixed point (under relational inclusion) of φ, we get the following formalisation of the remark made in Section 3 about the simplification of the composition of a fold over a parameterised type data A with an unfold:

    (2)

It is this transformation that is behind Wadler’s deforestation algorithm [29].
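In the functional special case on lists, the effect of composing a fold with an unfold can be seen concretely: the composition can be replaced by a single recursion that never builds the intermediate list. The Haskell sketch below is our own illustration of that idea; the name hylo and the factorial example are assumptions chosen for the purpose and do not come from the text.

```haskell
import Data.List (unfoldr)

-- A fold after an unfold, written as one recursion with no intermediate list.
hylo :: (a -> b -> b) -> b -> (s -> Maybe (a, s)) -> s -> b
hylo f c g = h
  where
    h s = case g s of
      Nothing      -> c
      Just (a, s') -> f a (h s')

-- Unfold step: n, n - 1, ..., 1 (intended for non-negative arguments).
countdown :: Int -> Maybe (Int, Int)
countdown 0 = Nothing
countdown n = Just (n, n - 1)

-- Builds the list [n, n - 1 .. 1] and then multiplies it out.
factorialTwoPass :: Int -> Int
factorialTwoPass = foldr (*) 1 . unfoldr countdown

-- The same function with the intermediate list eliminated.
factorialFused :: Int -> Int
factorialFused = hylo (*) 1 countdown
```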

7. On the derivation of sorting algorithms

The formal derivation and classification of sorting algorithms is not, of course, new (see e.g. [6, 20]), but let us end with a brief sampler of the kinds of derivations we can accomplish with the above material.

7.1. Insertion sort

Our first sorting algorithm arises as a result of the following three-step development:

      uplist? . perm
    =   {expressing perm in the form (| nil, add |)}
      uplist? . (| nil, add |)
    ⊇   {fusion}
      (| nil, uplist? . add |)
    ⊇   {supposing insert ⊆ uplist? . add}
      (| nil, insert |).

In outline, we can express perm as a relational catamorphism, use fusion with uplist? to obtain a second catamorphism, and finally refine the result to a functional catamorphism. The relation add for which perm = (| nil, add |) can be defined by

    add (a, x ++ y) = x ++ [a] ++ y.

It can also be defined recursively by

    add = cons ∪ cons . (id × add) . swap . (id × cons°),    (3)

where swap (a, (b, x)) = (b, (a, x)). We omit the proof of this fact, as well as most others in this section. The function insert that refines uplist? . add can be defined by

    insert = (ok′ → cons, cons . (id × insert) . swap . (id × cons°)),

where

    ok′ (a, nil) = true
    ok′ (a, cons (b, x)) = (a ⊴ b).

For the fusion step to go through we need

    uplist? . [nil, add] ⊇ [nil, uplist? . add] . F(uplist?).

But this follows quickly from the monotonicity of the functor F and the fact that uplist? is a coreflexive. The resulting sorting algorithm, namely sort = (| nil, insert |), is, of course, insertion sort.
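The end point of this development runs as ordinary Haskell. In the sketch below the relation add is modelled, as an assumption of ours, by a function returning the list of all its possible results, so that the refinement of add by insert can at least be spot-checked; the preorder is the standard <=, and the name isort is ours.

```haskell
-- add (a, x ++ y) = x ++ [a] ++ y, modelled as the list of all results.
add :: (a, [a]) -> [[a]]
add (a, x) = [take i x ++ [a] ++ drop i x | i <- [0 .. length x]]

-- ok' (a, x) holds when a may be placed at the front of x.
ok' :: Ord a => (a, [a]) -> Bool
ok' (_, [])    = True
ok' (a, b : _) = a <= b

-- insert = (ok' -> cons, cons . (id x insert) . swap . (id x cons°))
insert :: Ord a => (a, [a]) -> [a]
insert (a, x)
  | ok' (a, x) = a : x
  | otherwise  = case x of
      b : x' -> b : insert (a, x')
      []     -> [a]  -- not reached, since ok' (a, []) is True

-- sort = (| nil, insert |)
isort :: Ord a => [a] -> [a]
isort = foldr (curry insert) []
```

For an ascending x, the value insert (a, x) is one of the results in add (a, x) and is itself ascending, which is what insert ⊆ uplist? . add asks for.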

7.2. Selection sort

Our second sorting algorithm comes from the following development:

      uplist? . perm
    =   {since perm = perm° and uplist? = uplist?°}
      (perm . uplist?)°
    =   {fusion}
      (| nil, perm . cons . ok? |)°
    ⊇   {supposing select ⊆ ok? . cons° . perm}
      (| nil, select° |)°
    =   {anamorphisms}
      [( isnil, select )].

The result is selection sort.
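One function select meeting the supposition select ⊆ ok? . cons° . perm is the one that removes a minimum element from a nonempty list. In the Haskell sketch below the unfold is written with unfoldr, so the test isnil is absorbed into the Nothing result of select; the preorder is the standard <=, and the name ssort is ours.

```haskell
import Data.List (unfoldr)

-- select x = (a, y) where a : y is a permutation of x and a is a least element.
-- The empty list, on which the text's isnil holds, is mapped to Nothing.
select :: Ord a => [a] -> Maybe (a, [a])
select []      = Nothing
select (a : x) = case select x of
  Nothing -> Just (a, [])
  Just (b, y)
    | a <= b    -> Just (a, x)
    | otherwise -> Just (b, a : y)

-- sort = [( isnil, select )]
ssort :: Ord a => [a] -> [a]
ssort = unfoldr select
```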

7.3. Quicksort

Finally, to derive quicksort we need the fact that if f is a function, then f . f° ⊆ id. This inclusion captures the fact that functions map arguments to at most one result. We also need the coreflexive uptree? defined by

    uptree? = (| null, fork . okt? |),

where okt (x, a, y) = (∀b : b intree x : b ⊴ a) ∧ (∀b : b intree y : a ⊴ b). Then we can argue along the same lines as in selection sort:

      uplist? . perm
    ⊇   {since flatten : list A ← tree A is a function}
      uplist? . flatten . flatten° . perm
    =   {claim: uplist? . flatten = flatten . uptree?}
      flatten . uptree? . flatten° . perm
    =   {since perm = perm° and uptree? = uptree?°; converses}
      flatten . (perm . flatten . uptree?)°
    =   {fusion}
      flatten . (| null, perm . join . okt? |)°
    ⊇   {supposing split ⊆ okt? . join° . perm}
      flatten . (| null, split° |)°
    =   {anamorphisms}
      flatten . [( isnull, split )]
    =   {introducing mktree = [( isnull, split )]}
      flatten . mktree.

We omit the proof of the claim, and the detailed justification of the fusion step.
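The final expression flatten . mktree is a tree unfold followed by a tree fold, and can be run as such. In the Haskell sketch below the test isnull is absorbed into the Nothing result of split, the pivot is taken to be the head of the list, and elements equal to the pivot go to the right subtree; these choices, and the helper names unfoldTree and foldTree, are our assumptions. The preorder is the standard <=.

```haskell
data Tree a = Null | Fork (Tree a) a (Tree a)

-- The unfold (anamorphism) on trees.
unfoldTree :: (s -> Maybe (s, a, s)) -> s -> Tree a
unfoldTree g s = case g s of
  Nothing        -> Null
  Just (l, a, r) -> Fork (unfoldTree g l) a (unfoldTree g r)

-- The fold (catamorphism) on trees.
foldTree :: b -> ((b, a, b) -> b) -> Tree a -> b
foldTree c _ Null         = c
foldTree c f (Fork l a r) = f (foldTree c f l, a, foldTree c f r)

-- split x = (y, a, z) with every element of y at most a and a at most every element of z.
split :: Ord a => [a] -> Maybe ([a], a, [a])
split []      = Nothing
split (a : x) = Just (filter (< a) x, a, filter (>= a) x)

-- mktree = [( isnull, split )]
mktree :: Ord a => [a] -> Tree a
mktree = unfoldTree split

-- flatten = (| nil, join |), where join (x, a, y) = x ++ [a] ++ y
flatten :: Tree a -> [a]
flatten = foldTree [] join
  where join (x, a, y) = x ++ [a] ++ y

quicksort :: Ord a => [a] -> [a]
quicksort = flatten . mktree
```

Every tree produced by mktree satisfies uptree?, which is why flattening it yields an ascending list.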


References

[1] R.C. Backhouse, P.J. de Bruin, G. Malcolm, E. Voermans and J.C.S.P. van der Woude, Relational catamorphisms, in: B. Möller, ed., Proc. IFIP TC2/WG2.1 Working Conf. on Constructing Programs from Specifications (1991) 287-318.
[2] R.S. Bird, J. Gibbons and G. Jones, Formal derivation of a pattern matching algorithm, Sci. Comput. Programming 12 (1989) 93-104.
[3] R. Bird and P. Wadler, Introduction to Functional Programming (Prentice Hall, Englewood Cliffs, NJ, 1988).
[4] R. Bird and O. de Moor, The Algebra of Programming (Prentice Hall, Englewood Cliffs, NJ, 1996), To be published.
[5] T.H. Cormen, C.E. Leiserson and R.L. Rivest, Introduction to Algorithms (MIT Press, Cambridge, MA, 1990).
[6] J. Darlington, A synthesis of several sorting algorithms, Acta Inform. 11 (1978) 1-30.
[7] P.J. Freyd and A. Ščedrov, Categories, Allegories, Mathematical Library, Vol. 39 (North-Holland, Amsterdam, 1990).
[8] A.M. Haeberer and P.A.S. Veloso, Partial relations for program development, in: B. Möller, ed., Constructing Programs from Specifications, Proc. IFIP TC2/WG2.1 Conference, Pacific Grove, CA, 1991 (North-Holland, Amsterdam, 1991) 373-397.
[9] P. Hoogendijk and O. de Moor, Membership of datatypes, Unpublished Draft, 1993.
[10] R. Hoogerwoord, The design of functional programs: a calculational approach, Ph.D Thesis, University of Eindhoven, 1989.
[11] J. Jeuring, Polytypic pattern matching, in: S. Peyton Jones, ed., Conf. Record of FPCA 1995, SIGPLAN-SIGARCH-WG2.8 (1995) 238-248.
[12] G. Jones and M. Sheeran, Circuit design in Ruby, in: Jørgen Staunstrup, ed., Formal Methods for VLSI Design (North-Holland, Amsterdam, 1990) 13-70.
[13] D. King and J. Launchbury, Structuring depth-first search algorithms in Haskell, Proc. ACM Principles of Programming Languages, San Francisco, 1995.
[14] J. Launchbury and S.P. Jones, State in Haskell, University of Glasgow, Preprint, 1995.
[15] G. Malcolm, Homomorphisms and promotability, in: J. Snepscheut, ed., 1989 Groningen Mathematics of Program Construction Conf. (Springer, Berlin, Lecture Notes in Computer Science, Vol. 375, 1989) 335-347.
[16] G. Malcolm, Algebraic types and program transformation, Ph.D Thesis, University of Groningen, The Netherlands, 1990.
[17] E. Meijer, M. Fokkinga and R. Paterson, Functional programming with bananas, lenses, envelopes and barbed wire, in: J. Hughes, ed., Proc. 1991 ACM Conf. on Functional Programming and Computer Architecture, Lecture Notes in Computer Science, Vol. 523 (Springer, Berlin, 1991).
[18] A. Mili, A relational approach to the design of deterministic programs, Acta Inform. 20 (1983) 315-328.
[19] B. Möller, Relations as a program development language, in: B. Möller, ed., Constructing Programs from Specifications, Proc. IFIP TC2/WG2.1 Conf., Pacific Grove, CA, 1991 (North-Holland, Amsterdam, 1991) 373-397.
[20] B. Möller, Algebraic calculation of graph and sorting algorithms, in: D. Bjørner, M. Broy, I.V. Pottosin, eds., Formal Methods in Programming and their Applications, Lecture Notes in Computer Science, Vol. 735 (Springer, Berlin, 1993) 394-413.
[21] O. de Moor, Categories, relations and dynamic programming, D.Phil. thesis, Technical Monograph PRG-98, Computing Laboratory, Oxford, 1992; Also in Math. Struct. in Comput. Sci. 4 (1994) 33-70.
[22] C. Okasaki, Simple and efficient purely functional queues and deques, J. Functional Programming 5, To appear.
[23] C. Okasaki and G. Brodal, Optimal purely functional priority queues, J. Functional Programming, To appear.
[24] G.C. Ponder, P.C. McGeer and A.P-C. Ng, Are applicative languages inefficient? SIGPLAN Notices 23 (1988) 135-139.
[25] G. Schmidt and T. Ströhlein, Relations and Graphs, EATCS Monographs on Theoretical Computer Science (Springer, Berlin, 1991).
[26] R.E. Tarjan, Efficiency of a good but not linear set union algorithm, J. ACM 22 (1975) 215-225.
[27] R.E. Tarjan and J. van Leeuwen, Worst-case analysis of set union algorithms, J. ACM 31 (1984) 245-281.
[28] J.W.J. Williams, Algorithm 232 (heapsort), Commun. ACM 7 (1964) 347-348.
[29] P.L. Wadler, Deforestation: transforming programs to eliminate trees, Theoret. Comput. Sci. 2 (1990) 461493.