
Lucid-Boxes vs. Black-Boxes

(September 83)

Uzi Vishkin

Courant Institute, New York University

Ultracomputer Note #65

ABSTRACT. An explicit understanding of the opportunity for

constructing new algorithms out of existing (or supposedly existing)

algorithms is presented.

Say that B, A_1, A_2, ... are problems. We present an abstract setting that provides for the effective use of algorithms for problems A_1, A_2, ... in the design of efficient algorithms for problem B. A notion of "lucid boxes", which is an extension of "black boxes", is introduced for this purpose.

We exemplify the applicability of these lucid-box compositions (or reducibilities) for the design and specification of efficient algorithms and in cases where multi-parameter complexity optimization is required. Multi-parameter optimization is typical of parallel and distributed computation environments, where there is a need to optimize simultaneously: time, sizes of local memories, communication load on many lines, etc.


1. Introduction

Every existing algorithm represents some knowledge that has been acquired. It is the responsibility of theoreticians in the field to explore every opportunity to utilize the knowledge accumulated so far in the future design of algorithms. It is often easier to use existing efficient algorithms for another problem in related models of computation than to design a completely new algorithm.

For later in the introduction we need the following. It is well known that the sequential execution of a computer program, say P, on some input I can be described similarly to a proof in mathematical logic. Associate a line with each step of this execution. The line that corresponds to step s (≥ 1) includes the sequence V of all input and program (including output) variables and their contents at the beginning of the step. Next to it, specify the (atomic) instruction (with respect to a machine or a programming language) which is executed in step s. The sequence V at the beginning of step s+1 is written in the next line, which is associated with that step.
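As a present-day sketch, this line-by-line description can be modeled by a tiny tracer that records, before each step s, the sequence V of variable contents together with the instruction about to execute (the two-instruction example program and its encoding are invented for illustration):

```python
# Sketch of the line-by-line execution description: before each step,
# record a snapshot of the variables V and the instruction about to run.
# The program encoding (target name, update function) is invented here.

def trace(program, variables):
    """Run `program` (a list of (target, function) instructions) and
    return one line per step: (step s, snapshot of V at the start of s,
    target of the instruction executed in step s)."""
    lines = []
    for s, (target, fn) in enumerate(program, start=1):
        lines.append((s, dict(variables), target))  # V at start of step s
        variables[target] = fn(variables)           # execute the atomic step
    lines.append((len(program) + 1, dict(variables), None))  # final V
    return lines

# Example: compute y = (x + 1) * 2 in two atomic steps.
prog = [
    ("t", lambda v: v["x"] + 1),
    ("y", lambda v: v["t"] * 2),
]
for line in trace(prog, {"x": 3}):
    print(line)
```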

We outline a methodological framework for the design of efficient algorithms which is simple, uniform and general. It provides for both: (1) a direct design of a procedure from atomic instructions, and (2) a build-up of a composed procedure from a sequence of given procedures. While for direct design we suggest using known ways, we introduce some ideas for the composition of new procedures from other (specified or not fully specified) procedures. It will be evident that other known ways of composing procedures can be derived efficiently from our framework. Our framework takes full advantage of the line-by-line execution description of existing programs given above. Therefore, we coin the name lucid boxes for the way in which existing programs are used. This is in sharp contrast to the concept of 'black boxes', where only predeclared outputs of existing programs are transparent to procedures which are composed out of these existing procedures.

Examples for which the full computational power of our framework is necessary are presented, thereby buttressing the need for this framework. Actually, some of our more powerful examples are those where we need to compose a new procedure out of hypothetically existing procedures. Examples I, II and III demonstrate the applicability of our


framework for the design of efficient sequential, parallel and distributed algorithms, or for proving theorems about their existence. Such typical theorems assert the performance of new algorithms as functions of the performance parameters of other algorithms. Multi-parameter optimization is often required in parallel and distributed computation environments, where there is a need to optimize simultaneously: time, sizes of local memories, communication load on many lines, etc. It is conjectured that there is no way to derive the same theorems using a "black-box build-up" of a composed procedure from hypothetically existing ones.

We show that this framework also encompasses known techniques for reducibilities among problems. The terms 'reducibility' and 'composition' are used to refer to the same operation. The use of either term relates to the significance of the result rather than to the result itself.

In the present paper we say that a serial algorithm for a problem is efficient if its running time is bounded by the order of a "low" degree polynomial in the length of the input and there is no known algorithm that achieves a better asymptotic time bound for the same problem. This is different from the weaker (less pragmatic but more theoretically robust) notion of efficient algorithms as used in the theory of NP-completeness, where any polynomial time algorithm is defined to be efficient. While we give evidence that lucid-box compositions would probably not affect the theories that employ this weaker definition of efficient algorithms, they will certainly have an impact on more pragmatic directions in the design of algorithms (as implied by our examples), where our definition of efficient algorithms is more appropriate.

It is interesting to note that lucid-box compositions actually suggest an answer to a problem posed implicitly by Aho, Hopcroft, and Ullman [AHU]. They give a non-standard definition of NP-completeness ("A language L_0 is NP-complete if the following condition is satisfied: If we are given a deterministic algorithm of time complexity T(n) > n to recognize L_0, then for every language L in NP we can effectively (underlined by the author) find a deterministic algorithm of time complexity T(p_L(n)), where p_L is a


polynomial that depends on L"). They do not specify in the text what is meant by an 'effective' use of an algorithm for the purpose of obtaining another algorithm. We propose an explicit understanding of the term 'effectively' in this definition of NP-completeness. (Not for polynomial time reducibilities only.)

We would like to say that this paper does not belong to the conventional specific field of Programming Languages. On the one hand, we point at an apparent deficiency of conventional programming methodologies and suggest a framework wide enough to overcome this deficiency. Still, there is much additional work to be done in order to adapt our framework to various existing programming methodologies and languages. Section 4 may be viewed as a first (very small) step in this direction.

The simple example that follows points at unsatisfactory features of the commonly used "black-box techniques" for compositions of procedures, since they seem not to cope well with the problem of specifying the algorithm being described. It also illustrates some of the more formal definitions of the next section.

An introductory example

An example similar to the one presented below was given in [Meg2] for other purposes. More on the family of examples which is represented by this example can be found in Example III. I find this example both simple and intriguing.

Let f_i(λ) = a_i + λb_i, i = 1,...,n, be pairwise distinct increasing functions of λ (b_i > 0). For every λ, let F(λ) denote the median of the set {f_1(λ),...,f_n(λ)}. Obviously, F(λ) is a piecewise linear monotone increasing function with O(n²) breakpoints. Given λ, F(λ) can be evaluated in O(n) time [AHU] (once the f_i(λ)'s have been computed). The parametrized median problem: Solve the equation F(λ) = 0. One possible way of solving this problem is to first identify the set of intersection points I = {λ_ij | a_i + λ_ij b_i = a_j + λ_ij b_j (i ≠ j)}. Every breakpoint of F is equal to one of the λ_ij's. We can thus search the set I ∪ {-∞,+∞} for two values λ¹, λ² such that F(λ¹) ≤ 0 ≤ F(λ²) and such that there is no λ_ij in the open interval (λ¹,λ²). Once λ¹ and λ² have been found, we can solve the equation readily since F is


linear over [λ¹,λ²]. The search for such an interval requires O(log n) F-evaluations and (if we do not wish to presort the set of λ_ij's) finding medians of subsets of the set of λ_ij's whose cardinalities are n², n²/2, n²/4, ... . Thus, such an algorithm runs in O(n²) time, which is dominated by the evaluation of all the intersection points.
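A minimal sketch of this first approach (the coefficient arrays `a`, `b` and the use of Python's `statistics.median` are our own illustration; sorting the intersection points makes this sketch O(n² log n), slightly above the O(n²) bound discussed):

```python
from statistics import median

def solve_naive(a, b):
    """Solve F(lam) = 0, where F(lam) is the median of the lines
    a[i] + lam*b[i] (all b[i] > 0, so F is increasing): enumerate every
    intersection point, bracket the sign change of F, and interpolate
    on the linear piece between the brackets."""
    n = len(a)
    F = lambda lam: median(a[i] + lam * b[i] for i in range(n))
    # All intersection points lam_ij of pairs of non-parallel lines.
    pts = sorted({(a[j] - a[i]) / (b[i] - b[j])
                  for i in range(n) for j in range(i + 1, n)
                  if b[i] != b[j]})
    lo, hi = (pts[0] - 1.0, pts[-1] + 1.0) if pts else (0.0, 1.0)
    for p in pts:                 # F is increasing: bracket its sign change
        if F(p) <= 0:
            lo = p
        else:
            hi = p
            break
    flo, fhi = F(lo), F(hi)       # F is linear on [lo, hi]; interpolate
    return lo - flo * (hi - lo) / (fhi - flo)
```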

An alternative approach is as follows. Let us start the evaluation of F(λ) with λ not specified. Assume that each of the branching decisions of this n-number median computation is based on a comparison between two input numbers only. Denote the solution of the equation F(λ) = 0 (which is of course not known at this point) by λ*. The outcome of the first comparison, performed by the algorithm for evaluating F, depends of course on the numerical value of λ. Specifically, if our median-finding algorithm for evaluating F starts by comparing f_1(λ) and f_2(λ), then the intersection point λ_12 is a critical value for this comparison; namely, for λ > λ_12, f_1(λ) > f_2(λ) while for λ < λ_12, f_1(λ) < f_2(λ), or vice versa. Thus, we may find λ_12, evaluate F(λ_12), and then decide whether λ* > λ_12 or λ* < λ_12 according to the sign of F(λ_12). We can then proceed with the evaluation of F(λ), where λ is still not specified but now restricted either to (-∞,λ_12] or to [λ_12,∞). The same idea is repeatedly applied at the following points of comparison. In general, when we need to compare f_i with f_j and λ is currently restricted to an interval [λ',λ''], then if λ_ij does not lie in the interval the outcome of the comparison is uniform over the interval; otherwise, by evaluating F(λ_ij), we can restrict our interval either to [λ',λ_ij] or to [λ_ij,λ'']. By the correctness of the median algorithm it can readily be seen that we finally restrict ourselves to an interval on which F(λ) is linear.
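The interval-narrowing rule for a single comparison can be written out directly. In this sketch `F` is any increasing evaluator whose root is the unknown λ*, and all names are our own illustration:

```python
def point_inside(lo, hi):
    """Any point strictly inside (lo, hi), tolerating infinite ends."""
    if lo == float("-inf") and hi == float("inf"):
        return 0.0
    if lo == float("-inf"):
        return hi - 1.0
    if hi == float("inf"):
        return lo + 1.0
    return (lo + hi) / 2.0

def resolve_comparison(i, j, lo, hi, a, b, F):
    """One step of the parametric scheme: decide whether f_i < f_j at the
    unknown root lam* of the increasing function F, given only that lam*
    lies in [lo, hi].  Returns (f_i_less_than_f_j, new_lo, new_hi)."""
    if b[i] != b[j]:
        x = (a[j] - a[i]) / (b[i] - b[j])    # intersection point lam_ij
        if lo < x < hi:
            # One F-evaluation tells us on which side of x lam* lies.
            if F(x) <= 0:
                lo = x                       # F increasing => lam* >= x
            else:
                hi = x                       # lam* <= x
    # The outcome is now uniform over (lo, hi): test at any interior point.
    t = point_inside(lo, hi)
    return a[i] + t * b[i] < a[j] + t * b[j], lo, hi
```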

Since O(n) such comparisons are performed in the known median-finding algorithm and each may require an evaluation of F (which amounts to one median-finding), it follows that such an algorithm runs in O(n²) time. Note that [Meg2] mentions a linear time algorithm for this problem that uses a straightforward technique. However, Example III lists references to many algorithms that use a similar technique


and improve on the time complexity of all their known "conventional" counterparts.

The connection to the earlier discussion is as follows. We want to solve the parametrized median problem. For this we use a median algorithm in a comparison model of computation, as was specified above. It should be evident that we actually make very strong use of information which is available from the aforementioned line-by-line description of an execution. Namely, before each comparison we stop the actual execution, make some computations (that use the parameters of the comparison) on the side, and then proceed (from the same point) with the execution of the median algorithm by "feeding" the variables that have to be compared so as to properly affect the result of the comparison. So, we can say that the steps of the execution of the median algorithm were transparent to us and we could intervene in this execution by suspending it and changing the contents of variables. The suspension part is similar to the notion of coroutines in the sense that whenever a coroutine is activated, it resumes execution of its program at the point where the action was last suspended (see [K]). We presented this example as a simple demonstration of the following:

• there is information which is inherent in the execution process of any "reasonable" existing algorithm;

• this information is not contained in the usual output specifications of algorithms;

• it is advantageous to use this information as soon as it is available rather than run the existing algorithm to the end.
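The suspension mechanism maps naturally onto present-day generator coroutines. In this illustrative sketch (insertion sort stands in for a real median-finding algorithm; all names are ours) the routine yields each comparison and the driver resolves it on the side, exactly as described above; for the parametrized problem, `resolve` would evaluate F at λ_ij instead of comparing concrete values:

```python
def median_coroutine(keys):
    """A comparison-based selection routine written as a coroutine: before
    every comparison it suspends itself, yielding the pair of keys to be
    compared, and resumes when the driver sends back True iff the first
    key is smaller.  (Insertion sort for brevity; any comparison-based
    median algorithm fits the same pattern.)"""
    order = list(keys)
    for p in range(1, len(order)):
        q = p
        while q > 0 and (yield (order[q], order[q - 1])):
            order[q], order[q - 1] = order[q - 1], order[q]
            q -= 1
    return order[len(order) // 2]     # the median key

def run(coro, resolve):
    """Drive a suspended computation, resolving each comparison on the
    side -- e.g. by evaluating F at the critical point lam_ij."""
    try:
        i, j = coro.send(None)               # start execution
        while True:
            i, j = coro.send(resolve(i, j))  # resume with the outcome
    except StopIteration as done:
        return done.value

# Ordinary use: resolve comparisons against concrete values.
vals = [7, 1, 5, 9, 3]
m = run(median_coroutine(range(len(vals))), lambda i, j: vals[i] < vals[j])
print(vals[m])    # 5
```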

Remark. Knuth [K] presents examples where coroutines are used for elegant and concise (simulation of) system modeling. However, regarding the use of coroutines for algorithms, Knuth asserts that "It is rather difficult to find short, simple examples of coroutines which illustrate the importance of the idea". It seems that such an example has been found here. Of course, any median-finding algorithm can be modified to solve the parametrized median problem. But if we insist on 'locking its mechanism' and operating only on its output, then the coroutine notion seems to imply a better running time than the subroutine notion. The reason for this is as follows. Suppose that we define all intermediate results and the complete execution to belong to the

output. We would still need to perform the median algorithm all the way from the beginning in order to resume operation from the point at which it was actually aborted in the previous call, since otherwise we would not know which comparison to perform next, as it depends on the results of previous comparisons.

The next section demonstrates a precise characterization of the lucid-box composition technique in some (arbitrarily chosen) computational environment. Section 3 relates the technique to known concepts of reducibilities and discusses implications it might have. Section 4 examines a possible structured-programming description of applications of the lucid-box technique. Examples I, II, and III demonstrate the use of this technique for efficient distributed, parallel and serial computation. We preferred to put the examples at the end in order to enable an uninterrupted discussion of the features of the lucid-box technique.

2. Lucid-Box Compositions of Procedures.

We sketched the notion of lucid-box compositions in the previous section. A precise definition of this notion in a specific random-access-machine (RAM) environment is given in the present section. We hope that this, in conjunction with the examples throughout the paper, will guide the reader to a proper interpretation of lucid-box compositions in computational environments other than the one described in this section. This is the main goal of this section.

We try to give a self-contained presentation. However, in a few cases the reader will be referred to [AHU]. As a means of presentation we specify places in this book where 'patches' are proposed. The reader is assumed to be familiar with the contents of Sections 1.2, 1.3, 1.4 and 1.8, where the random access machine (RAM), the random-access-stored-program machine (RASP) and Pidgin ALGOL are introduced. Unlike conventional programming languages, Pidgin ALGOL programs are meant to be read by a human reader rather than a machine. Pidgin ALGOL permits a succinct presentation of algorithms in this book. A Pidgin ALGOL program can be translated into a RAM or RASP program in a straightforward manner. It is necessary to consider the time


and space required to execute the code corresponding to a Pidgin ALGOL statement on a RAM or RASP. Later we say how to extend Pidgin ALGOL in order to include lucid-box composition of procedures. For the sake of the current presentation we introduce a machine which is slightly different from both the RAM and the RASP. The changes relative to the RAM in the definition of this machine reflect proposed changes in the definition of the extended Pidgin ALGOL with respect to Pidgin ALGOL, thereby implying how to translate statements of the extended Pidgin ALGOL into this machine. We call this new machine the

random-access-memory-and-program machine (RAMP). The only difference between the RAMP and the known RAM is that the program is located in a read-only memory. Recall that the indirect read instruction 'READ *i' means: "copy the register whose number is the content of register i into the accumulator". The 'READ *i' instruction of our machine may refer also to locations (registers) in the program part of the machine. (The instructions are encoded by integers. For an example of a similar encoding see the presentation of the RASP in [AHU].) This book shows that the order of the time (and space) complexities of the RAM and RASP are the same for the same algorithm if the cost of instructions is either uniform or logarithmic. No new ideas are required in order to establish similar relationships between the RAM (or RASP) and the RAMP.
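A toy model of the RAMP's one distinguishing feature can be sketched as follows (the address encoding, `PROGRAM_BASE`, and the class layout are invented for this sketch):

```python
PROGRAM_BASE = 1000   # addresses >= PROGRAM_BASE name program locations

class RAMP:
    """Toy model of the RAMP: writable data registers plus a read-only
    program store that indirect reads may also address."""
    def __init__(self, program, memory_size=100):
        self.program = tuple(program)   # integer-encoded, read-only
        self.mem = [0] * memory_size    # ordinary writable registers
        self.acc = 0                    # the accumulator

    def load(self, addr):
        """Fetch the content of any addressable location."""
        if addr >= PROGRAM_BASE:
            return self.program[addr - PROGRAM_BASE]
        return self.mem[addr]

    def read_indirect(self, i):
        """'READ *i': copy into the accumulator the location whose number
        is the content of register i -- possibly a program location."""
        self.acc = self.load(self.mem[i])

machine = RAMP(program=[11, 22, 33])
machine.mem[5] = PROGRAM_BASE + 1   # register 5 points into the program
machine.read_indirect(5)
print(machine.acc)    # 22
```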

Let P_1, P_2, ... be existing programs which are written for a RAMP. We present a framework for specifying a new program P by using these programs. We do it in two stages. First, we specify P for a model of computation which contains the RAMP. Later, we show a possible way of translating P into the RAMP. This may clarify the choice of the RAMP.

The model of computation for which P is defined contains the RAMP in the following sense. It employs a sequence of RAMPs R_0, R_1, ... . The main program for P is located in R_0. This program may be constructed out of the usual (optionally labeled) RAMP instructions with respect to R_0. In addition, the RAMPs R_1, R_2, ... are attached to R_0 in a "slave-master" relation. Any one of P_1, P_2, ... may be run on each of these RAMPs in the usual way, with one exception:

The RAMP R_i, i ≥ 1, has a distinguished additional square called "the dominator of R_i" (d(R_i) for short) that enables R_0 to control its operation. P_j is changed so that the following pair of instructions must be accessed before every access to any of its instructions.

1. Enter 'red' into d(R_i).

2. Proceed only if d(R_i) contains 'green'. (Only R_0 may enter 'green' into d(R_i), and R_i has to wait until R_0 does so.)

The program for R_0 may also have instructions for:

1. Performing all the RAMP instructions with respect to any of the input, memory or output squares of each R_i (with respect to the accumulator of R_0).

2. Starting any P_j on any R_i.

3. Entering 'green' into d(R_i), i ≥ 1.

4. Reading the P_j instructions, and especially the next instruction to be executed, into the accumulator of R_0.
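The red/green handshake can be sketched with present-day synchronization primitives; the two-step slave program and all names here are invented for illustration:

```python
import threading

class Dominator:
    """The distinguished square d(R_i): the slave enters 'red' and waits;
    only the master may enter 'green'."""
    def __init__(self):
        self._red = threading.Event()
        self._green = threading.Event()

    def request(self):        # slave side, before every instruction
        self._red.set()
        self._green.wait()
        self._green.clear()

    def paused(self):         # master side: wait until the slave is suspended
        self._red.wait()

    def grant(self):          # master side: enter 'green' into d(R_i)
        self._red.wait()
        self._red.clear()
        self._green.set()

def slave(prog, mem, dom):
    """Run a slave program, pausing at the dominator before each step."""
    for instr in prog:
        dom.request()
        instr(mem)

# The master starts a two-step slave program and rewrites slave memory
# while the slave is suspended between its steps.
mem = {"x": 0}
dom = Dominator()
prog = [lambda m: m.update(x=m["x"] + 1),
        lambda m: m.update(x=m["x"] * 10)]
worker = threading.Thread(target=slave, args=(prog, mem, dom))
worker.start()
dom.grant()        # step 1 runs: x becomes 1
dom.paused()       # the slave is now suspended before step 2
mem["x"] = 5       # the master overwrites slave memory mid-run
dom.grant()        # step 2 runs on the edited memory: x becomes 50
worker.join()
print(mem["x"])    # 50
```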

We have already made our point in the previous section regarding how to relate this composition scheme to our introductory example. Our procedure starts the median algorithm on R_1. Whenever a comparison between two (copies of) inputs of the median algorithm is going to take place, our procedure does the following. It 'suspends' the median algorithm, makes all the side computations required, determines the result of the comparison by assigning fictitious values to the corresponding locations of R_1, and resumes the operation of the median algorithm.

Let us overview a possible way of mapping our composed procedure into a single RAMP denoted S.

1. Proper versions of the main program for P and the programs P_1, P_2, ... are located in the program part of S.

2. An easy-to-compute mapping from the input, memory, output and location-counter locations of R_1, R_2, ... and the memory and location-counter locations of R_0 into the memory of S has to be defined. Such a mapping has to use as little memory space of S as possible. We present one possible mapping. It seems to be especially appropriate for the (difficult) case where the number of RAMPs which are actually employed (denoted by n) and the maximum size of a RAMP's memory over all their uses (denoted by m) are not known in advance. Some easier cases may readily result in more efficient mappings. Let


the serial number N(i,j) of location i of RAMP R_j (i ≥ 1, j ≥ 0) be the cardinality of the set

{(ℓ,k) | ((ℓ + k < i + j) OR (ℓ + k = i + j AND k < j)) AND ℓ ≥ 1 AND k ≥ 0}

(Intuitively, the array of pairs (ℓ,k), ℓ ≥ 1 and k ≥ 0, is sorted in lexicographic order: first by the (finite length) diagonal that crosses a pair and second by the serial number of this pair on this diagonal). So, we map location i of RAMP R_j into memory location N(i,j)+c of RAMP S, where c is some constant. This guarantees that S uses O((n+m)²) memory locations.
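Counting the set diagonal by diagonal gives a closed form for N(i,j); the sketch below checks the closed form against direct counting and verifies that the mapping is one-to-one:

```python
def N_by_counting(i, j):
    """Serial number of location i of RAMP R_j (i >= 1, j >= 0): the
    number of pairs (l, k), l >= 1, k >= 0, that precede (i, j) when
    pairs are sorted by diagonal l + k and then by position k on it."""
    return sum(1 for l in range(1, i + j + 1)
                 for k in range(0, i + j + 1)
                 if (l + k < i + j) or (l + k == i + j and k < j))

def N_closed_form(i, j):
    """Diagonal l + k = d contains exactly d pairs with l >= 1, k >= 0,
    which yields a closed form for the same count."""
    d = i + j
    return (d - 1) * d // 2 + j

assert all(N_closed_form(i, j) == N_by_counting(i, j)
           for i in range(1, 9) for j in range(0, 9))
```

With i ≤ m and j ≤ n this gives N(i,j) ≤ (n+m-1)(n+m)/2 + n, consistent with the O((n+m)²) space bound.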

3. The transition of control, which is done by updating the d(R_i)'s, can be readily simulated. For this, 'activate' the location of S which corresponds to the location-counter of R_i.

It should be clear that if the number of steps in the 'source' composed procedure is T, then the number of steps in the translation into the single RAMP is O(T).

Let us go back to the extended Pidgin ALGOL. Procedures can now be invoked either by calling them for the first time or by resuming their operation. Their complete execution (in the way it is described in the introduction) can be used.

Examples I, II and III demonstrate more instances where this power of composition of procedures is useful. The reader is advised to read them at this stage. [V2] contains examples of applications of the lucid-box composition technique in synchronous parallel and distributed computation. One of these examples is summarized in Example I. Example II reviews another application, for synchronous distributed computation, taken from [V1]. A lucid-box composition that uses sorting and merging networks is given. Example III includes references for further applications in parametrized computing. An application in asynchronous distributed computation can be found in [V3]. In order to keep this presentation within reasonable length we avoid saying more about it.

3. Relations with Other Notions of Composition.

We first refer to two reducibilities which are commonly used.


Definitions of widely used terms are sometimes omitted. They can be found in [GJ]. The most popular technique in the literature for showing that an algorithm for one decision problem can be used for the solution of another decision problem uses transformation reducibilities. That is, a constructive transformation that maps any instance of the first problem into an equivalent instance of the second is given. Such a transformation enables us to convert any algorithm for the second problem into a corresponding algorithm for the first problem. The well established notion of composition of functions suggested the composition of a function on an existing one, thereby creating a new function. This is somewhat similar to the way transformation reducibilities are defined. However, the fact that an execution of an existing algorithm may include much more information than its output is actually ignored by the transformation reducibility. On the other hand, the extensive applicability of this reducibility suggests that in spite of its narrowness it often focuses on the right things.

The known generalization of transformation reducibility (see [GJ]) strengthens our point of similarity to composition of functions even more. A Turing reduction from one (search or decision) problem to another is an algorithm that solves the first problem by using a hypothetical subroutine for the second problem. This subroutine can be called more than once. Each time the subroutine is called it operates (from the point of view of the algorithm) as the function it realizes; i.e., the input for an application of this subroutine is written by the algorithm in some specified memory location and then the subroutine responds by writing the output in some specified memory location. A similar notion of operation is sometimes referred to as 'black boxes'. Polynomial time transformation reducibility is often called "Karp-reducibility" while polynomial time Turing reducibility is often called "Cook-reducibility". See Section 5.2 in [GJ] for a history of the terminology.
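A standard concrete instance of such a black-box (Turing) reduction, sketched for subset sum, with an invented brute-force oracle standing in for the hypothetical subroutine:

```python
def find_subset(nums, target, has_subset):
    """Search-to-decision Turing reduction for subset sum: `has_subset`
    is used strictly as a black box -- we pose it a question and read
    back only a yes/no answer, once per element."""
    if not has_subset(nums, target):
        return None
    chosen, remaining, t = [], list(nums), target
    for x in list(nums):
        remaining.remove(x)
        if not has_subset(remaining, t):  # t unreachable without x?
            chosen.append(x)              # then x is forced into the answer
            t -= x
    return chosen

def has_subset(nums, target):
    """Brute-force decision oracle standing in for the hypothetical
    subroutine (exponential; only the interface matters here)."""
    sums = {0}
    for x in nums:
        sums |= {s + x for s in sums}
    return target in sums

print(find_subset([3, 5, 2, 7], 10, has_subset))   # [3, 7]
```

Note that the reduction sees nothing of the oracle's execution; a lucid-box reduction would additionally be free to suspend it and inspect its intermediate state.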

We already implied that we can alternate freely between "compositions" and "reducibilities". Thus, we can use lucid-box compositions for reducibilities between procedures. Since we called


the composition "lucid-box composition" we call the reducibility

"lucid-box reducibility".

Transformation and Turing reducibilities are instances of lucid-box reducibilities, since they permit us to define reducibilities among sets of solutions for problems. Unlike these reducibilities, lucid-box reducibilities permit us:

(1) To restrict sets of solutions.

In our examples we demonstrated how useful it might be to restrict ourselves to comparison models of computation, to force the copying assumption, or to deal with comparison networks only.

(2) To use the full information in the execution of procedures and not

only their output.

(3) To relate closely the resource complexity of the new procedure to

the resource complexity of the procedures which are used by the

lucid-box reducibility.

Our examples demonstrate the applicability of lucid-box compositions or reducibilities for the design and specification of efficient algorithms and in cases where multi-parameter complexity optimization is required. As we already implied in the examples, multi-parameter optimization is typical of parallel and distributed computation environments, where there is a need to optimize simultaneously: time, sizes of local memories, communication load on many lines, etc.

(September 83)

Uzi Vlshkin

Courant Institute, New York University

Ultracomputer Note //65

ABSTRACT. An explicit understanding of the opportunity for

constructing new algorithms out of existing (or supposedly existing)

algorithms is presented.

Say that B, A^ , A2 , ... are problems. We present an abstract

setting that provides for the effective use of algorithms for problems

Aj , A2,... for the design of efficient algorithms for problem B. A

notion of "lucid boxes" which is an extension of "black boxes" is

introduced for this purpose.

We exemplify the applicability of these lucid-box compositions (or

reducibilities) for design and specification of efficient algorithms

and when multi-parameter complexity optimization is required.

Multi-parameter optimization is typical to parallel and distributed

computation environments, where there is need to optimize

simultanously : time, sizes of local memories, communication load on

many lines , etc. .

-2-

1 . In troduction

Every existing algorithm represents some knowledge that has been

acquired. It Is the responsibility of theoreticians In the field to

explore every opportunity to utilize the knowledge accumulated so far

for the future design of algorithms. Often It is easier to use

existing efficient algorithms for another problem in related models of

computation rather than designing a completely new algorithm.

For later in the introduction we need the following. It is well

known that sequential execution of a computer program, say P, on some

input I can be described similarly to a proof in mathematical logic.

Associate a line with each step of this execution. A line that

corresponds to step s (> 1) includes the sequence V of all input and

program (including output) variables and their contents at the

beginning of the step. Next to it, specify the (atomic) instruction

(with respect to a machine or a programming language) which is executed

in step s. The sequence V at the beginning of step s+1 is written in

the next line which is associated with this step.

We outline a methodological framework for the design of efficient

algorithms which is simple, uniform and general. It provides for both:

(1) a direct design of a procedure from atomic instructions, and (2) a

build-up of a composed procedure from a sequence of given procedures.

While, for direct design we suggest using known ways; we introduce some

ideas for composition of new procedures from other (specified or not

fully specified) procedures. It will be evident that other known ways

for composition of procedures can be derived efficiently from our

framework. Our framework takes full advantage of the line-by-line

execution description of existing programs given above. Therefore, we

coin the name lucid boxes to the way in which existing programs are

used. This is in sharp contrast to the concept of 'black boxes', where

only predeclared outputs of existing programs are transparent to

procedures which are composed out of these existing procedures.

Examples, for which the full computational power of our framework

is necessary are presented. Thereby, the need for this framework is

being buttressed. Actually, some of our more powerful examples are

when we need to compose a new procedure out of hypothetically existing

procedures. Examples I, II and III demonstrate the applicability of our

-3-

framework for the design of efficient sequential, parallel and

distributed algorithms or for proving theorems about their existence.

Such typical theorems assert the performance evaluation of new

algorithms as functions of the performance parameters of other

algorithms. Multi-parameter optimization is often required in parallel

and distributed computation environments, where there is need to

optimize simultanously : time, sizes of local memories, communication

load on many lines, etc.. It is conjectured that there is no way to

derive the same theorems using a "black-box build-up" of a composed

procedure from hypothetically existing ones.

We show that this framework encompasses also known techniques for

reducibilities among problems. The terras 'reducibilty ' and

'composition' are used to refer to the same operation. The use of

either terra relates to the significance of the result rather than to

the result itself.

In the present paper we say that a serial algorithm for a problem

is efficient if its running time is bounded by the order of a "low"

degree polynomial in the length of the input and there is no known

algorithm that achieves a better assymptotlcal time bound for the same

problem. This is different from the weaker (less pragmatical but more

theoretically robust) notion of efficient algorithras as used in the

theory of NP-completeness where any polynoraial tirae algorithm is

defined to be efficient. While we give evidence that lucid-box

compositions would probabely not affect the theories that employ this

weaker definition of efficient algorithms, it will certainly have an

impact on raore pragmatical directions In design of algorithms (as

Iraplied by our examples) where our definition of efficient algorithms

Is raore appropriate.

It is interesting to note that lucid-box compositions actually

suggest an answer to a problem suggested implicitly by Aho, Hopcroft,

and Ullman [AHU] . They give a non-standard definition of

NP-completeness ("A language Lq is NP-complete if the following

condition is satisfied: If we are given a deterministic algorithm of

time complexity T(n)>n to recognize Lq, then for every every language L

in NP we can effectively (underlined by the author) find a

deterministic algorithm of time complexity T(pT(n)), where p^ is a

-4-

polynoralal that depends on L"). They do not specify in the text what

is meant by an 'effective' use of an algorithm for the purpose of

obtaining another algorithm. We propose an e xp 1 i c i t understanding of

the term 'effectively' in this definition of NP-corapleteness. (Not for

polynomial time reducibilities only.)

We would like to say that this paper does not belong to the

conventional specific field of Programming Languages. On one hand we

point at an apparent deficiency of conventional programming

methdologies and suggest a wide enough framework to overcome this

deficiency. Still there is much additional work to be done in order to

adapt our framework to various existing programming methodologies and

languages. Section 4 may be viewed as a first (very small) step in

this direction.

The simple example that follows points at unsatisfactory features

of the commonly used "black-box-techniques" for compositions of

procedures since they seem not to cope well with the problem of

specifying the algorithm being described. It also illustrates some of

the more formal definitions of the next section.

An introductory example

A similar example to the one which is presented below was given in

[Meg2] for other purposes. More on the family of examples which is

represented by this example can be found in Example III. I find this

example both simple and intriguing.

Let f_i(λ) = a_i + λ·b_i, i = 1,...,n, be pairwise distinct increasing functions of λ (b_i > 0). For every λ, let F(λ) denote the median of the set {f_1(λ),...,f_n(λ)}. Obviously, F(λ) is a piecewise linear monotone increasing function with O(n^2) breakpoints. Given λ, F(λ) can be evaluated in O(n) time [AHU] (once the f_i(λ)'s have been computed). The parametrized median problem: Solve the equation F(λ) = 0. One possible way of solving this problem is to first identify the set of intersection points I = {λ_ij : a_i + λ_ij·b_i = a_j + λ_ij·b_j (i ≠ j)}. Every breakpoint of F is equal to one of the λ_ij's. We can thus search the set I ∪ {-∞,+∞} for two values λ^1, λ^2 such that F(λ^1) < 0 < F(λ^2) and such that there is no λ_ij in the open interval (λ^1,λ^2). Once λ^1 and λ^2 have been found, we can solve the equation readily since F is linear over [λ^1,λ^2]. The search for such an interval requires O(log n) F-evaluations and (if we do not wish to presort the set of λ_ij's) finding medians of subsets of the set of λ_ij's whose cardinalities are n^2, n^2/2, n^2/4, ... . Thus, such an algorithm runs in O(n^2) time, which is dominated by the evaluation of all the intersection points.
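In modern notation this first approach can be sketched as follows. The sketch is ours (all function names are our own), assuming the lines f_i(x) = a[i] + x*b[i] are given by coefficient lists with every b[i] > 0, n odd (so the median is unique), and not all slopes equal:

```python
from statistics import median

def solve_parametrized_median(a, b):
    """Solve F(x) = 0 where F(x) = median of {a[i] + x*b[i]}; all b[i] > 0, n odd."""
    n = len(a)
    f = lambda i, x: a[i] + x * b[i]
    F = lambda x: median(f(i, x) for i in range(n))

    # Evaluate all O(n^2) pairwise intersection points; the breakpoints of F
    # are among them.  This step dominates the running time.
    pts = sorted({(a[j] - a[i]) / (b[i] - b[j])
                  for i in range(n) for j in range(i + 1, n) if b[i] != b[j]})

    # Binary search (O(log n) F-evaluations) for the last point p with F(p) <= 0.
    lo, hi = -1, len(pts) - 1
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if F(pts[mid]) <= 0:
            lo = mid
        else:
            hi = mid - 1

    # The root lies in an interval with no breakpoint in its interior, so F
    # coincides there with a single line f_k; identify k at a probe and solve.
    left = pts[lo] if lo >= 0 else pts[0] - 1.0
    right = pts[lo + 1] if lo + 1 < len(pts) else pts[-1] + 1.0
    probe = (left + right) / 2
    vals = [f(i, probe) for i in range(n)]
    k = vals.index(median(vals))
    return -a[k] / b[k]
```

(For simplicity the sketch searches the sorted list of intersection points directly rather than performing the repeated median-finding mentioned above; the O(n^2) bottleneck, computing all of I, is the same.)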

An alternative approach is as follows. Let us start the evaluation of F(λ) with λ not specified. Assume that each of the branching decisions of this n-number median computation is based on a comparison between two input numbers only. Denote the solution of the equation F(λ) = 0 (which is of course not known at this point) by λ*. The outcome of the first comparison, performed by the algorithm for evaluating F, depends of course on the numerical value of λ. Specifically, if our median-finding algorithm for evaluating F starts with comparing f_1(λ) and f_2(λ), then the intersection point λ_12 is a critical value for this comparison; namely, for λ > λ_12, f_1(λ) > f_2(λ) while for λ < λ_12, f_1(λ) < f_2(λ), or vice versa. Thus, we may find λ_12, evaluate F(λ_12), and then decide whether λ* > λ_12 or λ* < λ_12 according to the sign of F(λ_12). We can then proceed with the evaluation of F(λ), where λ is still not specified but now restricted either to (-∞,λ_12] or to [λ_12,∞). The same idea is repeatedly applied at the following points of comparison. In general, when we need to compare f_i with f_j and λ is currently restricted to an interval [λ',λ''], then if λ_ij does not lie in the interval then the outcome of the comparison is uniform over the interval; otherwise, by evaluating F(λ_ij), we can restrict our interval either to [λ',λ_ij] or to [λ_ij,λ''].

By the correctness of the median algorithm it can readily be seen that we finally restrict ourselves to an interval in which F(λ) is linear.
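The comparison-interception scheme above is naturally expressed today with a coroutine (a Python generator). The sketch below is ours: for brevity a toy insertion-sort selection stands in for the linear-time median algorithm, so its comparison count is not O(n); only the interception mechanism is the point.

```python
from statistics import median

def median_index_by_comparisons(n):
    """Comparison-driven selection written as a generator (a coroutine): it
    yields a request (i, j), is resumed with the answer True iff item i < item j,
    and finally returns the index of the median (n odd).  A toy insertion sort
    stands in for the linear-time median algorithm."""
    order = list(range(n))
    for t in range(1, n):
        p = t
        while p > 0:
            if (yield (order[p], order[p - 1])):   # suspend; resumed with outcome
                order[p], order[p - 1] = order[p - 1], order[p]
                p -= 1
            else:
                break
    return order[n // 2]

def parametrized_median_root(a, b):
    """Solve F(x) = 0 for F(x) = median(a[i] + x*b[i]); all b[i] > 0, n odd."""
    n = len(a)
    F = lambda x: median(a[i] + x * b[i] for i in range(n))
    lo, hi = float("-inf"), float("inf")      # current interval containing the root
    g = median_index_by_comparisons(n)
    try:
        request = next(g)                     # run until the first comparison
        while True:
            i, j = request
            if b[i] == b[j]:                  # parallel lines: outcome uniform in x
                outcome = a[i] < a[j]
            else:
                x_ij = (a[j] - a[i]) / (b[i] - b[j])
                if lo < x_ij < hi:            # critical value inside: one F-evaluation
                    if F(x_ij) <= 0:
                        lo = x_ij             # root lies to the right of x_ij
                    else:
                        hi = x_ij             # root lies to the left of x_ij
                # right of x_ij the sign of f_i - f_j follows the slopes;
                # left of x_ij it is reversed
                outcome = (b[i] < b[j]) if x_ij <= lo else (b[i] > b[j])
            request = g.send(outcome)         # resume the suspended computation
    except StopIteration as done:
        k = done.value                        # f_k is the median throughout (lo, hi)
        return -a[k] / b[k]
```

Each intercepted comparison costs at most one F-evaluation, which is what yields the O(n^2) bound when a genuinely O(n)-comparison median algorithm is driven this way.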

Since O(n) such comparisons are being performed in the known median-finding algorithm and each may require an evaluation of F (which amounts to one median-finding), it follows that such an algorithm runs in O(n^2) time. Note that [Meg2] mentions a linear time algorithm for this problem that uses a straightforward technique. However, Example III lists references to many algorithms that use a similar technique and improve on the time complexity of all their known "conventional" counterparts.

The connection to the earlier discussion is as follows. We want

to solve the parametrized median problem. For this we use a median

algorithm in a comparison model of computation, as was specified above.

It should be evident that we actually make a very strong use of

information which is available from the aforementioned line-by-line

description of an execution. Namely, before each comparison we stop

the actual execution, make some computations (that use the parameters

of the comparison) on the side; and then proceed (from the same point)

with the execution of the median algorithm by "feeding" the variables

that have to be compared so as to properly affect the result of the

comparison. So, we can say that steps of the execution of the median

algorithm were transparent to us and we could intervene in this

execution by suspending it and changing contents of variables. The

suspension part is similar to the notion of coroutines in the sense

that whenever a coroutine is activated, it resumes execution of its

program at the point where the action was last suspended (see [K]). We

presented this example for a simple demonstration of the following:

• there is information which is inherent in the execution process of any "reasonable" existing algorithm;

• this information is not contained in usual output specifications of algorithms;

• it is advantageous to use this information as soon as it is available rather than run the existing algorithm to the end.

Remark. Knuth [K] presents examples where coroutines are used for

elegant and concise (simulation of) system modeling. However,

regarding the use of coroutines for algorithms Knuth asserts that "It

is rather difficult to find short, simple examples of coroutines which

illustrate the importance of the idea". It seems that such an example

has been found here. Of course, any median-finding algorithm can be

modified to solve the parametrized median problem. But if we insist on

'locking its mechanism' and operating only on its output then the

coroutine notion seems to imply better running time than the subroutine

notion. The reason for this is as follows. Suppose that we define all intermediate results and the complete execution to belong to the output. We still need to perform the median algorithm all the way from the beginning in order to resume operation from the point at which it was actually aborted in the previous call, since otherwise we would not know which comparison to perform next, as it depends on the results of previous comparisons.
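The cost gap described in the Remark can be made concrete. In the sketch below (ours, with a toy comparison-driven computation standing in for the median algorithm), the subroutine-style interface must restart and replay all t recorded answers to reach comparison t+1, giving Θ(T^2) total work over T comparisons, whereas a live generator resumes each time in O(1).

```python
def comparisons(n):
    """A stand-in comparison-driven computation: it yields requests (i, j),
    is resumed with boolean answers, and finally returns a count."""
    count = 0
    for i in range(n - 1):
        if (yield (i, i + 1)):
            count += 1
    return count

def next_request_via_subroutine(n, recorded):
    """Black-box style: restart from scratch and replay every recorded answer
    just to reach the next suspension point -- Theta(t) work per call."""
    g = comparisons(n)
    req = next(g)
    for ans in recorded:
        req = g.send(ans)
    return req

# Coroutine style: the suspended generator simply resumes, no replaying.
g = comparisons(4)
live = [next(g), g.send(True)]
# Subroutine style: the same second request costs a full replay.
replayed = next_request_via_subroutine(4, [True])
```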

The next section demonstrates a precise characterization of the

lucid-box composition technique in some (arbitrarily chosen)

computational environment. Section 3 relates the technique to known

concepts of reducibilities and discusses implications it might have.

Section 4 examines a possible structured-programming description of

applications of the lucid-box technique. Examples I, II, and III

demonstrate the use of this technique for efficient distributed,

parallel and serial computation. We preferred to put the examples at

the end in order to enable an uninterrupted discussion of the features

of the lucid-box technique.

2 . Lucid-box Compositions of Procedures .

We sketched the notion of lucid-box compositions in the previous

section. A precise definition of this notion in a specific

random-access-machine (RAM) environment is given in the present

section. We hope that this, in conjunction with the examples

throughout the paper, will guide the reader to a proper interpretation

of lucid-box compositions in other computational environments than the

one described in this section. This is the main goal of this section.

We try to give a self-contained presentation. However, in a few

cases the reader will be referred to [AHU]. As a means of presentation

we specify places in this book where 'patches' are proposed. The

reader is assumed to be familiar with the contents of sections 1.2,

1.3, 1.4 and 1.8 where the random access machine (RAM), the

random-access-stored-program-machine (RASP) and Pidgin ALGOL are

introduced. Unlike conventional programming languages Pidgin ALGOL

programs should be read by a human reader rather than a machine.

Pidgin ALGOL permits a succinct presentation of algorithms in this

book. A Pidgin ALGOL program can be translated into a RAM or RASP

program in a straightforward manner. It is necessary to consider the time and space required to execute the code corresponding to a Pidgin ALGOL statement on a RAM or RASP. Later we say how to extend Pidgin ALGOL in

order to include lucid-box composition of procedures. For the sake of

the current presentation we introduce a machine which is slightly

different from both the RAM and RASP. The changes relative to the RAM

in the definition of this machine reflect proposed changes in the

definition of the extended Pidgin ALGOL with respect to Pidgin ALGOL;

thereby, implying how to translate statements of the extended Pidgin

ALGOL into this machine. We call this new machine

random-access-memory-and-program machine (RAMP). The only difference

between the RAMP and the known RAM is that the program is located in a

read-only memory. Recall that the indirect read instruction 'READ *i'

means: "copy the register whose number is the content of register i

into the accumulator". The 'READ *i' instruction of our machine may

refer also to locations (registers) in the program part of the machine.

(The instructions are encoded by integers. For an example of a similar

encoding see the presentation of the RASP in [AHU].) This book shows

that the order of time (and space) complexities of the RAM and RASP are

the same for the same algorithm if the cost of instructions is either

uniform or logarithmic. No new ideas are required in order to

establish similar relationships between the RAM (or RASP) and the RAMP.

Let P_1,P_2,... be existing programs which are written for a RAMP.

We present a framework for specifying a new program P by using these

programs. We do it in two stages. First, we specify P for a model of

computation which contains the RAMP. Later, we show a possible way of

translating P into the RAMP. This may clarify the choice of the RAMP.

The model of computation for which P is defined contains the RAMP in the following sense. It employs a sequence of RAMPs R_0,R_1,... . The main program for P is located in R_0. This program may be constructed out of the usual (optionally) labeled RAMP instructions with respect to R_0. In addition, the RAMPs R_1,R_2,... are attached to R_0 in a "slave-master" relation. Any one of P_1,P_2,... may be run on each of these RAMPs in the usual way with one exception:

The RAMP R_i, i ≥ 1, has a distinguished additional square called "the dominator of R_i" (d(R_i) for short) that enables R_0 to control its operation. P_i is changed so that the following pair of instructions must be accessed before every access to any of its instructions.

1. Enter 'red' into d(R_i).

2. Proceed only if d(R_i) contains 'green'. (Only R_0 may enter 'green' into d(R_i) and R_i has to wait until R_0 does so.)

The program for R_0 may also have instructions for:

1. Performing all the RAMP instructions with respect to any of the input, memory or output squares of each R_i (with respect to the accumulator of R_0).

2. Starting any P_j on any R_i.

3. Entering 'green' into d(R_i), i ≥ 1.

4. Reading the P_i instructions and especially the next instruction to be executed into the accumulator of R_0.
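The red/green handshake above can be simulated with threads. The sketch below is our stand-in (the class and function names are ours, not the paper's): a condition variable plays d(R_i), a Python list plays the memory of R_i, and the main thread plays R_0.

```python
import threading

class Dominator:
    """The square d(R_i): the slave enters 'red' and waits for 'green';
    only the master may enter 'green'."""
    def __init__(self):
        self.color = 'green'
        self.cv = threading.Condition()

    def request(self):                   # slave: the pair of instructions above
        with self.cv:
            self.color = 'red'           # 1. enter 'red' into d(R_i)
            self.cv.notify_all()
            self.cv.wait_for(lambda: self.color == 'green')  # 2. wait for 'green'

    def wait_red(self):                  # master: wait until the slave suspends
        with self.cv:
            self.cv.wait_for(lambda: self.color == 'red')

    def grant(self):                     # master: enter 'green' into d(R_i)
        with self.cv:
            self.color = 'green'
            self.cv.notify_all()

def slave(d, mem):
    for step in range(3):                # three 'instructions' of P_i
        d.request()
        mem.append(step)

d, mem = Dominator(), []
t = threading.Thread(target=slave, args=(d, mem))
t.start()
for _ in range(3):
    d.wait_red()                         # R_0 may now read or overwrite mem
    d.grant()
t.join()
```

Between wait_red() and grant() the slave is provably suspended, which is exactly the window in which R_0 may inspect and alter R_i's squares, as in the introductory example.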

We have already made our point in the previous section regarding

how to relate this composition scheme to our introductory example. Our

procedure starts the median algorithm on R_1. Whenever a comparison between two (copies of) inputs of the median algorithm is going to take place, our procedure does the following. It 'suspends' the median algorithm, makes all the side computations required, determines the result of the comparison by assigning fictitious values to the corresponding locations of R_1 and resumes the operation of the median

algorithm.

Let us overview a possible way for mapping our composed procedure

into a single RAMP denoted S.

1. Proper versions of the main program for P and the programs P_1,P_2,... are located in the program part of S.

2. An easy-to-compute mapping from the input, memory, output and location-counter locations of R_1,R_2,..., and the memory and location-counter locations of R_0, into the memory of S has to be defined. Such a mapping has to use as little memory space of S as possible. We present one possible mapping. It seems to be especially appropriate for the (difficult) case where the number of RAMPs which are actually employed (denoted by n) and the maximum size of a RAMP's memory over all their uses (denoted by m) are not known in advance.

Some easier cases may readily result in more efficient mappings. Let


the serial number N(i,j) of location i of RAMP R_j (i ≥ 1, j ≥ 0) be the cardinality of the set

{(ℓ,k) | ((ℓ + k < i + j) OR (ℓ + k = i + j AND k < j)) AND ℓ ≥ 1 AND k ≥ 0}

(Intuitively, the array of pairs (ℓ,k), ℓ ≥ 1 and k ≥ 0, is sorted in a lexicographic order: first by the (finite length) diagonal that crosses a pair and second by the serial number of this pair on this diagonal.) So, we map location i of RAMP R_j into memory location N(i,j)+c of RAMP S, where c is some constant. This guarantees that S uses O((n+m)^2) memory locations.
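From the definition one obtains the closed form N(i,j) = (i+j)(i+j-1)/2 + j (our derivation: the full diagonals before diagonal i+j contribute 1 + 2 + ... + (i+j-1) pairs, and diagonal i+j itself contributes the j pairs with k < j). A quick sketch checking the closed form against the definition, together with injectivity and the space bound:

```python
def N(i, j):
    """Serial number of location i of RAMP R_j; closed form of the definition."""
    return (i + j) * (i + j - 1) // 2 + j

def N_by_counting(i, j):
    """The cardinality in the definition, counted directly."""
    return sum(1
               for l in range(1, i + j + 1)
               for k in range(0, i + j + 1)
               if (l + k < i + j) or (l + k == i + j and k < j))

# The mapping is injective and stays within O((n+m)^2) locations.
m, n = 6, 5                              # max memory size and number of RAMPs
cells = {(i, j): N(i, j) for i in range(1, m + 1) for j in range(n + 1)}
assert all(N(i, j) == N_by_counting(i, j) for (i, j) in cells)
assert len(set(cells.values())) == len(cells)          # injective
assert max(cells.values()) < (n + m) ** 2              # space bound
```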

3. The transition of control, which is done by updating the d(R_i)'s, can be readily simulated. For this, 'activate' the location of S which corresponds to the location-counter of R_i.

It should be clear that if the number of steps in the 'source' composed procedure is T, then the number of steps in the translation into the single RAMP is O(T).

Let us go back to the extended Pidgin ALGOL. Procedures can now

be invoked by either calling them for the first time or resuming their

operation. Their complete execution (in the way it is described in the

introduction) can be used.

Examples I, II and III demonstrate more instances where this power

of composition of procedures is useful. The reader is advised to read

them at this stage. [V2] contains examples for applications of the

lucid-box composition technique in synchronous parallel and distributed

computation. One of these examples is summarized in Example I. Example

II reviews another application for synchronous distributed computation

taken from [VI]. A lucid-box composition that uses sorting and merging

networks is given. Example III includes references for further

applications in parametrized computing. An application in asynchronous

distributed computation can be found in [V3] . In order to keep this

presentation within reasonable length we avoid saying more about it.

3 . Relations with Other Notions of Composition .

We first refer to two reducibilities which are commonly used.


Definitions of widely used terms are sometimes omitted. They can be

found in [GJ] . The most popular technique in the literature for

showing that an algorithm for one decision problem can be used for a

solution of another decision problem uses transformation

reducibilities . That is, a constructive transformation that maps any

instance of the first problem into an equivalent instance of the second

is given. Such a transformation enables us to convert any algorithm

for the second problem into a corresponding algorithm for the first

problem. The well established notion of composition of functions

suggested the composition of a function on an existing one, thereby

creating a new function. This is somewhat similar to the way

transformation reducibilities are defined. However, the fact that an

execution of an existing algorithm may include much more information

than its output is actually ignored by the transformation reducibility.

On the other hand, the extensive applicability of this reducibility

suggests that in spite of its narrowness it often focuses on the right

things.

The known generalization of transformation reducibility (see [GJ])

strengthens our point of similarity to composition of functions even

more. A Turing reduction from one (search or decision) problem to

another is an algorithm that solves the first problem by using a

hypothetical subroutine for the second problem. This subroutine can be

called more than once. Each time the subroutine is called it operates

(from the point of view of the algorithm) as the function it realizes;

i.e., the input for an application of this subroutine is written by the

algorithm in some specified memory location and then the subroutine

responds by writing the output in some specified memory location. A

similar notion of operation is sometimes referred to as 'black boxes'.

Polynomial time transformation reducibility is often called

"Karp-reducibility" while polynomial time Turing reducibility is often

called "Cook-reducibility". See Section 5.2 in [GJ] for a history of

terminology.
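As a concrete reminder of how a Turing reduction treats its subroutine as a black box, here is a standard textbook illustration (ours, not from this paper): recovering a satisfying assignment by repeatedly consulting a hypothetical decision subroutine for satisfiability, once per variable, observing only its yes/no answers.

```python
import itertools

def find_assignment(variables, is_satisfiable):
    """Turing reduction from search to decision: is_satisfiable(partial) is the
    black box, asked whether some satisfying assignment extends `partial`."""
    partial = {}
    if not is_satisfiable(partial):
        return None
    for v in variables:
        partial[v] = True
        if not is_satisfiable(partial):  # one oracle call per variable
            partial[v] = False
    return partial

# A brute-force stand-in oracle for the toy CNF (x or y) and (not x or z).
clauses = [[('x', True), ('y', True)], [('x', False), ('z', True)]]
variables = ['x', 'y', 'z']

def is_satisfiable(partial):
    free = [v for v in variables if v not in partial]
    for bits in itertools.product([False, True], repeat=len(free)):
        full = dict(partial, **dict(zip(free, bits)))
        if all(any(full[v] == want for v, want in c) for c in clauses):
            return True
    return False

assignment = find_assignment(variables, is_satisfiable)
```

Note that the reduction uses nothing but the oracle's output; the lucid-box view would, in addition, let it inspect and steer the oracle's execution.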

We already implied that we can alternate freely between "compositions" and "reducibilities". Thus, we can use lucid-box compositions for reducibilities between procedures. Since we called the composition "lucid-box composition" we call the reducibility "lucid-box reducibility".

Transformation and Turing reducibilities are instances of

lucid-box reducibilities since they permit us to define reducibilities

among sets of solutions for problems. Unlike these reducibilities, lucid-box reducibilities permit us:

(1) To restrict sets of solutions.

In our examples we demonstrated how useful it might be to restrict

ourselves to comparison models of computation, or to force the copying

assumption or to deal with comparison networks only.

(2) To use the full information in the execution of procedures and not

only their output.

(3) To relate closely the resource complexity of the new procedure to

the resource complexity of the procedures which are used by the

lucid-box reducibility.

Our examples demonstrate the applicability of lucid-box compositions or reducibilities for design and specification of efficient algorithms and in cases where multi-parameter complexity optimization is required. As we already implied in the examples, multi-parameter optimization is typical to parallel and distributed computation environments, where there is need to optimize simultaneously: time, sizes of local memories,