A Aggarwal.

Parallel computational geometry online

. (page 1 of 6)
Online LibraryA AggarwalParallel computational geometry → online text (page 1 of 6)
Font size
QR-code for this ebook

Robotics Research
Ibchnical Report


Parallel Computational Geometry


A. Aggarwal B. Chazelle
L. Guibas C. O'Diinlaing C. Yap

Technical Report No. 307

Robotics Report No. 115

July, 1987


New York University
nt Institute of Mathematical Sciences

Computer Science Division

251 Mercer Street New York, N.Y 10012

Parallel Computational Geometry


A. Aggarwal B. Chazelle
L. Guibas C. (D'Dunlaing C. Yap

Technical Report No. 307

Robotics Report No. 115

July, 1987

New York University

Dept. of Computer Science

Courant Institute of Mathematical Sciences

251 Mercer Street

New York, New York 10012

The work of the last two authors has been supported by NSF grants #DCR-84-01898 and

Parallel Computational Geometry

A. Aggarwal^ . B. Chazelle^ . L. Gidbai-* . C. 6 'Dunlain^-^'' . C. Yap'-'

1. IBM Thomas J. Watson Research Center
Yorktown Heights, New York 10598

2. Department of Computer Science

Princeton University

Princeton, New Jersey 08544

3. Digital Equipment Corporation Systems Research Laboratories

130 Lytton Avenue

Palo Alto. California 94301

4. Computer Science Department

Stanford University

Stanford, California 94305

5. Courant Institute of Mathematical Sciences

New York University

251 Mercer Street

New York, New York 10012

6. School of Mathematics
Trinity College

Dublin 2
Irish Republic


We present efficient parallel algorithms for several basic problems in computa-
tional geometry: convex hulls, Voronoi diagrams, detecting line segment intersections,
triangulating simple polygons, minimizing a circumscribing triangle, and recursive data-
structures for 3-dimensional queries.

July 14, 1987

• The work of ihcse authors is supported by NSF grants #DCR-8'W1898 and #DCR-g4-0I633.

1. Introduction.

Computational geometry addresses algorithmic problems in diverse areas such as VLSI design,
robotics, and computer graphics. Since 1975 there has been a wide development of sequential algorithms
for geometric problems, but until 1985 there was little published about developing parallel algonthms for
such problems. A notable exception was the Ph.D. research of Anita Chow (1980) which seems to have
been the pioneering work in the field but unfortunately only a portion of it has appeared in the open litera-

This paper contributes some parallel algorithms for solving geometric problems:

(1) Convex hulls in two and three dimensions

(2) Voronoi diagrams and proximity problems

(3) Detecting segment intersections and triangulating a polygon

(4) Polygon optimization problems

(5) Creating data structures in two and three dimensions to answer some standard queries.

It is seldom obvious how to generate parallel algorithms in this area since popular techniques such as
contour tracing, plane sweeping, or gift wrapping involve an explicitly sequential (iterative) approach (see
[Preparaia and Shamos (1985)] for more details). In this paper we exploit data-structures that are simple
enough to compute efficiently in parallel while being powerful enough to answer queries efficiendy. How-
ever, the queries may be answered slightly less efficiently than would be possible if the data-structures
were constructed sequentially. 'Efficient' here means polylogarithmic in parallel time.

All the algoriUims presented here will be A^C-algorithms, i.e., algorithms executable in polylog depth
on polynomial-size circuits. We caution that the circuits here are slightly different from those used in
machine-based complexity theory in that each node of our circuit can compute an infinite precision arith-
metic operation. Furthermore these algorithms shall all be described as if they were implemented on a
CREW PRAM - a concurrent-read, exclusive-wriie parallel random access machine. Such a machine is
conceived as having a large number of processors with common access to a common memory. Any
number of processors can read the same memory cell simultaneously in (1) steps, and any processor can
write to a memory cell in (1) steps, but if two processors attempt to write to the same cell simultaneously
then the machine enters an undefined state.

We should mention here that Anita Chow's dissertation pays close attention to the model of compu-
tation involved, and in several cases she provides algorithms for implementation on two models of parallel
computation: the CREW PRAM model, and the cube-connected cycles (CCC) network. As mentioned
before, in the CREW model, all the processors share the same memory; however, in the CCC network,
each processor has its own memory, and is connected to at most four other processors. For details on the
CCC model, see [Chow (1980)j.

It follows from [Kozen and Yap (1985)] that most common computational geometry problems
including all the problems considered in this paper have, in principle, A^C-algorithms. Recall that they
combined the techniques of [Ben-Or, Kozen and Reif (1984)] and [Collins (1975)] to give parallel algo-
rithms for computing cell decompositions with adjacency information; for fixed dimensions, these algo-
rithms are in NC. The consequences of the cell decomposition algorithm for parallel computational

geometry are explored more carefully in [Yap (1987)]. Since any appeal to [Kozen and Yap (1985)]
involves a reduction to Tarski's language for real closed fields, and the methods are too general to be of
much practical use, this furnishes an existence proof for NC algorithms rather than furnishing practical NC

Many of the problems considered here are known to have Q(« log (n)) lower bounds in the algebraic
computation tree model [Ben-Or (1983)]. Therefore the goal in our research in such cases is to aim for
Q(log(n)) parallel time when 0{n) processors are available. Only in some instances do we present such
optimal algorithms, and the algorithms may require a preprocessing step which involves sorting. While
0{\og^{n)) sorting networks have been known for many years, to ensure optimality we may need to
invoke an optimal f)arallel sorting method ( log (n) parallel time). The AXS sorting network achieves these
bounds but with utterly unacceptable overhead (at the present state of knowledge: see [Ajtai, Koml6s and
Szemer6di (1983); Leighton (1984); Cole and 6'Dunlaing (1986)]). More recendy an optimal PRAM
algorithm has been described which has much lower overhead [Cole (1986)]. In our algorithms to compute
the VcHDnoi diagram for a planar point set and convex hull in three dimensions, we are able to avoid sort-
ing even though we only match the complexity bounds of Anita Chow.

Also we shall pay some attention to the processor allocation problem: a PRAM algorithm cannot be
considered completely described until one has a clear description of how the various tasks are to be allo-
cated among the available pool of processors. Such considerations usually become unimportant when the
time taken to complete a parallel step is logarithmic (or worse) in the number of processors involved, since
processor allocation is usually easy to accomplish in logarithmic parallel time.

Two standard methods of parallel algorithm design will be exploited freely throughout this paper
computing a running sum ('parallel prefix'), and list-ranking. To compute a running sum means to com-
pute the n initial sums of an array of n numbers. This is easily computed in logarithmic parallel time with
n processors, by imagining a balanced binary tree whose leaves are the array entries: at each internal node
the sum of the entries of all its leaf descendants can be computed (in an upward parallel sweep), and from
these internal sums all initial sums can be computed in a second phase. List-ranking involves traversing a
linked list and determining the rank (distaiKe from the end) of all its entries. This can be accomplished in
logarithmic parallel time by a px)inter-jumping technique; it is discussed briefly at the end of Section 3.

2. Definitions and terms.

As staled in the Introduction, the algorithms treated here will all belong to the class NC, but we shall
express them as PRAM algorithms running on a CREW machine. The notation

will indicate the class of algorithms running on a PRAM using /(n) processors and halting in 0(log*(/i))

steps. We are nainly interested in the class NC\{n), i.e. those using a linear number of processors. Note:
We use 'NC*' instead of 'NC to distinguish our model from the circuit model of machine-based complex-
ity. See [Yap (1987)].

We let E^ (respectively, E^) denote the Euclidean plane (respectively, Euclidean 3-space). In this
paper, we restrict our attentjon to £^ or £'. Given a (finite and nonempty) set S of points in E^ or £^, the
convex hull H{S) is the smallest convex set containing all points in 5. Sometimes by abuse of notation we

shall confuse the convex hull with the polygon (respectively, polyhedron) which constitutes its boundary.
We use the notation dH{S) to denote the polygonal boundary of H (S).

Given a finite nonempty set S of points in E^, its Voronoi diagram Vor(S) is defined as follows. For
each point p in S, define the Voronoi cell V{p) as the set of points z in £^ such that z is as close to p as to
any other point in S. Recall that the Voronoi cells are all nonempty closed convex sets with polygonal
boundaries and two cells can meet only along their common boundaries (i.e., the interior of every cell is
disjoint from all other cells). The Voronoi diagram Vor(S) is defined as the planar point-set formed from
the union of the boundaries of all the Voronoi cells; it is a planar graph with O (n) edges and vertices, all
the edges are straight line-segments (perhaps unbounded), and two cells can have at most one edge in com-

3. Optimal Parallel Convex Hull in the Plane

Fast parallel algorithms for the planar convex hull problem have been considered in the recent litera-
ture [Nalh et al. (1981); Chow (1981); Akl (1983); Chazelle (1984)]. The algorithms typically use a sim-
ple divide-and-conquer technique, recursively computing the convex hull of n points by solving two prob-
lems of half the size. It seems that any such technique requires Q.(\og^(n)) time. The main result in this
section is the following:

Theorem 1. The problem of computing the convex hull of a set of points in the plane is in NC^in).

We subsequendy learned that Atallah and Goodrich (1985), and Wagener (1985, 1987), have indepen-
dently used a similar technique to derive the same result.

From the well-known fact that sorting can be reduced to the planar convex hull problem, or using the
lower bounds for the algebraic computation tree model in [Ben-Or (1983)], it follows that the log(/i) time
is optimal for n processors. Actually the algorithm is optimal in a stronger sense: it foUows from the work
of Cook and Dwork (1982) that ( log (n)) time is the best possible even if we allow arbitrarily many pro-

Let 5 be the given set of n points. Use the pair of points on the convex hull // (S) with maximum and
minimum x-coordinate to partition H{S) into two parts in the natural way: the upper and the lower chains.
By symmetry, it is sufficient to show how to compute the upper chain in 0{\og{n)) time. First sort S
acccM"ding to the x-coordinate of the points, using (login)) time with n processcrs [Ajtai, Komlds and
Szemer^ (1983), Leighton (1984), Cole (1986)]. At this stage, covertical sets of points can be delected
and all but the highest from each set discarded in ( log (n)) steps. Our algorithm divides S into ^ sets of
^n points, recursively computes the upper convex chain of each set, and then merges all these chains
together in ( log (n)) time. The details of merging are non-trivial.

Let us see how to merge ^n upper chains in ( log (n)) steps with n processors. First consider two
upper chains C and C. Since we divide, S by x-coordinate, this ensures that the x-coordinales of any 2
chains do not overlap. Granted that their edges are presented in an array in sorted order, a single processor
can compute in 0(log(/i)) time the unique line which is tangent to both chains, together with the two

points of tangency. This was shown by Overmars and Van Leeuwen (1981). Since there are at most "

common tangents to be computed, n processors can compute all such tangents in time O(log(/i)). Note
that processor allocation raises no difficulty here since all the chains have a natural sequence (according to
their x-coordinates) and there is a natural sequence therefore on the set of all pairs of chains.

Let G denote the (combinatorial) graph whose vertices are the points in S and whose edges are the
edges in the upper chains, together with the line-segments defined by the tangent lines just discussed.
There are at most n-1 edges in all the upper chains, and they can be compressed into contiguous locations

in an array A by using a running-sum technique (in ( log (/i)) parallel time). The extra

^ 7t. Clearly, for each vertex v there can be at most one pair of edges (v,w) and (v.w) with this property,
and such vertices v can be detected in constant lime with one processor per edge in the array A. This cri-
terion identifies all edges and vertices on the upper chain, and perhaps some others, except far the first and
last edges on the upper chain: these are, of course, the edges with maximum and minimum slopes respec-
tively leaving the leftmost and rightmost vertices. (Since S is presorted by x-coordinate it is easy to iden-
tify these two vertices.)

For all these triples u,v,w which indicate possible vertices of the upper chain, let us call v a 'candi-
date' and w its 'successor.' If v is not a candidate its successor is undefined, and the leftmost vertex has as
its successor the next vertex on the upper chain. As indicated above this is easily identified. This succes-
sor relation defines a forest of trees, in which the upper chain is represented by a braiKh connecting the
leftmost vertex to the rightmost This branch can be identified in logarithmic parallel time, one processor
per vertex, using the parallel ranking technique described below.

To complete our inductive step, we now must collect the vertices of the upper chain of S into con-
secutive locations in an array. This is easily accomplished using parallel prefix, i.e., computing a running
sum on an array C(v) whose entry is 1 if v is in the upper chain and otherwise. This completes the merge
process of our algorithm.

Since the ranking problem is so basic and will be used again, it bears repeating here. Here we con-
sider the successor relation as defining a forest of trees directed from child to parent, and the rank of a node
in this forest is its distance from its root ancestor. In view of a later application, in addition to the rank
information, we compute an additional ( log (n)) pointers for each vertex:

(*) In log(n) parallel time, each node can leam its rank and also for each i < log(n), it also holds a

pointer to its 2'-th successor, if it has such a successor.
This is done in two phases, each phase taking log (n) parallel time. In the first phase, by the well-known
doubling technique, every vertex determines for all j < log(;i) whether it has a 2'-th successor, and if so,
which. In the second phase, at stage / (there are log(n) stages) all vertices with rank between 2' and
2'*'-l leam their exact rank. It is easy to implement all this on a CREW PRAM in logarithmic time.

The runtime T{n) of this convex hull algorithm, for n points with n processors, satisfies the relation

which implies T{n) is 0(log(n)).

4. Voronoi Diagram and Proximity Problems

One of the fundamental data structures of computational geometry is the Voronoi diagram [Shamos
(1977)]. The first optimal sequential algorithm to compute the Voronoi diagram was given by Shamos and
Hoey (1975) and it takes 0(n log n) lime. In this section, we show

Theorem 2. There is an NCjin) algorithm for the planar Voronoi diagram of a set of points. Further-
more, the algorithm only assumes an A'C 2(1) sorting algorithm.

Shamos (1977) also pointed out that the Voronoi diagram can be used as a powerful tool to give efficient
algorithms for a wide variety of other geometric proximity problems. From the Vc^onoi diagram we can
easily obtain the closest pair of sites (points), or the closest neighbor of each site, or the Euclidean
minimum spanning tree of the n sites, or the largest point-free circle with center inside their convex hull,
etcetera. Efficient reductions for these problems are possible in our parallel computation model, so our
polylog result for the Voronoi also implies polylog algorithms for all these problems, using a linear number
of processors. Note that there is a well-known reduction of the Voronoi diagram of a set of points in the
plane to the convex hull of a set of points in 3-space [Brown (1979); Guibas and Stolfi (1983)]. Also, in
her thesis, Anita Chow (1980) shows that the three-dimensional convex hull can be computed in NC'2(.n),
and exploits this to give an 0(log^(n)) algorithm to compute the diagram. However, her methods fre-
quently involve optimal time parallel sorting', which the following direct method avoids. In our FOCS
extended abstract we gave an NCjin) algorithm which avoids optimal time parallel sort. In the present
paper, we improve this to/^C2(n) by exploiting a simple observation suggested by Goodrich.

The present algorithm for computing the Voronoi diagram is modelled after the sequential method
first presented by Shamos and Hoey [1975]. This is a divide-and-conquer method in which two Voronoi
diagrams for the respective sets P and Q of points (where these sets are vertically separated, i.e., there is a
vertical line L which has all of f to its left and all of Q to its right) are merged in Unear lime to a diagram
for P uQ. The merging is defined by a polygonal path (the 'contour' or 'seam') separating P from Q; the
process is akin to tracing a line and seems inherently sequential. The contour separates the plane into the
set of points closer to P than to Q (i.e., the set of points x whose closest point or points in 5 are all in /") and

' The mmime bound given in [Chow (1980)] diffen from the bound quoted here by an additional factor of log log (n).
These discrepancies ariM wherever sorting is involved. The 0{log(rt)log log(/t)) parallel sorting algorithm in [Valiant
(1975)1 has since been superseded by theoretically optimal algorithms.

those closer to Q than to P.

The following illustrates the contour between two hypothetical sets P and Q:


Figure 1 . Merging by contour-tracing
Our goal is to parallelize this contour tracing. Let us introduce some suitable terminology. The con-
vex hull H (S) has already been defined. The normals of S are defined as the half-lines extending from the
comers of H{S) and perpendicular to the sides incident to those comers: two normals extend fit>m each
cwner. These normals partition the complement of //(S), E^~H{S), into sectors, which are alternately
slices (V-shaped) and snips (bounded by a side oi H(S) and the two incident normals). The following pic-
ture clarifies this terminology:


Figure 2. Normals and sectors (slices and strips)

The points of intersection of Vor{S) with the convex polygon dH (5) we call the cuipoints (figure 3). Let U
be the strip defined by an edge e of the convex hull. The set t/ n Vor{S) we call the e-promoniory. We


will assume thai S is in general position (no three points collinear, no four concyclic). Together with the
convexity of the Voronoi cells, this implies that the promontory has the natural structure of a binary tree
with leaves at the cutpoints and such that the parent of a node is always more distant from e. Beyond the
root of this binary tree extends an infinite ray that bisects the strip.

Figure 3. Promontory: cutpoints are indicated by (•)
Point location inside strips is easily done in (log^(/:)) serial time. We now modify a suggestion of

Goodrich to show that O ( log (n)) serial time suffices.

Lemma 3. Suppose a strip has an e-promontory with k vertices and edges. There is ak processor
0(log()t)) time parallel algorithm for constructing a data-structure for the plane partition defined by the
e-promontory so that in ( log {k)) time a single processor can answer a point location query.

Proof. The e-promontory is a complete binary tree laid out in the plane. We may assume that each leaf is
on the j-axis, and for every node v all its descendants have smaller >'-coordinate. For any node v of the
promontory, let Ty denote the subtree of the c-promontory rooted at v. Each internal node v of the e-
promonlory tree may be associated with a triangular region R^ with v as one comer of the triangle and the
two other comers the extreme cutpoints below the subtree at v. By convexity of Voronoi cells, the entire
subtree T, is contained in /?,. Suppose p is a query point inside the strip. Let r be the root of the e-
promontory. The interesting case is when p lies inside R,, and this can be checked in constant time. We
can now do a binary search inside R, by locating a vertex v such that T, contains between 1/3 and 2/3 of
the vertices of T,.. If p is inside 7,, we can recursively do point location for p inside this subtree. Other-
wise, let we next search inside a modified version of T/. the modified tree T^ consists of replacing T, by
three vertices corresponding to the comers of the triangle /?,, together with the two edges from v to the
other two comers. This recursive search can be carried out in ( log {k)) serial time.

It remains to construct a search structure D, for the tree T,. The search stmcture D^ consists of a
binary tree whose internal nodes are associated with triangular regions of the form /?, which we use for
comparisons: if we are at some internal node z of D„ we next go to the right (resp. left) child of z if query
point p lies inside (resp. outside) the triangular region of z. To construct D^ associated with T^, we do an
important preprocessing step where for each node v we determine its preorder rank rank(v) in the tree and
from this, we easily deduce maxrank(v), the maximum rank of its descendants. This can be done using k
processors and 0(log(Jk)) time [Taijan and Vishkin (1985)].^ We will deduce two important pieces of

^ Their CRCW parallel RAM algorithm can be adapted for our CREW version.

information from these computed values. First, the number desc (v) of descendants of a node v is given by
\+maxrank{vy-rank{v). Second, we can decide in constant time if any given vertex u is a descendant of
another v. More precisely, u is a descendant of v iff

rank (v) < rank (u) < maxrank (v).
Inductively, suppose we have to construct a search structure D^ for the subtree Tr rooted at some node r.
We assume that desc (v) is precompuled for each v in T,. In particular, let k = desc{r) be the number of
nodes in T,. Assigning one processor to each vertex of T,. in (1) steps, we can locate the deepest node of
T, with at least 2jt/3 descendants, and choose the node vq to be that one of its children with the larger
number of descendants (breaking ties arbitrarily). It is easy to see that vq will have between i/3 and 2fc/3
descendants. The root of D, will be associated with the region R^. The right subtree of (the root oO Dr
will be recursively constructed from 7,^: note that the precomputed information desc (v) icx each node of
r,g is available with no further work. Let T,. be the modified tree obtained by pruning r,g as described
above. We want to recursively construct the left subtree of D, from T^. But first we must revise the
precomputed information desc{v) for each node for T^. To do this, we only need to update the value of

1 3 4 5 6

Online LibraryA AggarwalParallel computational geometry → online text (page 1 of 6)