
Essay: Algorithms


The Efficiency Of Algorithms.

Some mathematical problems can be solved only by methods too slow for even

the fastest computers. More efficient methods have not been found, but

neither has it been proved that there are no better methods.

By Harry R. Lewis and Christos H. Papadimitriou

Suppose you were asked to plan an itinerary for a traveling salesman who must

visit a number of cities. You are given a map on which the distances between

the cities are marked and you are asked to find the shortest route that

passes through all the cities and returns to the starting point. An approach

to this problem that is certain to give the correct answer is to trace all

the possible routes, measure their length and pick the shortest one. If the

tour included more than a few cities, however, hundreds or thousands of

routes would have to be checked. If there were 100 cities, then even the

fastest computers would require weeks of calculation to find the shortest

path.
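The exhaustive-search procedure described above can be sketched in a few lines of Python. The four-city distance table is our own illustration, not from the article; note that the search examines (n-1)! routes for n cities, which is what makes it hopeless for 100 cities.

```python
# Exhaustive search for the traveling salesman's tour: try every ordering
# of the cities and keep the cheapest round trip.
from itertools import permutations

def shortest_tour(dist):
    """Return (length, tour) of the shortest round trip visiting every city."""
    n = len(dist)
    best_len, best_tour = float("inf"), None
    for perm in permutations(range(1, n)):      # fix city 0 as the start
        tour = (0,) + perm + (0,)
        length = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
        if length < best_len:
            best_len, best_tour = length, tour
    return best_len, best_tour

# Hypothetical symmetric distances between four cities.
D = [[0,  2,  9, 10],
     [2,  0,  6,  4],
     [9,  6,  0,  3],
     [10, 4,  3,  0]]
```

For this table the search finds a shortest tour of length 18; with each added city the number of orderings to check is multiplied by another factor.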

In the search for a quicker solution you might try some less rigorous

methods. One idea that seems reasonable is always to visit nearby cities

before going on to those farther away. You would soon discover, however, that

this procedure does not invariably yield the correct answer. Other shortcuts

also fail. In fact, the best methods known for solving the problem are not

much better than the obvious but laborious procedure of checking all the

possible itineraries. Mathematicians now suspect that this problem and many

others like it may forever remain beyond our ability to solve in any

efficient way. That speculation itself, however, is unconfirmed; although no

faster methods of solution have been found, neither has it been proved that

faster methods do not exist.
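The failure of the "visit the nearest city first" shortcut is easy to exhibit. The distance table below is a contrived example of our own: the greedy rule produces a tour of length 8, while the best tour has length 5.

```python
# Nearest-neighbor heuristic vs. exhaustive search, on a distance table
# deliberately chosen so the heuristic goes wrong.
from itertools import permutations

D = [[0,   1.0, 1.5, 5.0],
     [1.0, 0,   1.0, 1.5],
     [1.5, 1.0, 0,   1.0],
     [5.0, 1.5, 1.0, 0]]

def nearest_neighbor_tour(dist, start=0):
    """Always go to the closest unvisited city, then return to the start."""
    tour, unvisited = [start], set(range(len(dist))) - {start}
    while unvisited:
        here = tour[-1]
        nxt = min(unvisited, key=lambda c: dist[here][c])
        tour.append(nxt)
        unvisited.remove(nxt)
    tour.append(start)
    return sum(dist[a][b] for a, b in zip(tour, tour[1:]))

def optimal_tour_length(dist):
    """The laborious but correct answer, by checking every route."""
    n = len(dist)
    return min(sum(dist[t[i]][t[i + 1]] for i in range(n))
               for p in permutations(range(1, n))
               for t in [(0,) + p + (0,)])
```

Here the heuristic greedily takes three cheap hops and is then forced onto the expensive 5.0 edge back to the start.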

In the problem of the traveling salesman's tour it is not the solution for a

particular set of cities that is of the greatest importance but a general

method for finding the solution for any cities. Such a method is called an

algorithm: it is a precisely stated procedure or set of instructions that can

be applied in the same way to all instances of a problem. If the problem is

to be solved with the aid of a computer, an algorithm is indispensable,

because only those procedures that can be stated in the explicit and

unambiguous form of an algorithm can be presented to a computer. Instructions

that are vague or that rely on intuition are unacceptable.

An example of an algorithm is the procedure taught in the schools for the

subtraction of whole numbers. If each of the steps in this procedure is

applied correctly one at a time, the algorithm will always yield the correct

result. What is more, once the algorithm has been learned or stored in the

memory of a computer or embodied in the circuitry of an electronic

calculator, it can be applied to an infinite set of subtraction problems.

With this one algorithm the difference between any two whole numbers can be

determined.
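The schoolbook borrowing procedure can be stated precisely enough for a machine. The sketch below is one rendering of it (assuming, as the classroom version does, that the first number is at least as large as the second and both are nonnegative):

```python
# Schoolbook subtraction, digit by digit with borrowing (assumes a >= b >= 0).
def subtract(a, b):
    xs = [int(d) for d in str(a)][::-1]     # digits, least significant first
    ys = [int(d) for d in str(b)][::-1]
    ys += [0] * (len(xs) - len(ys))         # pad the shorter number with zeros
    result, borrow = [], 0
    for x, y in zip(xs, ys):
        d = x - y - borrow
        if d < 0:                           # borrow 10 from the next column
            d += 10
            borrow = 1
        else:
            borrow = 0
        result.append(d)
    return int("".join(map(str, result[::-1])))
```

The same fixed list of steps handles every pair of whole numbers, which is exactly the point of an algorithm.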

In principle any problem for which an algorithm can be devised can be solved

mechanically. It may therefore seem surprising that there are problems for

which algorithms exist but for which we so far have no practical general

solution. The algorithms for solving these problems always give a correct

answer, but they often require an inordinate amount of time. The problem of

the traveling salesman's tour is among these intractable tasks.

The efficiency of computer algorithms is a topic of obvious practical

importance. It is also of interest in more formal areas of mathematics. There

are some problems in mathematics and logic for which no algorithm can ever be

written, and there are many others for which efficient, fast algorithms are

already known. Between these two groups is a third class of problems that can

always be solved in principle, but for which there are only inefficient (and

therefore largely unusable) algorithms. For some of these difficult problems

mathematicians have been able to demonstrate that efficient algorithms can

never be designed. For many of the most important problems, however, there is

only the suspicion that good algorithms are impossible.

A given problem can have more than one algorithm for its solution. For

example, children in Europe learn a procedure for subtraction slightly

different from the one taught in the U.S. Both of the subtraction algorithms,

however, give the same result in the same amount of time. That is not

invariably the case with different algorithms for solving the same problem.

One celebrated problem that can be solved by either a "fast" algorithm or a

"slow" one is the problem of the Koenigsberg bridges.

In the 18th-century German city of Koenigsberg (which is now the Russian city

of Kaliningrad) a park was built on the banks of the river Pregel and on two

islands in the river. Within the park seven bridges connected the islands and

the riverbanks. A popular puzzle of the time asked if it was possible to walk

through the park by a route that crossed each of the bridges once and only

once.

For the solution of the problem the size and shape of the islands and the

length of the bridges are immaterial: the only essential information is the

pattern of interconnections. This information can be presented compactly in

the mathematical structure known as a graph, which is merely a set of points

with lines drawn to join them. In the case of the Koenigsberg park each of

the riverbanks, and each of the islands is condensed to a single point, and

each of the bridges is represented by a line between two points. Thus the

graph consists of four points and seven lines. If the lines are labeled, any

path through the park can be specified by a simple listing of labels.

The obvious approach to the problem is to list all the paths that cross all

the bridges and to eliminate from consideration those that cross any bridge

more than once. This is the technique of exhaustive search, similar to the

one employed in the problem of the traveling salesman. When the mathematician

Leonhard Euler was presented with the problem of the Koenigsberg bridges, he

recognized the limitations of the technique and found another method. In

recognition of his contribution a path that traverses each line of a graph

exactly once is now called an Eulerian path.

Euler wrote: "The particular problem of the seven bridges of Koenigsberg

could be solved by carefully tabulating all possible paths, thereby

ascertaining by inspection which of them, if any, met the requirement. This

method of solution, however, is too tedious and too difficult because of the

large number of possible combinations, and in other problems where many more

bridges are involved it could not be used at all."

Euler's alternative method is much simpler. He showed that a tour of the kind

sought must exist if the graph meets two conditions. First, it must be

possible to go from any point in the graph to any other point by following

the lines of the graph: in other words, the graph may not be disconnected.

Second, every point of the graph, with two possible exceptions, must be at

the junction of an even number of lines.

It is not hard to understand why a graph cannot have an Eulerian path unless

it meets these conditions. All regions of the graph must be connected to one

another if there is to be any path that traverses all the lines. Each point

must have an even number of lines because half of them are required to reach

the point and the other half to leave it. Two points with an odd number of

lines can be allowed if they are chosen as the starting and finishing points

of the path. Demonstrating that any graph that meets these conditions

actually has an Eulerian path requires a somewhat more complicated argument,

which we shall not present here, but Euler was able to give a rigorous proof.

It is an easy matter to express Euler's solution to this problem in an

algorithm that could be executed by a computer. The first requirement,

connectivity, can be established by marking some point of the graph, then

similarly marking all points connected to it by lines, then all the points

connected to the newly marked points, and so on. The graph is connected if at

the end all points have been marked. The second requirement is just as easily

tested: the machine is instructed to examine each point of the graph and

count the number of lines that terminate at that point. If no more than two

of the points have an odd number of lines, the graph has an Eulerian path.

The park at Koenigsberg met the first condition but not the second, and so

there was no Eulerian tour of the seven bridges.
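Euler's two tests, exactly as described above, translate directly into code. The sketch below represents the bridges as a list of pairs, so that two bridges between the same banks are allowed; the edge list for Koenigsberg is our reconstruction of the standard four-point, seven-line graph.

```python
# Euler's algorithm: (1) check connectivity by the marking procedure,
# (2) count the points that touch an odd number of lines.
from collections import defaultdict

def has_eulerian_path(points, edges):
    degree = defaultdict(int)
    neighbors = defaultdict(set)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        neighbors[u].add(v)
        neighbors[v].add(u)
    # (1) Mark one point, then everything reachable from marked points.
    marked, frontier = set(), [next(iter(points))]
    while frontier:
        p = frontier.pop()
        if p not in marked:
            marked.add(p)
            frontier.extend(neighbors[p])
    if marked != set(points):
        return False                        # the graph is disconnected
    # (2) At most two points may have odd degree.
    odd = sum(1 for p in points if degree[p] % 2 == 1)
    return odd <= 2

# The seven bridges: two banks (A, B) and two islands (C, D).
koenigsberg = [("A", "C"), ("A", "C"), ("B", "C"), ("B", "C"),
               ("A", "D"), ("B", "D"), ("C", "D")]
```

Each point and line is examined only a fixed number of times, so the work grows only in proportion to the size of the graph.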

Euler's method is unquestionably the more economical approach to the problem

of Koenigsberg bridges: it requires that each point and line of the graph be

listed just once, whereas the exhaustive search is not completed until every

path that crosses all the bridges has been listed. The number of such paths

is much larger than the number of points and lines in the graph. In that

sense Euler's method is the better algorithm, but how much better? How can

the difference be measured and how can one tell if the difference is

significant?

For a graph with only four points and seven lines both techniques are fast

enough to be considered practical. Suppose, however, that more islands and

bridges were added to the park, or in other words that more points and lines

were added to the graph. If the problem is being solved by Euler's method, each

new point merely adds one item to the list of points that must be checked. If

the paths are to be examined by exhaustive search, on the other hand, then

with each new point and line the size of the list is multiplied by some

factor. A moderate increase in the size of the graph results in an explosive

increase in the number of paths. Ultimately the list of paths must become

prohibitively long.

In this comparison of the two solutions to Euler's problem there is the basis

for a completely general method of evaluating the speed or practicality or

efficiency of any algorithm. We imagine that the algorithm is supplied with

larger and larger inputs, and we note the rate at which the execution time of

the algorithm increases. In this way it is possible to make unequivocal

judgments of algorithms. Exhaustive search not only is a slower method; in

general it is too slow to be of any value. Euler's method remains practical

for problems of essentially unlimited size.

As the size of the graphs being examined increases, the lists produced by the

method of exhaustive search grow exponentially. Each time some fixed number of

points and lines are added to the graph the size of the list doubles. Growth of

this kind can be described by a mathematical function such as 2^n,

where n is some measure of the size of the graph. Many other functions

have similar or even higher rates of growth. Among them are n^n and

n! (which is read as "n factorial" and signifies n multiplied by

all the integers between 1 and n). For the purposes of this

discussion all these functions can be regarded as having the same property of

exponential growth.

Mathematical functions of another kind are known as polynomials. The simplest

members of this class are linear functions, such as 3n, which

designate a simple relation of proportionality. The time needed to solve the

problem of Koenigsberg bridges by Euler's method increases as a linear function

of the size of the graph. Other polynomials are n^2, n^3 and so

on, and the sums of such functions. What distinguishes polynomials from

exponential functions is that n never appears in an exponent.

For sufficiently large values of n any exponential function will

overtake and exceed any polynomial function. It was the certainty of this

result that dismayed Thomas Malthus when he compared the exponential rate of

population growth with the polynomial rate of increase in the food supply. For

small values of n a given polynomial function may well exceed a given

exponential one, but there is always a value of n beyond which the

exponential function is the greater. The exact form of the polynomial makes

little difference, except in changing the point at which the polynomial

function is overtaken.
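A small table makes the overtaking concrete. Comparing the polynomial n^3 with the exponential 2^n (our own choice of functions for illustration): the polynomial is larger at first, but from n = 10 onward the exponential never falls behind again.

```python
# The polynomial n**3 vs. the exponential 2**n for a few values of n.
table = [(n, n**3, 2**n) for n in (1, 5, 9, 10, 15, 20)]
# e.g. at n = 9:  729 > 512, the polynomial still leads;
# at n = 10: 1000 < 1024, the exponential has overtaken it.

# The smallest n after which 2**n stays permanently ahead of n**3.
crossover = min(n for n in range(1, 100)
                if all(2**m > m**3 for m in range(n, 100)))
```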

It is now generally agreed by computer scientists that algorithms whose

execution time increases exponentially as a function of the size of the input

are not of practical value. We shall call algorithms of this kind

"exponential time" algorithms, or simply inefficient algorithms. The only

algorithms that are considered fast or efficient enough for general

application are "polynomial time" algorithms.

Of course, even among efficient algorithms some are faster than others, but

for the purposes of this discussion it is important only to distinguish

polynomial-time algorithms as a class from exponential-time algorithms.

Moreover, this system of classification has the advantage that it makes the

speed of an algorithm a property of the algorithm itself and independent of

the machine on which it is executed. For sufficiently large problems a

polynomial-time algorithm executed on even the slowest machine will find an

answer sooner than an exponential-time algorithm on the fastest computer.

The mathematical properties of algorithms were studied in the 1930's by the

British mathematician A. M. Turing, the inventor of the imaginary computer,

now called a Turing machine. The Turing machine was conceived to be an

automaton equipped with an infinite supply of paper tape marked off in square

regions. The machine was capable of just four actions: it could move the tape

one square, it could place a mark in a square, it could erase a mark already

present and at the end of a calculation it could halt. These operations were

to be performed according to a sequence of instructions built into the

internal mechanism. Of course, Turing never built such a machine; it was

merely a conceptual device for automatically solving problems in mathematics

and logic. Indeed, Turing was interested not in actually solving problems

with the machine but rather in investigating what kinds of problems it could

solve and what kinds it could not.

Turing discovered that even a machine as simple as this one could solve any

problem for which an algorithm could be devised. The computation might be

laborious and indirect, but given enough time and paper tape the machine

would eventually find the solution and halt. Reduced to its essentials the

Turing machine is a language for stating algorithms, as powerful in principle

as the more sophisticated languages now employed for communicating with

computers.
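A machine of this kind is simple enough to simulate in a few lines. The sketch below is our own toy rendering: a tape of squares blank in both directions, a head that can mark, move, or halt, and a built-in instruction table (here, a table that marks three squares and stops).

```python
# A minimal Turing-machine simulator. An instruction table maps
# (state, symbol read) to (symbol to write, head move, next state).
from collections import defaultdict

def run(program, steps=10_000):
    """Run until the machine enters the state 'halt'; return the marked tape."""
    tape = defaultdict(lambda: " ")          # infinite tape of blank squares
    head, state = 0, "start"
    for _ in range(steps):
        if state == "halt":
            break
        write, move, state = program[(state, tape[head])]
        tape[head] = write
        head += {"R": 1, "L": -1, "S": 0}[move]
    return "".join(tape[i] for i in sorted(tape)).strip()

# A toy program: place three marks, then halt.
mark_three = {
    ("start", " "): ("*", "R", "s1"),
    ("s1", " "):    ("*", "R", "s2"),
    ("s2", " "):    ("*", "S", "halt"),
}
```

Reduced to its essentials, the instruction table is the algorithm; the machine merely follows it square by square.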

In addition to conceiving these machines, Turing demonstrated their

limitations. In 1936 he showed that there are problems that cannot be solved

by Turing machines, and it follows that these problems cannot be solved by

any automatic computer. They are problems for which algorithms cannot be

written, even in principle. The example first studied by Turing is the

problem of predicting whether a particular Turing machine, once it is set in

motion, will ever finish its calculation and halt. Through an analysis of

this problem he was able to show that there can be no general procedure for

telling whether mathematical propositions are true or false. Since then a

variety of other problems with the same properties have been proposed.

One result of Turing's work was to divide all imaginable problems in

mathematics into two classes. Those problems for which algorithms can never

be written are in a formal sense permanently unsolvable. Some instances of

these problems may be solved by rare perception or by luck, but a general

method for their solution will never be found.

All other problems in mathematics and logic can be solved by algorithms. As

we have seen, however, some algorithms are more useful than others. The class

of solvable problems can therefore be divided into two subgroups: those for

which there are efficient, polynomial-time algorithms and those for which

there are only exponential-time algorithms. Euler's problem is a member of

the class with polynomial-time solutions, since Euler's method is itself a

polynomial-time algorithm. Problems that can be proved to have only

exponential-time algorithms are also known, although they are rather obscure.

Although these two groups of problems are quite distinct, it is not always a

straightforward task to assign a problem to one group or the other. Indeed, a

very interesting class of problems seems to fall somewhere between them. For

these problems no efficient algorithms are known and the best available

solutions require exponentially increasing time, yet no one has been able to

prove that the problems do not have polynomial-time solutions.

One such problem was considered in the 19th century by the Irish

mathematician William Rowan Hamilton. Superficially Hamilton's problem is much

like Euler's. The problem is to decide whether a given graph has a path that

takes in each point exactly once (whereas Euler looked for a path that

traversed each line once). Actually the tasks are quite different, and Euler's

method cannot be applied to Hamilton's problem. The graph derived from the plan

of the park at Koenigsberg has a Hamiltonian path, although, as we have seen,

it has no Eulerian path. On the other hand, removing two lines results in a

graph that has an Eulerian path but not a Hamiltonian one. Many other graphs

have neither kind of path.

Hamilton's problem can be solved by exhaustive search: indeed, the procedure

is not substantially different from that employed in listing all the possible

paths that might have the Eulerian property. For Hamilton's problem, however,

no efficient algorithm comparable to Euler's method has been found. The

problem has been pondered by many of the best mathematicians of the past

century, but the most efficient methods available today are fundamentally no

better than exhaustive tabulation. On the other hand, all attempts to prove

that there is no better method have also failed, and it must be considered a

possibility that an efficient algorithm will be discovered tomorrow.
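Exhaustive search for Hamilton's problem looks much like the salesman's: try every ordering of the points. The edge lists below are illustrations, the first being the Koenigsberg graph reduced to simple connections.

```python
# Exhaustive search for a Hamiltonian path: n! orderings of n points.
from itertools import permutations

def has_hamiltonian_path(points, edges):
    adjacent = {frozenset(e) for e in edges}
    return any(all(frozenset((a, b)) in adjacent for a, b in zip(p, p[1:]))
               for p in permutations(points))

# The Koenigsberg graph (bridges reduced to single connections): it has a
# Hamiltonian path, e.g. A, C, B, D.
koenigsberg_graph = [("A", "C"), ("B", "C"), ("A", "D"), ("B", "D"), ("C", "D")]

# A star: one center joined to three leaves has no Hamiltonian path,
# since any path must pass through the center more than once.
star = [("X", "a"), ("X", "b"), ("X", "c")]
```

No fundamentally faster method is known, and none has been proved impossible.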

Problems that are known to have polynomial-time solutions, such as Euler's

problem, are said to be members of the class P (for polynomial). Hamilton's

problem is a member of another class, designated by the letters NP,

signifying "nondeterministic polynomial." The class NP encompasses all the

problems in P, or in other words P is a subset of NP. In addition NP includes

problems whose status is less certain. They are all solvable problems in

principle: they have algorithms, but for now only exponential-time algorithms

are known. They may also have polynomial-time algorithms (in which case NP

and P are identical) or they may prove to be permanently intractable, with

only inefficient solutions.

The problems considered here, and all problems classified in this way, can be

described as an infinite set of similar questions each of which can be answered

yes or no. For problems that are formally unsolvable, such as the problem of

predicting whether a Turing machine will halt, these questions simply cannot

be answered by any algorithmic procedure. For problems of the class P the

questions can invariably be answered, whether the answer turns out to be yes

or no, by an efficient procedure. In order for a problem to qualify for the

class NP there need not be an efficient means of answering the yes-or-no

questions. What is required is that whenever the answer is yes there be a

short and convincing argument proving it.

Hamilton's problem, for example, meets this condition. It is not possible to

tell by any efficient means known today whether a graph has a Hamiltonian

path, but if it does, then the path itself can be exhibited. Hence for every

Hamiltonian graph it is possible to issue a "certificate" that proves its

membership in this special class of graphs. Such a certificate would name the

lines in the graph in the order the path traverses them. Finding the path

might require weeks of tabulation, but once it has been found it can easily

be exhibited. Another problem that belongs to the class NP is the question of

whether a whole number is composite, that is, whether it can be written as

the product of two other numbers. Again there is no known efficient procedure

for answering the question, but if the number is indeed composite there is a

succinct proof of that fact, namely a correctly worked-out multiplication

with the number on the bottom line.
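The defining asymmetry is that certificates are cheap to check even when they were hard to find. Two sketches, using our own function names: verifying a claimed Hamiltonian path, and verifying a claimed factorization.

```python
# Checking certificates takes polynomial time, however long the search took.

def check_hamiltonian_certificate(points, edges, path):
    """Does the claimed path visit every point once, along actual lines?"""
    adjacent = {frozenset(e) for e in edges}
    return (sorted(path) == sorted(points) and
            all(frozenset((a, b)) in adjacent for a, b in zip(path, path[1:])))

def check_composite_certificate(n, p, q):
    """A worked-out multiplication with n on the bottom line."""
    return p > 1 and q > 1 and p * q == n

# The Koenigsberg graph with bridges reduced to single connections.
edges = [("A", "C"), ("B", "C"), ("A", "D"), ("B", "D"), ("C", "D")]
```

Each check runs through the certificate once, in contrast to the exponential search that may have produced it.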

Care must be taken when asking the yes-or-no question of a problem in the

class NP, since the complementary no-or-yes problem might not be in the same

class. For example, the complement of Hamilton's problem, in which one is

asked to show that a graph does not have a path passing once through each

point, may well not be in the class NP. For now the only way to demonstrate

the absence of such a path is to list all possible paths, and such a proof is

too lengthy to qualify as a certificate of membership in NP. On the other

hand, the complement of the composite-number problem, which asks if a number

is prime, turns out to be in the class NP. The reason, which is far from

obvious, is that relatively short proofs demonstrating that a number has no

factors other than 1 and itself were discovered in 1975 by Vaughan Pratt of

the Massachusetts Institute of Technology. Still, it is not known whether the

composite-number problem and its complement are in the class P.

It is easy to show that every problem in the class P is also in the class NP.

If a problem is in P, then by definition there is an efficient algorithm for

it. To produce a short and convincing proof that the answer to some instance

of the problem is yes, all we need to do is follow the algorithm: a record of

its operation constitutes the required certificate.

Another way of defining NP is as the class of yes-or-no problems that can be

solved by guessing certificates. If one is given an instance of a problem in

NP for which the answer happens to be yes, then with luck one may discover

the required certificate fairly quickly by making a sequence of guesses; if

the answer is no, guessing cannot possibly yield an answer any faster than an

exhaustive search could. For example, in solving Hamilton's problem one might

find a correct path (if there is one) on the first try by tracing small

portions of the path and guessing at each stage how to proceed. Such a

procedure, it should be emphasized, is not an algorithm. It could be made

into an algorithm only by crossing off each trial path as it is tested and

checking all possible paths, but that is equivalent to the method of

exhaustive search.

A mathematical procedure defined in terms of lucky guesses may seem bizarre,

but it is a quite legitimate approach to defining the problems in the class

NP. In principle the procedure could even be mechanized by building a device

called a nondeterministic Turing machine. This device can do all that an

ordinary Turing machine can do; in addition, at some points in its operation

it may have more than one choice of what to do next. Such a machine would be

considered to answer yes to a question if there were some sequence of choices

that could lead it to a yes conclusion. NP, the class of nondeterministic

polynomial-time problems, consists of precisely those problems whose yes

instances can be identified by machines making comparatively short guessing

computations.

The inclusion of guessing in the definition of these problems suggests strongly

to many mathematicians that P and NP are not the same set and

hence that efficient algorithms can never be found for the intractable problems

in the class NP. If every problem in NP were actually in P,

then all the guesswork and luck could be replaced by some systematic procedure

without great sacrifice in time. It is hard to believe the ability to guess and

be lucky could win so little.

The class NP includes a variety of commonly encountered problems that

seem to defy efficient solution. We have already mentioned Hamilton's problem

and the problem of composite numbers. Another example is known as the matching

problem. It can be considered in terms of the task faced by the colleges every

September, when a new class of freshmen must be assigned to shared dormitory

rooms.

For the sake of simplicity let us assume that all the information gathered about

the students' smoking habits, bed time hours, taste in music and so forth

results in a single yes-or-no decision as to the compatibility of each possible

pair of students. The entire class can then be represented as a graph in which

the points correspond to students and a line is drawn connecting every two

students who can be placed in the same room. If each room holds just two

students, the assignment can be made efficiently by a clever polynomial-time

algorithm discovered by Jack Edmonds of the University of Waterloo. If each

room is to be shared by three students, however, there is no known efficient

algorithm. The problem is in the class NP, since all yes instances have

succinct certificates: an acceptable room assignment, once it is discovered,

can easily be exhibited. Of course, a solution could be found by exhaustive

search, albeit inefficiently. With luck a suitable assignment, if there is one,

can be guessed quickly.
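For rooms of two the problem is in P, by Edmonds's matching algorithm; the brute-force pairing check below is only a sketch for tiny classes, written to make the question itself precise.

```python
# Can every student be paired with a compatible roommate? Exhaustive search:
# pair the first student with each candidate and recurse on the rest.
def can_pair_everyone(students, compatible):
    ok = {frozenset(e) for e in compatible}
    students = list(students)
    if not students:
        return True
    first = students[0]
    return any(frozenset((first, mate)) in ok and
               can_pair_everyone([s for s in students[1:] if s != mate],
                                 compatible)
               for mate in students[1:])
```

A successful assignment, once found, is itself the succinct certificate; for rooms of three no polynomial-time search is known.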

Map coloring is a problem in the class NP that concerns mathematicians

more than it does cartographers. The question is whether the countries on a

given map can be colored with a given number of colors so that no two countries

that share a border have the same color. It is easy to find out if a map can be

colored with two colors: it can be if there are no places on the map where an

odd number of countries meet at one point. It is even easier to tell if a map

can be colored with four colors; indeed, there is no need even to look at the

map, since Kenneth Appel and Wolfgang Haken of the University of Illinois

proved in 1976 that four colors suffice for any map. Surprisingly, however,

no efficient algorithm is known for determining whether three colors are enough

for a given map. The problem is in the class NP, since a correctly

colored map can serve to certify a yes answer.

Map coloring can be regarded as a special case of another problem called graph

coloring. Any map can be converted into a graph by reducing each country to a

point and drawing a line between two points if the corresponding countries

share a border. Coloring the graph is then equivalent to coloring the map,

subject to the rule that two points connected by a line cannot have the same

color. Graph coloring, however, is a more general problem, with applications

outside graph theory. For example, a graph can represent the scheduling of work

in a factory. Each point of the graph stands for some job to be done, and two

points are connected by a line if the jobs cannot be done concurrently,

perhaps because they require the same piece of machinery. A coloring of the

graph with three colors would then supply a schedule dividing the work of the

factory into three shifts. Like map coloring, the graph-coloring problem is in

the class NP.
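The only general method known for three colors is, once again, exhaustive search over all 3^n assignments, which exhibits exactly the exponential growth discussed earlier. A sketch, with two small example graphs of our own:

```python
# Brute-force three-coloring: try all 3**n assignments of colors to points
# and accept any in which no line joins two points of the same color.
from itertools import product

def three_colorable(points, edges):
    points = list(points)
    return any(all(coloring[u] != coloring[v] for u, v in edges)
               for colors in product(range(3), repeat=len(points))
               for coloring in [dict(zip(points, colors))])

# A triangle can be colored with three colors; joining a fourth point to
# all three corners produces a graph that needs four.
triangle = [("A", "B"), ("B", "C"), ("A", "C")]
k4 = triangle + [("A", "D"), ("B", "D"), ("C", "D")]
```

A valid coloring, once guessed or found, certifies a yes answer in a single pass over the lines.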

It often happens that if one problem can be solved efficiently, so can many

others. For example, if an efficient algorithm could be found for the problem

of graph coloring, it could be applied with only minor modifications to the

problems of map coloring and factory scheduling. Map coloring and factory

scheduling are therefore said to be efficiently reducible to graph coloring. In

the past several years it has become apparent that some of the problems in the

class NP have a remarkable property: all the problems in NP are

efficiently reducible to them. These elite problems within the class NP

are called NP-complete. If any one of them has an efficient algorithm, then

every problem in NP can be solved efficiently.

The first proof that a problem is NP-complete was presented in 1971 by

Stephen A. Cook of the University of Toronto. His reasoning follows a path

essentially parallel to the path of Turing's earlier work on mathematical

machines and their relation to problems of formal logic. Cook stated his

proof in terms of the propositional calculus, the formal language in which

separate logical statements, which individually may be either true or false,

are joined together by the lexical elements "and," "or" and "not." In general

a sentence in the propositional calculus can be shown to be either true or

false depending on which of its component statements are assumed to be true

or false. Certain sentences, however, cannot be true under any interpretation

because they are self-contradictory. Sentences that cannot be made true are

said to be unsatisfiable.
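The only generally applicable way known to test satisfiability is to try every assignment of true and false to the component statements, which takes exponential time. A minimal sketch, with two invented example sentences:

```python
from itertools import product

def satisfiable(formula, n_vars):
    """Try every truth assignment: 2**n_vars of them, so the
    running time grows exponentially with the number of variables."""
    return any(formula(*assignment)
               for assignment in product([False, True], repeat=n_vars))

# A satisfiable sentence: (a or b) and ((not a) or c)
f = lambda a, b, c: (a or b) and (not a or c)

# A self-contradictory, hence unsatisfiable, sentence: a and not a
g = lambda a: a and not a

print(satisfiable(f, 3))  # True
print(satisfiable(g, 1))  # False
```

No method is known that avoids, in the worst case, examining a number of assignments exponential in the number of variables.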

Cook employed the propositional calculus to describe the operation of the

nondeterministic Turing machines, the mechanized guessing devices essential to

the definition of the class NP. He showed that the calculations of any

such machine can be described succinctly by sentences of the propositional

calculus. When the machine is given a yes instance of a problem in NP, its

operation is described by a satisfiable sentence, whereas the operation of a

machine given a no instance is described by a sentence that cannot be

satisfied.

It follows from Cook's proof that if one could efficiently determine whether a

sentence in the propositional calculus can be satisfied, one could also

determine efficiently in advance whether the problem presented to a

nondeterministic Turing machine will be answered yes or no. Since the problems

in the class NP are by definition all those that can be solved by

nondeterministic Turing machines, one would then have an efficient method for

solving all those problems. The catch, of course, is that there is no known

efficient method of determining whether a sentence in the propositional

calculus can be satisfied.

Cook's argument states in essence that the propositional calculus is a universal

language for describing problems in the class NP. Every instance of

such a problem corresponds to a sentence in that language, and if the sentence

is satisfiable, the instance has a yes answer. Many other problems have since

been shown to be NP-complete because the satisfiability problem can

efficiently be reduced to them. Hamilton's problem, the problem of matching

groups of three roommates and the problem of coloring graphs with three colors

are all NP-complete. The first to point out the broad applicability of this

theory was Richard M. Karp of the University of California at Berkeley. Similar

investigations were independently conducted by the Russian mathematician L. A.

Levin. Since NP-complete problems capture the difficulty of all other problems

in NP, it is widely thought today that all NP-complete problems are

computationally intractable. A proof that a problem is NP-complete is

usually considered a strong argument for abandoning further efforts to devise

an efficient algorithm for its solution.

Even the assumption that all NP-complete problems are intractable would not

settle all questions about the class NP. In addition to the mystery of the

NP-complete problems there is an even more obscure area: problems in NP for

which no efficient algorithms are known but which have not been proved to be

NP-complete either. The problem of composite numbers is one of these.

Not all problems that can be solved by a computer are of the yes-or-no type.

Another common type is the optimization problem. For example, suppose one is

given the positions of some cities on a map and asked to find the shortest

possible railroad network connecting them. In one version of this problem one

is allowed to lay down a straight section of track between any two cities,

but one is not allowed to install isolated junction points; tracks can be

joined only at cities. One property of the solution to this problem is

immediately apparent: the optimum network can never include a closed circuit,

because if it did, the network could be made shorter simply by omitting one

link in the circuit. Thus the best network always branches like a tree, and

the problem itself is called the spanning-tree problem.

The spanning-tree problem can be solved correctly and quite efficiently by a

method called the greedy algorithm, devised by Joseph B. Kruskal of Bell

Laboratories. The procedure is simply to connect the closest pair of cities,

then the next-closest and so on without adding any superfluous lines (lines

joining cities that are already linked indirectly). It is far from obvious

that this method always yields the shortest network, but it does, and it has

the pleasant property of requiring no foresight and no reconsideration of

earlier decisions.
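The greedy algorithm can be sketched directly. The version below follows Kruskal's procedure as described: sort the pairs of cities by distance and connect each pair in turn, skipping superfluous lines. The union-find bookkeeping used to detect indirect links is a standard implementation device, not part of the article's description.

```python
import math

def greedy_spanning_tree(cities):
    """Kruskal's greedy algorithm: connect the closest pair of cities,
    then the next-closest, and so on, skipping any line that would
    join cities already linked indirectly."""
    n = len(cities)
    parent = list(range(n))           # union-find to detect indirect links

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path compression
            i = parent[i]
        return i

    pairs = sorted((math.dist(cities[i], cities[j]), i, j)
                   for i in range(n) for j in range(i + 1, n))
    tree = []
    for d, i, j in pairs:
        ri, rj = find(i), find(j)
        if ri != rj:                  # not yet linked, even indirectly
            parent[ri] = rj
            tree.append((i, j))
    return tree

# Four cities on a line: the greedy choices are exactly the short links.
print(greedy_spanning_tree([(0, 0), (1, 0), (2, 0), (10, 0)]))
```

No foresight is needed: each link, once added, is never reconsidered, and the result is nevertheless always the shortest spanning tree.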

The greedy algorithm can be relied on to find the shortest network between

cities under the rules specified, but in general that network will not be the

shortest possible one. Further savings can be achieved by establishing

junction points between cities. The properties of networks with such

junctions were studied by the Swiss mathematician Jakob Steiner. It can be

shown that any shortest network must be arranged so that each junction point

is made up of three lines that meet at angles of 120 degrees. This rule

provides some guidance in evaluating networks, but there are many possible

networks with Steiner junction points. No algorithm has been discovered that

finds the best network quickly.

The problem of the traveling salesman's tour is closely related. Again one is

given a set of cities, but now one is asked to find the shortest round-trip

tour. As a first guess the greedy algorithm suggests that perhaps the

salesman should always go to the nearest city he has not yet visited, but

this procedure does not work. Indeed, the problem is notorious for having

resisted all attempts to find an efficient solution.
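The nearest-city rule is easy to state and fast to run, which is what makes its failure instructive. A minimal sketch, assuming straight-line distances between cities:

```python
import math

def nearest_neighbor_tour(cities, start=0):
    """Always go to the nearest city not yet visited.  Fast, but the
    resulting round trip is in general not the shortest one."""
    unvisited = set(range(len(cities))) - {start}
    tour = [start]
    while unvisited:
        here = tour[-1]
        nxt = min(unvisited,
                  key=lambda c: math.dist(cities[here], cities[c]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour
```

The procedure visits every city exactly once, but early short hops can force a long final leg back to the start, which is why it does not invariably yield the correct answer.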

Optimization problems are not fundamentally different from those that ask

yes-or-no questions; in fact, every optimization problem can be rewritten in

a yes-or-no form. In the traveling salesman problem, for example, we might be

given a length along with a set of cities and asked to state whether a tour

can be constructed that does not exceed the specified length. This yes-or-no

problem cannot be harder than the associated optimization problem, because if

the optimum tour could be found by some efficient method, it would be a

trivial task to determine whether it exceeds a given number. Hence if the

yes-or-no version is computationally intractable, one cannot hope to solve

the optimization problem itself efficiently. For this reason certain

optimization problems, such as that of the traveling salesman's tour and that

of placing Steiner junction points, are said to be

NP-complete.

Optimization problems are encountered often in engineering, economics,

operations research and other fields. The discovery that at least some of

these problems are NP-complete is therefore of considerable practical

interest. Since the NP-complete problems probably have no efficient

algorithms, there would seem to be little point in expending further effort

in seeking optimum solutions. An alternative that has recently been adopted

is to seek approximate solutions that are good even if they are not precisely

optimal.

One technique that has been applied to the traveling salesman problem offers

a solution that may not be optimum but is guaranteed to be no worse than

twice the optimum path. The procedure starts with the shortest spanning tree,

which can be generated efficiently by the greedy algorithm. This network can

be converted into a tour of the cities by traversing each line twice and

returning to the origin. It is known that the optimum spanning tree must be

shorter than any possible tour of the cities, since a tour can be converted

into a spanning tree (albeit one without any branches) by simply omitting one

segment. Thus the doubled tree walk, whose length is twice that of the optimum
spanning tree, cannot be longer than twice the length of the optimum tour. The
method is a polynomial-time algorithm.

Recently Nicos Christofides of the Imperial College of Science and Technology

in London has found a way to improve the algorithm so that it yields a tour

guaranteed to be no more than half again as long as the optimum.
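The twice-around-the-tree procedure can be sketched as follows. One detail is an assumption on my part rather than part of the article's description: instead of literally retracing each line, the sketch lists each city at its first visit during the tree walk, a shortcut that with straight-line distances can only shorten the route, so the factor-of-two guarantee still holds.

```python
import math
from collections import defaultdict

def double_tree_tour(cities):
    """Tour at most twice the optimum length (straight-line distances):
    walk the shortest spanning tree, shortcutting repeated cities."""
    n = len(cities)
    # Grow the shortest spanning tree (Prim's variant of the greedy idea).
    in_tree = {0}
    adj = defaultdict(list)
    while len(in_tree) < n:
        d, i, j = min((math.dist(cities[i], cities[j]), i, j)
                      for i in in_tree for j in range(n)
                      if j not in in_tree)
        adj[i].append(j)
        adj[j].append(i)
        in_tree.add(j)
    # Depth-first walk of the tree; listing each city at its first
    # visit shortcuts the doubled lines and can only shorten the route.
    tour, stack, seen = [], [0], set()
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            tour.append(c)
            stack.extend(adj[c])
    return tour + [0]            # return to the starting city
```

Since the tree can be built and walked in polynomial time, the whole procedure is efficient, trading optimality for a guaranteed bound.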

A more profound compromise gives up not only the requirement that a solution

be optimal but also the insistence that a less than optimum solution be

guaranteed to fall within a specified range. Instead the assurance is given

that the solution will not often deviate far from the optimum. An underlying

assumption of such techniques is that the maps encountered in practice are

not concocted to confound basically plausible techniques; such maps are

encountered frequently only when they are constructed by computer scientists

to reveal the flaws in methods proposed by their colleagues. Indeed, if the

salesman's tour algorithms discussed above are applied to "natural" maps,

they deliver far more than they promise. The resulting tours are not 100

percent or 50 percent longer than the optimum but closer to 5 percent.

A reasonable assumption about the properties of many maps is that cities are

randomly placed. A theorem describing the statistical properties of optimum

tours through such randomly distributed points was proved in 1958 by Jillian

Beardwood, J. H. Halton and John M. Hammersley of the University of Oxford.

Relying on that theorem, Karp has shown that a simple method of constructing

tours almost always yields near-optimum results, when it is applied to maps

with many cities.

Karp begins by dividing the map into many small regions. Within each of these

regions the cities are sufficiently few to find the optimum tour by

exhaustive search, even though that method involves an exponential-time

algorithm. The tours of the small areas are then linked by a variant of the

greedy algorithm. Perhaps significantly, the method is not very different

from the method usually adopted by people solving the problem manually.
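The divide-and-conquer idea can be sketched in a simplified form. Two caveats: the exhaustive search within each cell is exponential in the cell's population, which is why the cells must be kept small, and Karp's actual method stitches the cell tours together along a spanning-tree structure, whereas linking the cells in simple row order below is my own simplification for illustration.

```python
import math
from itertools import permutations

def tour_length(points, tour):
    return sum(math.dist(points[tour[k]], points[tour[k + 1]])
               for k in range(len(tour) - 1))

def best_tour(points):
    """Exhaustive search: feasible only because each cell is small."""
    idx = list(range(len(points)))
    return min(permutations(idx[1:]),
               key=lambda p: tour_length(points, (0,) + p + (0,)))

def karp_style_tour(cities, cell=10.0):
    """Simplified sketch of Karp's partitioning heuristic: solve each
    small square exhaustively, then link the squares' tours."""
    cells = {}
    for i, (x, y) in enumerate(cities):
        cells.setdefault((int(x // cell), int(y // cell)), []).append(i)
    route = []
    for key in sorted(cells):        # simplification: cells in row order
        members = cells[key]
        if len(members) == 1:
            route.append(members[0])
            continue
        pts = [cities[i] for i in members]
        order = (0,) + best_tour(pts) + (0,)
        route.extend(members[k] for k in order[:-1])
    return route + [route[0]]
```

On maps with randomly placed cities the cost of the links between cells becomes negligible as the number of cities grows, which is the substance of Karp's near-optimality result.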

Efficient but approximate solutions can be found for many NP-complete

optimization problems. From the standpoint of mathematics, however, the

important question is whether NP is identical with P. The repeated failure of

attempts to find an efficient algorithm for the NP-complete problems has

created considerable confidence that NP and P are not the same. The widespread

suspicion is that they are distinct, but the proof of their distinctness

may be beyond present mathematical capabilities. The question may join that

select group of mathematical enigmas that remain unresolved for decades, and

the solution may have to await the development of new methods in mathematics.
