Algorithm

In mathematics and computer science, an algorithm is an unambiguous specification of how to solve a class of problems. Algorithms can perform calculation, data processing and automated reasoning tasks.

As an effective method, an algorithm can be expressed within a finite amount of space and time and in a well-defined formal language for calculating a function. Starting from an initial state and initial input (perhaps empty), the instructions describe a computation that, when executed, proceeds through a finite number of well-defined successive states, eventually producing “output” and terminating at a final ending state. The transition from one state to the next is not necessarily deterministic; some algorithms, known as randomized algorithms, incorporate random input.

The concept of algorithm has existed for centuries and the use of the concept can be ascribed to Greek mathematicians, e.g. the sieve of Eratosthenes and Euclid’s algorithm; the term algorithm itself derives from the 9th Century mathematician Muḥammad ibn Mūsā al’Khwārizmī, latinized ‘Algoritmi’. A partial formalization of what would become the modern notion of algorithm began with attempts to solve the Entscheidungsproblem (the “decision problem”) posed by David Hilbert in 1928. Subsequent formalizations were framed as attempts to define “effective calculability” or “effective method”; those formalizations included the Gödel–Herbrand–Kleene recursive functions of 1930, 1934 and 1935, Alonzo Church’s lambda calculus of 1936, Emil Post’s Formulation 1 of 1936, and Alan Turing’s Turing machines of 1936–7 and 1939.

Etymology
The word ‘algorithm’ has its roots in latinizing the name of Muhammad ibn Musa al-Khwarizmi in a first step to algorismus. Al-Khwārizmī (Persian: خوارزمی‎, c. 780–850) was a Persian mathematician, astronomer, geographer, and scholar in the House of Wisdom in Baghdad, whose name means ‘the native of Khwarezm’, a region that was part of Greater Iran and is now in Uzbekistan.

About 825, al-Khwarizmi wrote an Arabic language treatise on the Hindu–Arabic numeral system, which was translated into Latin during the 12th century under the title Algoritmi de numero Indorum. This title means “Algoritmi on the numbers of the Indians”, where “Algoritmi” was the translator’s Latinization of Al-Khwarizmi’s name. Al-Khwarizmi was the most widely read mathematician in Europe in the late Middle Ages, primarily through another of his books, the Algebra. In late medieval Latin, algorismus, English ‘algorism’, the corruption of his name, simply meant the “decimal number system”. In the 15th century, under the influence of the Greek word ἀριθμός ‘number’ (cf. ‘arithmetic’), the Latin word was altered to algorithmus, and the corresponding English term ‘algorithm’ is first attested in the 17th century; the modern sense was introduced in the 19th century.

In English, it was first used in about 1230 and then by Chaucer in 1391. English adopted the French term, but it wasn’t until the late 19th century that “algorithm” took on the meaning that it has in modern English.

Another early use of the word is from 1240, in a manual titled Carmen de Algorismo composed by Alexandre de Villedieu. It begins thus:

Haec algorismus ars praesens dicitur, in qua / Talibus Indorum fruimur bis quinque figuris.

which translates as:

Algorism is the art by which at present we use those Indian figures, which number two times five.

The poem is a few hundred lines long and summarizes the art of calculating with the new style of Indian dice, or Talibus Indorum, or Hindu numerals.

Informal definition
For a detailed presentation of the various points of view on the definition of “algorithm”, see Algorithm characterizations.
An informal definition could be “a set of rules that precisely defines a sequence of operations.” which would include all computer programs, including programs that do not perform numeric calculations. Generally, a program is only an algorithm if it stops eventually.

A prototypical example of an algorithm is the Euclidean algorithm to determine the maximum common divisor of two integers; an example (there are others) is described by the flowchart above and as an example in a later section.

Boolos, Jeffrey & 1974, 1999 offer an informal meaning of the word in the following quotation:

No human being can write fast enough, or long enough, or small enough† ( †”smaller and smaller without limit …you’d be trying to write on molecules, on atoms, on electrons”) to list all members of an enumerably infinite set by writing out their names, one after another, in some notation. But humans can do something equally useful, in the case of certain enumerably infinite sets: They can give explicit instructions for determining the nth member of the set, for arbitrary finite n. Such instructions are to be given quite explicitly, in a form in which they could be followed by a computing machine, or by a human who is capable of carrying out only very elementary operations on symbols.

An “enumerably infinite set” is one whose elements can be put into one-to-one correspondence with the integers. Thus, Boolos and Jeffrey are saying that an algorithm implies instructions for a process that “creates” output integers from an arbitrary “input” integer or integers that, in theory, can be arbitrarily large. Thus an algorithm can be an algebraic equation such as y = m + n – two arbitrary “input variables” m and n that produce an output y. But various authors’ attempts to define the notion indicate that the word implies much more than this, something on the order of (for the addition example):

Precise instructions (in language understood by “the computer”) for a fast, efficient, “good” process that specifies the “moves” of “the computer” (machine or human, equipped with the necessary internally contained information and capabilities) to find, decode, and then process arbitrary input integers/symbols m and n, symbols + and = … and “effectively” produce, in a “reasonable” time, output-integer y at a specified place and in a specified format.
The concept of algorithm is also used to define the notion of decidability. That notion is central for explaining how formal systems come into being starting from a small set of axioms and rules. In logic, the time that an algorithm requires to complete cannot be measured, as it is not apparently related with our customary physical dimension. From such uncertainties, that characterize ongoing work, stems the unavailability of a definition of algorithm that suits both concrete (in some sense) and abstract usage of the term.

Formalization
Algorithms are essential to the way computers process data. Many computer programs contain algorithms that detail the specific instructions a computer should perform (in a specific order) to carry out a specified task, such as calculating employees’ paychecks or printing students’ report cards. Thus, an algorithm can be considered to be any sequence of operations that can be simulated by a Turing-complete system. Authors who assert this thesis include Minsky (1967), Savage (1987) and Gurevich (2000):

Minsky: “But we will also maintain, with Turing . . . that any procedure which could “naturally” be called effective, can in fact be realized by a (simple) machine. Although this may seem extreme, the arguments . . . in its favor are hard to refute”.

Gurevich: “…Turing’s informal argument in favor of his thesis justifies a stronger thesis: every algorithm can be simulated by a Turing machine … according to Savage [1987], an algorithm is a computational process defined by a Turing machine”.

Typically, when an algorithm is associated with processing information, data can be read from an input source, written to an output device and stored for further processing. Stored data are regarded as part of the internal state of the entity performing the algorithm. In practice, the state is stored in one or more data structures.

For some such computational process, the algorithm must be rigorously defined: specified in the way it applies in all possible circumstances that could arise. That is, any conditional steps must be systematically dealt with, case-by-case; the criteria for each case must be clear (and computable).

Because an algorithm is a precise list of precise steps, the order of computation is always crucial to the functioning of the algorithm. Instructions are usually assumed to be listed explicitly, and are described as starting “from the top” and going “down to the bottom”, an idea that is described more formally by flow of control.

So far, this discussion of the formalization of an algorithm has assumed the premises of imperative programming. This is the most common conception, and it attempts to describe a task in discrete, “mechanical” means. Unique to this conception of formalized algorithms is the assignment operation, setting the value of a variable. It derives from the intuition of “memory” as a scratchpad. There is an example below of such an assignment.

For some alternate conceptions of what constitutes an algorithm see functional programming and logic programming.

Expressing algorithms
Algorithms can be expressed in many kinds of notation, including natural languages, pseudocode, flowcharts, drakon-charts, programming languages or control tables (processed by interpreters). Natural language expressions of algorithms tend to be verbose and ambiguous, and are rarely used for complex or technical algorithms. Pseudocode, flowcharts, drakon-charts and control tables are structured ways to express algorithms that avoid many of the ambiguities common in natural language statements. Programming languages are primarily intended for expressing algorithms in a form that can be executed by a computer, but are often used as a way to define or document algorithms.

There is a wide variety of representations possible and one can express a given Turing machine program as a sequence of machine tables (see more at finite-state machine, state transition table and control table), as flowcharts and drakon-charts (see more at state diagram), or as a form of rudimentary machine code or assembly code called “sets of quadruples” (see more at Turing machine).

Representations of algorithms can be classed into three accepted levels of Turing machine description:

1 High-level description
“…prose to describe an algorithm, ignoring the implementation details. At this level we do not need to mention how the machine manages its tape or head.”
2 Implementation description
“…prose used to define the way the Turing machine uses its head and the way that it stores data on its tape. At this level we do not give details of states or transition function.”
3 Formal description
Most detailed, “lowest level”, gives the Turing machine’s “state table”.
For an example of the simple algorithm “Add m+n” described in all three levels, see Algorithm#Examples.

Implementation
Most algorithms are intended to be implemented as computer programs. However, algorithms are also implemented by other means, such as in a biological neural network (for example, the human brain implementing arithmetic or an insect looking for food), in an electrical circuit, or in a mechanical device.

Computer algorithms
In computer systems, an algorithm is basically an instance of logic written in software by software developers to be effective for the intended “target” computer(s) to produce output from given (perhaps null) input. An optimal algorithm, even running in old hardware, would produce faster results than a non-optimal (higher time complexity) algorithm for the same purpose, running in more efficient hardware; that is why algorithms, like computer hardware, are considered technology.

“Elegant” (compact) programs, “good” (fast) programs : The notion of “simplicity and elegance” appears informally in Knuth and precisely in Chaitin:

Knuth: “. . .we want good algorithms in some loosely defined aesthetic sense. One criterion . . . is the length of time taken to perform the algorithm . . .. Other criteria are adaptability of the algorithm to computers, its simplicity and elegance, etc”
Chaitin: ” . . . a program is ‘elegant,’ by which I mean that it’s the smallest possible program for producing the output that it does”
Chaitin prefaces his definition with: “I’ll show you can’t prove that a program is ‘elegant'”—such a proof would solve the Halting problem (ibid).

Algorithm versus function computable by an algorithm: For a given function multiple algorithms may exist. This is true, even without expanding the available instruction set available to the programmer. Rogers observes that “It is . . . important to distinguish between the notion of algorithm, i.e. procedure and the notion of function computable by algorithm, i.e. mapping yielded by procedure. The same function may have several different algorithms”.

Unfortunately there may be a tradeoff between goodness (speed) and elegance (compactness)—an elegant program may take more steps to complete a computation than one less elegant. An example that uses Euclid’s algorithm appears below.

Computers (and computors), models of computation: A computer (or human “computor”) is a restricted type of machine, a “discrete deterministic mechanical device” that blindly follows its instructions. Melzak’s and Lambek’s primitive models reduced this notion to four elements: (i) discrete, distinguishable locations, (ii) discrete, indistinguishable counters (iii) an agent, and (iv) a list of instructions that are effective relative to the capability of the agent.

Simulation of an algorithm: computer (computor) language: Knuth advises the reader that “the best way to learn an algorithm is to try it . . . immediately take pen and paper and work through an example”. But what about a simulation or execution of the real thing? The programmer must translate the algorithm into a language that the simulator/computer/computor can effectively execute. Stone gives an example of this: when computing the roots of a quadratic equation the computor must know how to take a square root. If they don’t, then the algorithm, to be effective, must provide a set of rules for extracting a square root.

This means that the programmer must know a “language” that is effective relative to the target computing agent (computer/computor).

Algorithmic analysis
It is frequently important to know how much of a particular resource (such as time or storage) is theoretically required for a given algorithm. Methods have been developed for the analysis of algorithms to obtain such quantitative answers (estimates); for example, the sorting algorithm above has a time requirement of O(n), using the big O notation with n as the length of the list. At all times the algorithm only needs to remember two values: the largest number found so far, and its current position in the input list. Therefore, it is said to have a space requirement of O(1), if the space required to store the input numbers is not counted, or O(n) if it is counted.

Formal versus empirical
The analysis and study of algorithms is a discipline of computer science, and is often practiced abstractly without the use of a specific programming language or implementation. In this sense, algorithm analysis resembles other mathematical disciplines in that it focuses on the underlying properties of the algorithm and not on the specifics of any particular implementation. Usually pseudocode is used for analysis as it is the simplest and most general representation. However, ultimately, most algorithms are usually implemented on particular hardware / software platforms and their algorithmic efficiency is eventually put to the test using real code. For the solution of a “one off” problem, the efficiency of a particular algorithm may not have significant consequences (unless n is extremely large) but for algorithms designed for fast interactive, commercial or long life scientific usage it may be critical. Scaling from small n to large n frequently exposes inefficient algorithms that are otherwise benign.

Execution efficiency
To illustrate the potential improvements possible even in well established algorithms, a recent significant innovation, relating to FFT algorithms (used heavily in the field of image processing), can decrease processing time up to 1,000 times for applications like medical imaging. In general, speed improvements depend on special properties of the problem, which are very common in practical applications. Speedups of this magnitude enable computing devices that make extensive use of image processing (like digital cameras and medical equipment) to consume less power.

Classification
There are various ways to classify algorithms, each with its own merits.

By implementation
One way to classify algorithms is by implementation means.

Recursion
A recursive algorithm is one that invokes (makes reference to) itself repeatedly until a certain condition (also known as termination condition) matches, which is a method common to functional programming. Iterative algorithms use repetitive constructs like loops and sometimes additional data structures like stacks to solve the given problems. Some problems are naturally suited for one implementation or the other. For example, towers of Hanoi is well understood using recursive implementation. Every recursive version has an equivalent (but possibly more or less complex) iterative version, and vice versa.

Logical
An algorithm may be viewed as controlled logical deduction. This notion may be expressed as: Algorithm = logic + control. The logic component expresses the axioms that may be used in the computation and the control component determines the way in which deduction is applied to the axioms. This is the basis for the logic programming paradigm. In pure logic programming languages the control component is fixed and algorithms are specified by supplying only the logic component. The appeal of this approach is the elegant semantics: a change in the axioms has a well-defined change in the algorithm.

Serial, parallel or distributed
Algorithms are usually discussed with the assumption that computers execute one instruction of an algorithm at a time. Those computers are sometimes called serial computers. An algorithm designed for such an environment is called a serial algorithm, as opposed to parallel algorithms or distributed algorithms. Parallel algorithms take advantage of computer architectures where several processors can work on a problem at the same time, whereas distributed algorithms utilize multiple machines connected with a computer network. Parallel or distributed algorithms divide the problem into more symmetrical or asymmetrical subproblems and collect the results back together. The resource consumption in such algorithms is not only processor cycles on each processor but also the communication overhead between the processors. Some sorting algorithms can be parallelized efficiently, but their communication overhead is expensive. Iterative algorithms are generally parallelizable. Some problems have no parallel algorithms, and are called inherently serial problems.

Deterministic or non-deterministic
Deterministic algorithms solve the problem with exact decision at every step of the algorithm whereas non-deterministic algorithms solve problems via guessing although typical guesses are made more accurate through the use of heuristics.

Exact or approximate
While many algorithms reach an exact solution, approximation algorithms seek an approximation that is closer to the true solution. Approximation can be reached by either using a deterministic or a random strategy. Such algorithms have practical value for many hard problems. One of the examples of an approximate algorithm is the Knapsack problem. The Knapsack problem is a problem where there is a set of given items. The goal of the problem is to pack the knapsack to get the maximum total value. Each item has some weight and some value. Total weight that we can carry is no more than some fixed number X. So, we must consider weights of items as well as their value.

Quantum algorithm
They run on a realistic model of quantum computation. The term is usually used for those algorithms which seem inherently quantum, or use some essential feature of quantum computation such as quantum superposition or quantum entanglement.

By design paradigm
Another way of classifying algorithms is by their design methodology or paradigm. There is a certain number of paradigms, each different from the other. Furthermore, each of these categories include many different types of algorithms. Some common paradigms are:

Brute-force or exhaustive search
This is the naive method of trying every possible solution to see which is best.

Divide and conquer
A divide and conquer algorithm repeatedly reduces an instance of a problem to one or more smaller instances of the same problem (usually recursively) until the instances are small enough to solve easily. One such example of divide and conquer is merge sorting. Sorting can be done on each segment of data after dividing data into segments and sorting of entire data can be obtained in the conquer phase by merging the segments. A simpler variant of divide and conquer is called a decrease and conquer algorithm, that solves an identical subproblem and uses the solution of this subproblem to solve the bigger problem. Divide and conquer divides the problem into multiple subproblems and so the conquer stage is more complex than decrease and conquer algorithms. An example of decrease and conquer algorithm is the binary search algorithm.

Search and enumeration
Many problems (such as playing chess) can be modeled as problems on graphs. A graph exploration algorithm specifies rules for moving around a graph and is useful for such problems. This category also includes search algorithms, branch and bound enumeration and backtracking.
Randomized algorithm

Reduction of complexity
This technique involves solving a difficult problem by transforming it into a better known problem for which we have (hopefully) asymptotically optimal algorithms. The goal is to find a reducing algorithm whose complexity is not dominated by the resulting reduced algorithm’s. For example, one selection algorithm for finding the median in an unsorted list involves first sorting the list (the expensive portion) and then pulling out the middle element in the sorted list (the cheap portion). This technique is also known as transform and conquer.

Optimization problems
For optimization problems there is a more specific classification of algorithms; an algorithm for such problems may fall into one or more of the general categories described above as well as into one of the following:

Linear programming
When searching for optimal solutions to a linear function bound to linear equality and inequality constraints, the constraints of the problem can be used directly in producing the optimal solutions. There are algorithms that can solve any problem in this category, such as the popular simplex algorithm. Problems that can be solved with linear programming include the maximum flow problem for directed graphs. If a problem additionally requires that one or more of the unknowns must be an integer then it is classified in integer programming. A linear programming algorithm can solve such a problem if it can be proved that all restrictions for integer values are superficial, i.e., the solutions satisfy these restrictions anyway. In the general case, a specialized algorithm or an algorithm that finds approximate solutions is used, depending on the difficulty of the problem.
Dynamic programming

When a problem shows optimal substructures – meaning the optimal solution to a problem can be constructed from optimal solutions to subproblems – and overlapping subproblems, meaning the same subproblems are used to solve many different problem instances, a quicker approach called dynamic programming avoids recomputing solutions that have already been computed. For example, Floyd–Warshall algorithm, the shortest path to a goal from a vertex in a weighted graph can be found by using the shortest path to the goal from all adjacent vertices. Dynamic programming and memoization go together. The main difference between dynamic programming and divide and conquer is that subproblems are more or less independent in divide and conquer, whereas subproblems overlap in dynamic programming. The difference between dynamic programming and straightforward recursion is in caching or memoization of recursive calls. When subproblems are independent and there is no repetition, memoization does not help; hence dynamic programming is not a solution for all complex problems. By using memoization or maintaining a table of subproblems already solved, dynamic programming reduces the exponential nature of many problems to polynomial complexity.

The greedy method
A greedy algorithm is similar to a dynamic programming algorithm in that it works by examining substructures, in this case not of the problem but of a given solution. Such algorithms start with some solution, which may be given or have been constructed in some way, and improve it by making small modifications. For some problems they can find the optimal solution while for others they stop at local optima, that is, at solutions that cannot be improved by the algorithm but are not optimum. The most popular use of greedy algorithms is for finding the minimal spanning tree where finding the optimal solution is possible with this method. Huffman Tree, Kruskal, Prim, Sollin are greedy algorithms that can solve this optimization problem.

The heuristic method
In optimization problems, heuristic algorithms can be used to find a solution close to the optimal solution in cases where finding the optimal solution is impractical. These algorithms work by getting closer and closer to the optimal solution as they progress. In principle, if run for an infinite amount of time, they will find the optimal solution. Their merit is that they can find a solution very close to the optimal solution in a relatively short time. Such algorithms include local search, tabu search, simulated annealing, and genetic algorithms. Some of them, like simulated annealing, are non-deterministic algorithms while others, like tabu search, are deterministic. When a bound on the error of the non-optimal solution is known, the algorithm is further categorized as an approximation algorithm.

By field of study
Every field of science has its own problems and needs efficient algorithms. Related problems in one field are often studied together. Some example classes are search algorithms, sorting algorithms, merge algorithms, numerical algorithms, graph algorithms, string algorithms, computational geometric algorithms, combinatorial algorithms, medical algorithms, machine learning, cryptography, data compression algorithms and parsing techniques.

Fields tend to overlap with each other, and algorithm advances in one field may improve those of other, sometimes completely unrelated, fields. For example, dynamic programming was invented for optimization of resource consumption in industry, but is now used in solving a broad range of problems in many fields.

By complexity
Algorithms can be classified by the amount of time they need to complete compared to their input size:

Constant time: if the time needed by the algorithm is the same, regardless of the input size. E.g. an access to an array element.
Linear time: if the time is proportional to the input size. E.g. the traverse of a list.
Logarithmic time: if the time is a logarithmic function of the input size. E.g. binary search algorithm.
Polynomial time: if the time is a power of the input size. E.g. the bubble sort algorithm has quadratic time complexity.
Exponential time: if the time is an exponential function of the input size. E.g. Brute-force search.
Some problems may have multiple algorithms of differing complexity, while other problems might have no algorithms or no known efficient algorithms. There are also mappings from some problems to other problems. Owing to this, it was found to be more suitable to classify the problems themselves instead of the algorithms into equivalence classes based on the complexity of the best possible algorithms for them.

Continuous algorithms
The adjective “continuous” when applied to the word “algorithm” can mean:

An algorithm operating on data that represents continuous quantities, even though this data is represented by discrete approximations—such algorithms are studied in numerical analysis; or
An algorithm in the form of a differential equation that operates continuously on the data, running on an analog computer.

Source from Wikipedia