Everything you ever wanted to know about algorithms
25 March, 2008 09:03
"As the mind learns to understand more complicated combinations of ideas, simpler formulae soon reduce their complexity." -Antoine-Nicholas de Condorcet, 1794
The word algorithm derives from the name of Al-Khwarizmi, a 9th-century Persian mathematician and author of The Compendious Book on Calculation by Completion and Balancing. Nowadays, though, the word most often applies to a step-by-step procedure for solving a problem with a computer.
An algorithm is like a recipe, with a discrete beginning and end and a prescribed sequence of steps leading unambiguously to some desired result.
But coming up with the right answer at the end of a program is only the minimum requirement. The best algorithms also run fast, are sparing in their use of memory and other computer resources, and are easy to understand and modify. The very best ones are invariably called "elegant," although Al-Khwarizmi may not have used that term for his formulas for solving quadratic equations.
An algorithm can be thought of as the link between the programming language and the application. It's what we express in Cobol, for example, so that the compiler can turn it into a working payroll system.
Although algorithms can end up as thousands of lines of computer code, they often start as very high-level abstractions, the kind an analyst might hand to a programmer.
For example, a lengthy routine in that payroll system might have started out with this algorithmic specification: "Look up the employee's name in the Employee Table. If it is not there, print the message, 'Invalid employee.' If all other data on the input record is valid, go to the routine that computes net pay from gross pay. Repeat these steps for each employee. Then go to the routine that prints checks." The gross-to-net and check-writing routines would have their own algorithms.
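As a rough illustration, that specification might be sketched in Python along the following lines. The table contents, the flat tax rate, the record fields and every function name here are assumptions made up for the example, not part of any real payroll system. Note that the specification does not say what to do with a record whose other data is invalid; this sketch simply skips it.

```python
# Hypothetical employee table and flat tax rate, invented for illustration.
EMPLOYEE_TABLE = {"Ada Lovelace", "Grace Hopper"}
TAX_RATE = 0.25


def compute_net_pay(gross_pay):
    """Stand-in for the gross-to-net routine, which would have its own algorithm."""
    return round(gross_pay * (1 - TAX_RATE), 2)


def print_checks(payments):
    """Stand-in for the check-printing routine."""
    for name, net_pay in payments:
        print(f"Pay {name}: {net_pay:.2f}")


def process_payroll(input_records):
    """Look up each employee, validate the record, compute net pay, then print checks."""
    payments = []
    for record in input_records:
        name = record.get("name")
        if name not in EMPLOYEE_TABLE:
            print("Invalid employee.")
            continue
        gross_pay = record.get("gross_pay")
        # "All other data is valid" is reduced here to a single check on gross pay.
        # Records that fail it are skipped, a case the specification leaves open.
        if isinstance(gross_pay, (int, float)) and gross_pay >= 0:
            payments.append((name, compute_net_pay(gross_pay)))
    print_checks(payments)
    return payments
```

Even at this size, the sketch has to make decisions the English version glossed over, which hints at why the real routine could run to many more lines.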
Of course, it isn't quite that simple. If it were, the study of algorithms would not have become a major branch of computer science and the subject of countless books and doctoral theses.
But it's not hard to imagine computer engineers in the 1950s thinking they had pretty much finished the job. They had invented stored-program electronic computers, and languages like Fortran and Cobol to run on them, and they had largely banished the agony of assembly language programming. In fact, software pioneers such as Grace Hopper saw compilers, and the algorithms that instructed them, as a huge advancement -- they could "understand" English-like instructions. The first computer to run one was the Universal Automatic Computer, or Univac. With adjectives like "universal" and "automatic" in its name, the computer could almost be expected to program itself.
But in the 1960s, computers moved into the business world in a big way, and soon two ugly realities intruded. The first was the matter of "bugs" -- a term popularized by Hopper, although engineers had used it for decades before her. Computers made lots of mistakes because programmers made lots of mistakes. The second was sorting, a machine-intensive job that came to dominate, and sometimes overwhelm, computing.
Virtually every major application required sorting. For example, if you wanted to eliminate duplicate mailings from your customer master file, which was sorted by customer number, you might have had to re-sort it by last name within ZIP code. Sorting and merging big files often went on repeatedly throughout the day. Even worse, very few of the records being sorted would fit into the tiny memories of the day's computers, and often they were not even on disk; they were on slow, cumbersome magnetic tapes. When the CEO called the data processing shop and asked, "When can I get that special report?" the DP guy might have said it would take 24 hours because of all the sorting that was needed.
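The duplicate-mailing job shows why sorting mattered: once the file is ordered by last name within ZIP code, duplicates sit next to each other and can be dropped in a single pass. Here is a minimal Python sketch of that idea. The field names (zip, last_name) and the rule that a matching last name and ZIP code means a duplicate are assumptions made for the example.

```python
def remove_duplicate_mailings(customers):
    """Re-sort by last name within ZIP code, then keep one record per group."""
    # Assumed rule: same ZIP code and same last name counts as a duplicate mailing.
    ordered = sorted(customers, key=lambda c: (c["zip"], c["last_name"]))
    unique = []
    for customer in ordered:
        key = (customer["zip"], customer["last_name"])
        # After sorting, any duplicate must be adjacent to the record last kept.
        if not unique or (unique[-1]["zip"], unique[-1]["last_name"]) != key:
            unique.append(customer)
    return unique
```

Here the whole file fits in memory and the sort is a single call. On a 1960s machine, the same re-sort could mean repeated passes over tape, which is where the hours went.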
So IT people learned that algorithms mattered. The choice of algorithm could have a huge effect on both programmability and processing efficiency.
If algorithms were simple, they could be easily coded, debugged and later modified. Simple ones were less likely to have bugs in the first place, and if you used an existing algorithm rather than inventing your own, some of the debugging had already been done. But simple ones were often not the most efficient. They were not the ones that would speed up sorting enough to give the CEO's request a same-day turnaround.
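To make the trade-off concrete, here is a sketch comparing two well-known sorting algorithms. The article does not name any particular ones; insertion sort and merge sort are chosen here only as familiar examples of "simple" and "efficient." Each function returns the sorted list along with a count of the comparisons it made, as a rough stand-in for running time.

```python
def insertion_sort(items):
    """Simple to code and check; comparisons grow roughly with n squared in the worst case."""
    result = list(items)
    comparisons = 0
    for i in range(1, len(result)):
        current = result[i]
        j = i - 1
        # Shift larger items one slot right until current's position is found.
        while j >= 0:
            comparisons += 1
            if result[j] <= current:
                break
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = current
    return result, comparisons


def merge_sort(items):
    """More moving parts, but comparisons grow roughly with n times log n."""
    items = list(items)
    if len(items) <= 1:
        return items, 0
    mid = len(items) // 2
    left, left_count = merge_sort(items[:mid])
    right, right_count = merge_sort(items[mid:])
    merged = []
    comparisons = left_count + right_count
    i = j = 0
    # Merge the two sorted halves, always taking the smaller front item.
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, comparisons
```

On a list of 1,000 items in reverse order, this insertion sort makes 499,500 comparisons, while the merge sort makes fewer than 10,000. The gap widens as the file grows, which is the difference between a next-day report and a same-day one.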