Parallel computing is the simultaneous execution of the same task (split up and specially adapted) on multiple processors in order to obtain faster results.
The term parallel processor is sometimes used for a computer with more than one central processing unit available for parallel processing. Systems with thousands of such processors are known as massively parallel.
There are many different kinds of parallel computer (or "parallel processor"). They are distinguished by the kind of interconnection between processors (known as "processing elements" or PEs) and between processors and memory. Flynn's taxonomy also classifies parallel (and serial) computers according to whether all processors execute the same instructions at the same time (single instruction, multiple data, or SIMD) or each processor executes different instructions (multiple instruction, multiple data, or MIMD).
A system of n parallel processors is no more efficient than a single processor n times as fast, but the parallel system is often cheaper to build. For tasks which require very large amounts of computation, have time constraints on completion, or both, parallel computation is therefore an excellent solution. Indeed, in recent years most high-performance computing systems, also known as supercomputers, have had a parallel architecture.
Parallel computers are theoretically modeled as parallel random access machines (PRAMs). The PRAM model ignores the cost of interconnection between the constituent computing units, but is nevertheless very useful in providing upper bounds on the parallel solvability of many problems. In reality the interconnection plays a significant role.
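As an illustration of the kind of reasoning the PRAM model supports, the following Python sketch simulates the summation of n values in synchronous rounds. Within a round every addition touches different cells, so an idealised machine with enough processors could perform them all at once, and the number of rounds grows only logarithmically with n. The function name `pram_sum` is hypothetical, and each round is simulated sequentially here; as in the model itself, the cost of communication is ignored.

```python
def pram_sum(values):
    """Simulate a round-by-round parallel summation.

    Returns (total, rounds). Each pass of the while loop stands for one
    synchronous step in which all the additions could run at the same time.
    """
    cells = list(values)
    n = len(cells)
    rounds = 0
    stride = 1
    while stride < n:
        # One round: cell i absorbs cell i + stride. The cells written
        # (multiples of 2 * stride) and the cells read (odd multiples of
        # stride) never overlap, so the additions are independent.
        for i in range(0, n - stride, 2 * stride):
            cells[i] += cells[i + stride]
        stride *= 2
        rounds += 1
    total = cells[0] if cells else 0
    return total, rounds


if __name__ == "__main__":
    total, rounds = pram_sum(range(1, 9))
    print(total, rounds)
```

Eight values are summed in three rounds rather than seven sequential additions, which is the sort of bound the model is used to establish.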
It should not be imagined that successful parallel computing is merely a matter of obtaining the required hardware and connecting it suitably. In practice, linear speedup (i.e., speedup proportional to the number of processors) is very difficult to achieve, because many algorithms are essentially sequential in nature and must be redesigned in order to make effective use of the parallel hardware. Further, programs which work correctly on a single-CPU system may not do so in a parallel environment, because multiple copies of the same program may interfere with each other, for instance by accessing the same memory location at the same time. Careful programming is therefore required in a parallel system.
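One common way to quantify why linear speedup is elusive is Amdahl's law, which the text above does not name but which formalises the same observation: if only a fraction of a program can be run in parallel, the sequential remainder limits the achievable speedup. The sketch below is a minimal illustration; the function and parameter names are hypothetical.

```python
def amdahl_speedup(parallel_fraction, n_processors):
    """Speedup predicted by Amdahl's law.

    parallel_fraction: share of the run time that can be parallelised (0..1).
    n_processors: number of processors applied to that share (>= 1).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie between 0 and 1")
    if n_processors < 1:
        raise ValueError("n_processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part is unaffected; the parallel part is divided evenly.
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, round(amdahl_speedup(0.9, n), 2))
```

Under this model a program that is 90% parallelisable gains a speedup of roughly 5.3 on ten processors, and can never exceed a speedup of 10 however many processors are added.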
The processors may either communicate in order to be able to cooperate in solving a problem or they may run completely independently, possibly under the control of another processor which distributes work to the others and collects results from them (a "processor farm"). The difficulty of cooperative problem solving is aptly demonstrated by the following dubious reasoning:
- If it takes one man one minute to dig a post-hole then sixty men can dig it in one second.
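The processor farm arrangement mentioned above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than a definitive implementation: threads stand in for processors, the names `farm` and `work` are hypothetical, and the `work` function is assumed not to raise exceptions. The controlling thread hands out tasks through one queue and collects results through another, so the workers never communicate with each other.

```python
import queue
import threading


def farm(tasks, work, n_workers=4):
    """Distribute tasks to workers and collect results in task order."""
    task_q = queue.Queue()
    result_q = queue.Queue()

    def worker():
        while True:
            item = task_q.get()
            if item is None:  # sentinel: no more work for this worker
                return
            index, payload = item
            result_q.put((index, work(payload)))

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()

    tasks = list(tasks)
    for index, payload in enumerate(tasks):
        task_q.put((index, payload))
    for _ in threads:
        task_q.put(None)  # one sentinel per worker

    # Results arrive in completion order; the index restores task order.
    results = [None] * len(tasks)
    for _ in tasks:
        index, value = result_q.get()
        results[index] = value

    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(farm(range(10), lambda x: x * x))
```

The sketch shows the structure of a farm, not its performance; the tasks here are independent, which is exactly what the post-hole example lacks.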
Processors communicate via some kind of network or bus, or a combination of both. Memory may be either shared (all processors have equal access to all memory), private (each processor has its own memory, known as "distributed memory"), or a combination of both.
A huge number of software systems have been designed for programming parallel computers, at both the operating system and the programming language level. These systems must provide mechanisms for partitioning the overall problem into separate tasks and allocating tasks to processors. Such mechanisms may provide either implicit parallelism, in which the system (the compiler or some other program) partitions the problem and allocates tasks to processors automatically, or explicit parallelism, in which the programmer must annotate the program to show how it is to be partitioned. It is also usual to provide synchronisation primitives such as semaphores and monitors to allow processes to share resources without conflict.
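The use of a semaphore to share a resource without conflict can be sketched as follows. In this minimal Python illustration several threads increment one shared counter, and a semaphore initialised to 1 ensures that only one thread at a time performs the read-modify-write step. The function name `count_with_semaphore` and its parameters are hypothetical.

```python
import threading


def count_with_semaphore(n_threads=4, increments=10000):
    """Increment a shared counter from several threads, guarded by a semaphore."""
    counter = 0
    guard = threading.Semaphore(1)  # at most one thread in the critical section

    def worker():
        nonlocal counter
        for _ in range(increments):
            with guard:  # acquire on entry, release on exit
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


if __name__ == "__main__":
    print(count_with_semaphore())
```

Without the guard, two threads could read the same old value and each write back the same new one, losing an update; with it, the final count always equals the number of threads multiplied by the number of increments.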
Load balancing attempts to keep all processors busy by moving tasks from heavily loaded processors to less loaded ones.
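A simple form of load balancing can be sketched in Python as below. The sketch assumes that each processor holds a list of tasks with known positive costs, which real systems often cannot know in advance; the function name `rebalance` and the rule for choosing which task to move are illustrative assumptions, not a standard algorithm. It repeatedly moves the smallest task from the most heavily loaded processor to the least loaded one, stopping when a move would no longer leave the receiver lighter than the sender was.

```python
def rebalance(queues):
    """Move tasks from the heaviest processor to the lightest until no move helps.

    queues: one list of positive task costs per processor.
    Returns new lists; the input is not modified.
    """
    queues = [list(q) for q in queues]
    while True:
        loads = [sum(q) for q in queues]
        heavy = max(range(len(queues)), key=loads.__getitem__)
        light = min(range(len(queues)), key=loads.__getitem__)
        if not queues[heavy]:
            return queues  # nothing to move
        task = min(queues[heavy])
        # Stop unless the receiver stays strictly lighter than the sender was.
        if loads[light] + task >= loads[heavy]:
            return queues
        queues[heavy].remove(task)
        queues[light].append(task)


if __name__ == "__main__":
    balanced = rebalance([[5, 3, 2, 4], [1], []])
    print(balanced, [sum(q) for q in balanced])
```

Starting from loads of 14, 1 and 0, the example ends with loads of 5, 4 and 6. Each move strictly reduces the imbalance, so the loop always terminates.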
Communication between tasks may be either via shared memory or message passing. Either may be implemented in terms of the other and in fact, at the lowest level, shared memory uses message passing since the address and data signals which flow between processor and memory may be considered as messages.
Topics for a parallel computing article:
- Parallel programming
- Finding parallelism in problems and algorithms
- Optimising compilers
- Cellular automaton
- Embarrassingly parallel problems
- Grand Challenge problems
- Lazy evaluation vs strict evaluation
- Complexity class NC
- Communicating sequential processes
- Dataflow architecture
- Parallel graph reduction
- Computer cluster
- Parallel supercomputers
- Distributed computing
- NUMA vs. SMP vs. massively parallel computer systems
- Grid computing
- Parallel computer interconnects
- Parallel computer I/O
- Reliability problems in large systems
- Occam programming language
- Linda programming language
- PVM vs. MPI libraries
- ILLIAC III
- ILLIAC IV
- Atari Transputer Workstation
- Beowulf cluster
- Deep Blue
- Meiko Computing Surface
- Blue Gene
This article (or an earlier version of it) contains material from FOLDOC, used with permission.