Models of Parallel Computation
There is a perpetual need for faster computation that is unlikely ever to be satisfied. With device technologies hitting physical limits, alternative computational models are being explored. The Big Data phenomenon precedes the coinage of this term by many decades. One of the earliest and most natural directions for speeding up computation was to deploy multiple processors, instead of a single processor, to run the same program. The ideal objective is to speed up a program p-fold by using p processors simultaneously. A common caveat is that an egg cannot be boiled faster by employing multiple cooks! Analogously, a program cannot be executed indefinitely faster by using more and more processors. This is not only because of physical limitations but also because of dependencies between various fragments of the code, imposed by precedence constraints.
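One standard way to quantify this caveat (not stated explicitly in the text, but a classical formalization) is Amdahl's law: if a fraction f of a program is inherently sequential, then p processors can give a speedup of at most 1 / (f + (1 - f)/p). A minimal sketch:

```python
def amdahl_speedup(f: float, p: int) -> float:
    """Upper bound on speedup with p processors when a fraction f
    of the work must run sequentially (Amdahl's law)."""
    return 1.0 / (f + (1.0 - f) / p)

# Even with 1000 processors, a 10% sequential fraction caps the
# speedup below 10x -- more cooks cannot boil the egg faster.
for p in (1, 10, 100, 1000):
    print(p, round(amdahl_speedup(0.1, p), 2))
```

Note how the bound converges to 1/f as p grows: past a point, adding processors yields diminishing returns, exactly because of the sequential dependencies in the code.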
At a lower level, namely in digital hardware design, parallelism is inherent: any circuit can be viewed as a parallel computational model. Signals travel along different paths and through different components and combine to yield the desired result. In contrast, a program is written in a very sequential manner, and its data flows are often dependent on one another; think of a loop whose iterations execute in sequence. Moreover, for a given problem, one may have to redesign the sequential algorithm to extract more parallelism. In this chapter, we focus on designing fast parallel algorithms for fundamental problems.
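The contrast between dependent and independent iterations can be made concrete with a small sketch (our illustrative example, not from the text): the first loop carries a dependency across iterations, while the second has fully independent iterations that p processors could execute simultaneously.

```python
def running_sum(a):
    """Loop with a cross-iteration dependency: each output
    depends on the previous one, forcing sequential execution
    of this particular code."""
    out, total = [], 0
    for x in a:
        total += x
        out.append(total)
    return out

def square_all(a):
    """Loop with independent iterations: each element can be
    processed by a separate processor in the same step."""
    return [x * x for x in a]

print(running_sum([1, 2, 3, 4]))  # [1, 3, 6, 10]
print(square_all([1, 2, 3, 4]))   # [1, 4, 9, 16]
```

Interestingly, the running-sum (prefix-sum) problem itself is not inherently sequential; a redesigned algorithm can compute it in parallel, which is exactly the kind of redesign the paragraph above alludes to.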
A very important facet of parallel algorithm design is the underlying architecture of the computer, viz., how the processors communicate with each other and access data concurrently. Moreover, is there a common clock against which we can measure the actual running time? Synchrony is an important property that makes parallel algorithm design somewhat more tractable. In more general asynchronous models, there are additional issues such as deadlock and even convergence, which are very challenging to analyze.
In this chapter, we will consider synchronous parallel models (sometimes called SIMD) and look at two important models: the parallel random access machine (PRAM) and the interconnection network model. The PRAM model is the parallel counterpart of the popular sequential RAM model, in which p processors can simultaneously access a common memory called shared memory.
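To fix ideas, here is a minimal simulation (our sketch, using standard pairwise summation rather than any algorithm from the text) of a PRAM-style computation: summing n numbers in O(log n) synchronous rounds. The shared memory is modeled as a Python list; in each round, every active processor adds one pair of cells, and all these additions are imagined to happen simultaneously.

```python
def pram_sum(a):
    """Sum n numbers the PRAM way: in each synchronous round,
    processor i adds cell (2i+1)*stride into cell 2i*stride.
    All additions within a round are conceptually simultaneous;
    the inner loop merely emulates them on one processor."""
    shared = list(a)        # the shared memory
    n = len(shared)
    stride = 1
    while stride < n:       # one iteration = one synchronous round
        for i in range(0, n - stride, 2 * stride):
            shared[i] += shared[i + stride]
        stride *= 2
    return shared[0]

print(pram_sum(range(8)))   # 28, in log2(8) = 3 rounds
```

With n/2 processors, the number of rounds is ceil(log2 n), compared with n - 1 additions sequentially; this is the flavor of speedup analysis that recurs throughout the chapter.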