From Scalar to Superscalar Processors
In the previous chapter we introduced a five-stage pipeline. The basic concept was that the instruction execution cycle could be decomposed into nonoverlapping stages, with one instruction passing through each stage at every cycle. This so-called scalar processor had an ideal throughput of one instruction per cycle; in other words, its ideal number of instructions per cycle (IPC) was 1.
If we return to the formula giving the execution time, namely,
EX_CPU = Number of instructions × CPI × cycle time
we see that in order to reduce EX_CPU in a processor with the same ISA – that is, without changing the number of instructions, N – we must either reduce CPI (equivalently, increase IPC, since CPI = 1/IPC) or reduce the cycle time, or both. Let us look at the two options.
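As a small numerical illustration of the execution-time formula, the following Python sketch evaluates it for a fixed instruction count and shows the effect of lowering CPI or the cycle time. The function name, the choice of nanoseconds as the unit, and the sample values are assumptions made for the example; they do not come from the text.

```python
def execution_time_ns(num_instructions: int, cpi: float, cycle_time_ns: float) -> float:
    """Execution time = number of instructions x CPI x cycle time (here in ns)."""
    return num_instructions * cpi * cycle_time_ns


# Hypothetical workload: N = 1000 instructions on a 2 ns clock.
N = 1000

baseline = execution_time_ns(N, cpi=1.0, cycle_time_ns=2.0)      # ideal scalar pipeline
lower_cpi = execution_time_ns(N, cpi=0.5, cycle_time_ns=2.0)     # IPC = 2, same clock
faster_clock = execution_time_ns(N, cpi=1.0, cycle_time_ns=1.0)  # same CPI, shorter cycle

print(baseline, lower_cpi, faster_clock)
```

With N held constant, halving CPI and halving the cycle time reduce the execution time by the same factor, which is why the two options can be considered separately.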
The only way to raise the ideal IPC above 1 is to radically modify the structure of the pipeline so that more than one instruction can be in each stage at a given time. In doing so, we make a transition from a scalar processor to a superscalar one. From the microarchitecture viewpoint, we make the pipeline wider, in the sense that its representation is no longer linear. The most evident effect is that we shall need several functional units, but, as we shall see, each stage of the pipeline will be affected.
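To make the effect of widening concrete, here is a minimal Python sketch of the ideal case only. It assumes a pipeline with no hazards or stalls in which every stage can hold `width` instructions, so that the first group completes after `depth` cycles and one further group completes every cycle thereafter. The names `ideal_cycles` and `ideal_ipc`, and the parameters `depth` and `width`, are introduced here for illustration.

```python
def ideal_cycles(num_instructions: int, depth: int, width: int) -> int:
    """Cycles to run num_instructions on an idealized pipeline (no stalls).

    depth: number of pipeline stages; width: instructions per stage per cycle.
    """
    if num_instructions <= 0:
        return 0
    groups = (num_instructions + width - 1) // width  # ceiling division
    return depth + groups - 1  # fill the pipeline once, then one group per cycle


def ideal_ipc(num_instructions: int, depth: int, width: int) -> float:
    """Instructions per cycle achieved under the same idealized assumptions."""
    return num_instructions / ideal_cycles(num_instructions, depth, width)


# Hypothetical comparison on a five-stage pipeline with 1000 instructions.
for width in (1, 2, 4):
    print(width, ideal_cycles(1000, 5, width), ideal_ipc(1000, 5, width))
```

Under these assumptions the IPC approaches `width` as the instruction count grows: it stays just below 1 for the scalar case (`width = 1`) and exceeds 1 only when the pipeline is widened.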