Book contents
- Frontmatter
- Contents
- Preface
- 1 Introduction
- 2 The Basics
- 3 Superscalar Processors
- 4 Front-End: Branch Prediction, Instruction Fetching, and Register Renaming
- 5 Back-End: Instruction Scheduling, Memory Access Instructions, and Clusters
- 6 The Cache Hierarchy
- 7 Multiprocessors
- 8 Multithreading and (Chip) Multiprocessing
- 9 Current Limitations and Future Challenges
- Bibliography
- Index
- References
5 - Back-End: Instruction Scheduling, Memory Access Instructions, and Clusters
Published online by Cambridge University Press: 05 June 2012
Summary
When an instruction has passed through all stages of the front-end of an out-of-order superscalar, it will either be residing in an instruction window or be dispatched to a reservation station. In this chapter, we first examine several schemes for holding an instruction before it is issued to one of the functional units. We do not consider the design of the latter; hence, this chapter will be relatively short. Some less common features related to multimedia instructions will be described in Chapter 7.
In a given cycle, several instructions awaiting the result of a preceding instruction will become ready to be issued. The detection of readiness is the wakeup step. Hopefully there will be as many as m instructions in an m-way superscalar, but maybe more, that have been woken up in this or previous cycles. Since several of them might vie for the same functional unit, some scheduling algorithm must be applied. Most scheduling algorithms are a variation of first-come–first-served (FCFS or FIFO). Determination of which instructions should proceed takes place during the select step. Once an instruction has been selected for a given functional unit, input operands must be provided. Forwarding, also called bypassing, must be implemented, as already shown in the simple pipelines of Chapter 2 and the examples of Chapter 3.
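The wakeup and select steps described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the chapter's design: the names `Entry`, `wakeup`, `select`, and `demo`, the tag strings, and the unit classes `"alu"` and `"mem"` are all hypothetical, and oldest-first selection is used here as one plausible reading of an FCFS-like policy. Real hardware performs these steps with parallel comparators and arbitration logic rather than loops.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One waiting instruction (hypothetical model, not a real design)."""
    seq: int    # program order; a lower number means an older instruction
    unit: str   # class of functional unit the instruction needs
    waiting: set = field(default_factory=set)  # source tags not yet produced

    @property
    def ready(self):
        # An instruction is ready once no source operand is outstanding.
        return not self.waiting


def wakeup(window, produced_tags):
    """Wakeup step: broadcast result tags and clear them in every entry."""
    for entry in window:
        entry.waiting -= produced_tags


def select(window, free_units, width):
    """Select step: issue up to `width` ready entries, oldest first.

    `free_units` maps a unit class to the number of free units this cycle.
    Selected entries are removed from the window and returned.
    """
    free = dict(free_units)  # copy so the caller's mapping is untouched
    chosen = []
    for entry in sorted(window, key=lambda e: e.seq):
        if len(chosen) == width:
            break
        if entry.ready and free.get(entry.unit, 0) > 0:
            free[entry.unit] -= 1
            chosen.append(entry)
    for entry in chosen:
        window.remove(entry)
    return chosen


def demo():
    # Four waiting instructions in a 2-way machine with one ALU and one
    # memory unit free (all values are made up for illustration).
    window = [
        Entry(0, "alu", {"p1"}),
        Entry(1, "alu", {"p1"}),
        Entry(2, "mem", set()),
        Entry(3, "alu", {"p2"}),
    ]
    wakeup(window, {"p1"})  # the producer of tag p1 completes
    issued = select(window, {"alu": 1, "mem": 1}, width=2)
    return [e.seq for e in issued], [e.seq for e in window]
```

In `demo`, the broadcast of `p1` wakes up instructions 0 and 1, but both need the single free ALU, so the older one (0) is selected together with instruction 2 on the memory unit. Instruction 1 stays in the window although it is ready, and instruction 3 is still waiting for `p2`.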
One particular instruction type that is often found to be on the critical path is the load instruction. As we have seen before, load dependencies are a bottleneck even in the simplest processors.
- Microprocessor Architecture: From Simple Pipelines to Chip Multiprocessors, pp. 177-207. Publisher: Cambridge University Press. Print publication year: 2009