  • Print publication year: 2009
  • Online publication date: June 2012

8 - Multithreading and (Chip) Multiprocessing


Parallel processing comes in several flavors, depending on the unit of parallelism and the number of processing units. Early in the development of computer systems, we saw the emergence of multiprogramming (Section 2.3), whereby several (portions of) programs share main memory. When the program currently executing requires some I/O processing, it relinquishes the CPU via a context switch, and another program takes ownership of the CPU. Parallel processing then occurs between the program using the CPU and the programs (there may be more than one) executing some I/O-related task. Here, the unit of parallel processing is a program, or process, and the parallelism is at the program level. An efficient implementation of multiprogramming requires an operating system with a virtual memory management component.

At the other extreme of the spectrum of granularity, we have the exploitation of instruction-level parallelism, in which several instructions of the same program execute simultaneously. Pipelining (Section 2.1) is the simplest form of concurrent execution of instructions. Superscalar and EPIC processors (Chapter 3) extend this notion by having several instructions occupy the same stage of the pipeline at the same time. Of course, extra resources such as multiple functional units must be present for this concurrency to happen.
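To make the program-level overlap of multiprogramming concrete, the following is a minimal sketch of a deterministic timeline model, not an account of how any real operating system schedules work. Each job is a list of (CPU burst, I/O burst) pairs. The function names, the zero-cost context switch, the first-ready-first-served policy, and the assumption that I/O operations of different jobs never contend for a device are all illustrative choices that do not come from the text.

```python
import heapq


def uniprogramming_time(jobs):
    """Total time when jobs run one after another.

    The CPU sits idle during every I/O burst, so the total is simply
    the sum of all CPU and I/O burst lengths.
    """
    return sum(cpu + io for job in jobs for cpu, io in job)


def multiprogramming_time(jobs):
    """Total time when one CPU is shared by all jobs.

    A job relinquishes the CPU as soon as it starts an I/O burst, and
    the job that has been ready the longest runs next.
    Assumptions (illustrative only): context switches cost nothing and
    I/O bursts of different jobs proceed fully in parallel.
    """
    # Heap entries: (time the job becomes ready, job index, next burst index)
    ready = [(0, j, 0) for j in range(len(jobs))]
    heapq.heapify(ready)
    cpu_free_at = 0  # time at which the CPU is next available
    finish = 0       # completion time of the last I/O burst seen so far
    while ready:
        ready_at, j, k = heapq.heappop(ready)
        start = max(cpu_free_at, ready_at)  # wait for the CPU or for the job
        cpu, io = jobs[j][k]
        cpu_free_at = start + cpu           # CPU burst ends, job starts I/O
        io_done = cpu_free_at + io
        finish = max(finish, io_done)
        if k + 1 < len(jobs[j]):
            heapq.heappush(ready, (io_done, j, k + 1))
    return finish


# Two hypothetical jobs, each with two (CPU, I/O) burst pairs.
example_jobs = [[(2, 4), (2, 4)], [(3, 3), (3, 3)]]
```

Under these assumptions, `uniprogramming_time(example_jobs)` gives 24 time units and `multiprogramming_time(example_jobs)` gives 14: while one job waits on I/O, the other uses the CPU, which is exactly the overlap between the program using the CPU and the programs performing I/O.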

In the previous chapter, we gave a strict definition of multiprocessing, namely, the processing by several processors of portions of the same program.
