  • Print publication year: 2009
  • Online publication date: June 2012

7 - Multiprocessors


Parallel processing has a long history. At any point in time, there have always been applications requiring more processing power than a single-processor system could deliver, and that is as true today as it was three or four decades ago. Early on, special-purpose supercomputers were the rule. The sophisticated designs that were their trademarks percolated down to mainframes and, later, to microprocessors. Pipelining and multiple functional units are two obvious examples (cf. the sidebars in Chapter 3). As microprocessors became more powerful, connecting them under the supervision of a single operating system became a viable alternative to supercomputers in terms of performance, and resulted in cost/performance ratios that made monolithic supercomputers almost obsolete except for very specific applications. This “attack of killer micros” has not completely destroyed the market for supercomputers, but it certainly has narrowed its scope drastically.

In this chapter, we consider multiprocessing, that is, the processing of the same program by several processing units. Distributed applications such as Web servers or search engines, where several queries can be processed simultaneously and independently, are extremely important, but they do not impose the coordination and synchronization requirements of multiprocessing. Moreover, there are a number of issues, such as how to express parallelism in high-level languages and how to have compilers recognize it, that are specific to multiprocessing. We do not dwell on these issues in this chapter; we are more interested at this juncture in the architectural aspects of multiprocessing.
