7 - Architecture of a stream processing system

from Part III - System architecture

Published online by Cambridge University Press:  05 March 2014

Henrique C. M. Andrade (J. P. Morgan)
Buğra Gedik (Bilkent University, Ankara)
Deepak S. Turaga (IBM Thomas J. Watson Research Center, New York)

Summary

Overview

After discussing how SPAs are developed, visualized, and debugged in Chapters 3 to 5, this chapter will focus primarily on describing the architectural underpinnings of a conceptual SPS application runtime environment.

Shifting the discussion to the middleware that supports stream processing provides an opportunity to discuss numerous aspects that affect how an application runs. These aspects include support for resource management, distributed computing, security, and fault tolerance, as well as system management services and system-provided application services for logging and monitoring, built-in visualization, debugging, and state introspection.

This chapter is organized as follows. Section 7.2 presents the conceptual building blocks associated with a stream processing runtime: the computational environment and the entities that use it. The second half of this chapter, Sections 7.3 and 7.4, focuses on the multiple services that make up an SPS middleware and describes how they are integrated to provide a seamless execution environment to SPAs.

Architectural building blocks

Middleware is the term used to define a software layer that provides services to applications beyond what is commonly made available by an Operating System (OS), including user and resource management, scheduling, and I/O services, among others. In general, middleware software provides an improved environment for applications to execute. This environment, referred to as the application runtime environment, further isolates the application from the underlying computational resources.

Therefore, the fundamental role of any middleware is to supply additional infrastructure and services.
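
The isolation described above can be illustrated with a minimal sketch. It is not taken from the chapter: the class RuntimeEnvironment, its submit method, and the core-counting policy are all hypothetical names and simplifications chosen for illustration. The point it tries to show is only that the application asks the runtime for resources and services (here, admission control and logging) instead of touching the underlying resources directly.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A logging service handed to the application by the runtime (hypothetical).
LogFn = Callable[[str], None]


@dataclass
class RuntimeEnvironment:
    """Hypothetical middleware layer: owns the resources, offers services."""
    total_cores: int
    allocations: Dict[str, int] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)

    def free_cores(self) -> int:
        """Cores not currently assigned to any running application."""
        return self.total_cores - sum(self.allocations.values())

    def submit(self, name: str, cores: int, body: Callable[[LogFn], None]) -> bool:
        """Run an application if enough cores are free; return whether it ran."""
        if cores > self.free_cores():
            self.log.append(
                f"rejected {name}: needs {cores}, free {self.free_cores()}"
            )
            return False
        self.allocations[name] = cores
        self.log.append(f"started {name} on {cores} core(s)")
        try:
            # The application only sees the service it was given, not the host.
            body(lambda msg: self.log.append(f"[{name}] {msg}"))
        finally:
            # Resources are reclaimed by the runtime even if the body fails.
            del self.allocations[name]
            self.log.append(f"stopped {name}")
        return True


if __name__ == "__main__":
    runtime = RuntimeEnvironment(total_cores=4)
    runtime.submit("filter", 2, lambda log: log("processed 10 tuples"))
    runtime.submit("big", 8, lambda log: None)
    for line in runtime.log:
        print(line)
```

A real SPS runtime would of course schedule applications across hosts and run them concurrently; the sketch runs each body synchronously to keep the resource bookkeeping easy to follow.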

Type: Chapter
Information: Fundamentals of Stream Processing: Application Design, Systems, and Analytics, pp. 203–217
Publisher: Cambridge University Press
Print publication year: 2014
