
8 - InfoSphere Streams architecture

from Part III - System architecture

Published online by Cambridge University Press: 05 March 2014

Henrique C. M. Andrade, J. P. Morgan
Buğra Gedik, Bilkent University, Ankara
Deepak S. Turaga, IBM Thomas J. Watson Research Center, New York

Summary

Overview

In this chapter, we switch the focus from a conceptual description of the SPS architecture to the specifics of one such system, InfoSphere Streams. The concepts, entities, services, and interfaces described in Chapter 7 will now be made concrete by studying the engineering foundations of Streams. We start with a brief account of Streams' research roots and historical context in Section 8.2. In Section 8.3 we discuss user interaction with Streams' application runtime environment.

We then describe the principal components of Streams in Section 8.4, focusing on how these components interact to form a cohesive runtime environment that supports users and applications sharing a Streams instance. In the second half of this chapter we turn to services (Section 8.5), delving into the internals of Streams' architectural components and providing a broader discussion of their service APIs and their steady-state runtime life cycles. The description is top-down: it starts with the overall architecture of the application runtime environment and then covers the specific services that environment provides.

Finally, we discuss the facets of the architecture that are devoted to supporting application development and tuning.

Background and history

InfoSphere Streams can trace its roots to the System S middleware, which was developed between 2003 and 2009 at IBM Research [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]. The architectural foundations and programming language model in Streams are based on counterparts in System S.

Type: Chapter
In: Fundamentals of Stream Processing: Application Design, Systems, and Analytics, pp. 218–272
Publisher: Cambridge University Press
Print publication year: 2014


References

[1] Amini, L, Andrade, H, Bhagwan, R, Eskesen, F, King, R, Selo, P, et al. SPC: a distributed, scalable platform for data mining. In: Proceedings of the Workshop on Data Mining Standards, Services and Platforms (DM-SSP). Philadelphia, PA; 2006. pp. 27–37.
[2] De Pauw, W, Andrade, H. Visualizing large-scale streaming applications. Information Visualization. 2009;8(2):87–106.
[3] De Pauw, W, Andrade, H, Amini, L. StreamSight: a visualization tool for large-scale streaming applications. In: Proceedings of the Symposium on Software Visualization (SoftVis). Herrsching am Ammersee, Germany; 2008. pp. 125–134.
[4] Gedik, B, Andrade, H, Frenkiel, A, De Pauw, W, Pfeifer, M, Allen, P, et al. Debugging tools and strategies for distributed stream processing applications. Software: Practice & Experience. 2009;39(16):1347–1376.
[5] Jacques-Silva, G, Challenger, J, Degenaro, L, Giles, J, Wagle, R. Towards autonomic fault recovery in System S. In: Proceedings of the IEEE/ACM International Conference on Autonomic Computing (ICAC). Jacksonville, FL; 2007. p. 31.
[6] Jacques-Silva, G, Gedik, B, Andrade, H, Wu, KL. Language-level checkpointing support for stream processing applications. In: Proceedings of the IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). Lisbon, Portugal; 2009. pp. 145–154.
[7] Park, Y, King, R, Nathan, S, Most, W, Andrade, H. Evaluation of a high-volume, low-latency market data processing system implemented with IBM middleware. Software: Practice & Experience. 2012;42(1):37–56.
[8] Turaga, D, Andrade, H, Gedik, B, Venkatramani, C, Verscheure, O, Harris, JD, et al. Design principles for developing stream processing applications. Software: Practice & Experience. 2010;40(12):1073–1104.
[9] Wang, H, Andrade, H, Gedik, B, Wu, KL. A code generation approach for auto-vectorization in the SPADE compiler. In: Proceedings of the International Workshop on Languages and Compilers for Parallel Computing (LCPC). Newark, DE; 2009. pp. 383–390.
[10] Wolf, J, Bansal, N, Hildrum, K, Parekh, S, Rajan, D, Wagle, R, et al. SODA: an optimizing scheduler for large-scale stream-based distributed computer systems. In: Proceedings of the ACM/IFIP/USENIX International Middleware Conference (Middleware). Leuven, Belgium; 2008. pp. 306–325.
[11] Wolf, J, Khandekar, R, Hildrum, K, Parekh, S, Rajan, D, Wu, KL, et al. COLA: optimizing stream processing applications via graph partitioning. In: Proceedings of the ACM/IFIP/USENIX International Middleware Conference (Middleware). Urbana, IL; 2009. pp. 308–327.
[12] Wu, KL, Yu, PS, Gedik, B, Hildrum, KW, Aggarwal, CC, Bouillet, E, et al. Challenges and experience in prototyping a multi-modal stream analytic and monitoring application on System S. In: Proceedings of the International Conference on Very Large Databases (VLDB). Vienna, Austria; 2007. pp. 1185–1196.
[13] Zhang, X, Andrade, H, Gedik, B, King, R, Morar, J, Nathan, S, et al. Implementing a high-volume, low-latency market data processing system on commodity hardware using IBM middleware. In: Proceedings of the Workshop on High Performance Computational Finance (WHPCF). Portland, OR; 2009. article no. 7.
[14] Beynon, M, Ferreira, R, Kurc, T, Sussman, A, Saltz, J. DataCutter: middleware for filtering very large scientific datasets on archival storage systems. In: Proceedings of the IEEE Symposium on Mass Storage Systems (MSS). College Park, MD; 2000. pp. 119–134.
[15] Balakrishnan, H, Balazinska, M, Carney, D, Çetintemel, U, Cherniack, M, Convey, C, et al. Retrospective on Aurora. Very Large Databases Journal (VLDBJ). 2004;13(4):370–383.
[16] Abadi, D, Ahmad, Y, Balazinska, M, Çetintemel, U, Cherniack, M, Hwang, JH, et al. The design of the Borealis stream processing engine. In: Proceedings of the Innovative Data Systems Research Conference (CIDR). Asilomar, CA; 2005. pp. 277–289.
[17] Chandrasekaran, S, Cooper, O, Deshpande, A, Franklin, M, Hellerstein, J, Hong, W, et al. TelegraphCQ: continuous dataflow processing. In: Proceedings of the ACM International Conference on Management of Data (SIGMOD). San Diego, CA; 2003. pp. 329–338.
[18] Arasu, A, Babcock, B, Babu, S, Datar, M, Ito, K, Motwani, R, et al. STREAM: the Stanford stream data manager. IEEE Data Engineering Bulletin. 2003;26(1):665.
[19] Thies, W, Karczmarek, M, Amarasinghe, S. StreamIt: a language for streaming applications. In: Proceedings of the International Conference on Compiler Construction (CC). Grenoble, France; 2002. pp. 179–196.
[20] IBM InfoSphere Streams Version 3.0 Information Center; retrieved in June 2011. http://publib.boulder.ibm.com/infocenter/streams/v3r0/index.jsp.
[21] Christensen, E, Curbera, F, Meredith, G, Weerawarana, S. Web Services Description Language (WSDL) 1.1. World Wide Web Consortium (W3C); 2001. http://www.w3.org/TR/wsdl.
[22] Gudgin, M, Hadley, M, Mendelsohn, N, Moreau, JJ, Nielsen, HF, Karmarkar, A, et al. SOAP Version 1.2 Part 1: Messaging Framework (Second Edition). World Wide Web Consortium (W3C); 2007. http://www.w3.org/TR/soap12-part1/.
[23] Clayberg, E, Rubel, D. Eclipse Plug-ins. 3rd edn. Addison Wesley; 2008.
[24] Cormen, TH, Leiserson, CE, Rivest, RL. Introduction to Algorithms. MIT Press and McGraw Hill; 1990.
[25] Stevens, WR. UNIX Network Programming: Networking APIs, Sockets and XTI (Volume 1). 2nd edn. Prentice Hall; 1998.
[26] IBM WebSphere MQ Low Latency Messaging; retrieved in September 2010. http://www-01.ibm.com/software/integration/wmq/11m/.
[27] Tanenbaum, A, Wetherall, D. Computer Networks. 5th edn. Prentice Hall; 2011.
[28] On-Demand Marshalling and De-Marshalling of Network Messages; 2008. Patent application filed as IBM Docket YOR920090029US1 in the United States.
[29] Wagle, R, Andrade, H, Hildrum, K, Venkatramani, C, Spicer, M. Distributed middleware reliability and fault tolerance support in System S. In: Proceedings of the ACM International Conference on Distributed Event Based Systems (DEBS). New York, NY; 2011. pp. 335–346.
[30] Jacques-Silva, G, Kalbarczyk, Z, Gedik, B, Andrade, H, Wu, KL, Iyer, RK. Modeling stream processing applications for dependability evaluation. In: Proceedings of the IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). Hong Kong, China; 2011. pp. 430–441.
[31] Jacques-Silva, G, Gedik, B, Andrade, H, Wu, KL. Fault-injection based assessment of partial fault tolerance in stream processing applications. In: Proceedings of the ACM International Conference on Distributed Event Based Systems (DEBS). New York, NY; 2011. pp. 231–242.
[32] De Pauw, W, Letia, M, Gedik, B, Andrade, H, Frenkiel, A, Pfeifer, M, et al. Visual debugging for stream processing applications. In: Proceedings of the International Conference on Runtime Verification (RV). St Julians, Malta; 2010. pp. 18–35.
[33] Geisshirt, K. X/Open Single Sign-On Service (XSSO) – Pluggable Authentication. The Open Group; 1997. P702.
[34] Dierks, T, Rescorla, E. The Transport Layer Security (TLS) Protocol Version 1.2. The Internet Engineering Task Force (IETF); 2008. RFC 5246.
[35] Zeilenga, K, Lonvick, C. The Secure Shell (SSH) Protocol Architecture. The Internet Engineering Task Force (IETF); 2006. RFC 4251.
[36] Zeilenga, K. Lightweight Directory Access Protocol (LDAP): Technical Specification Road Map. The Internet Engineering Task Force (IETF); 2006. RFC 4510.
[37] Fallside, DC, Walmsley, P. XML Schema Part 0: Primer – Second Edition. World Wide Web Consortium (W3C); 2004. http://www.w3.org/TR/xmlschema-0/.
[38] Sun Microsystems. NFS: Network File System Protocol Specification. The Internet Engineering Task Force (IETF); 1989. RFC 1094.
[39] IBM General Parallel File System; retrieved in November 2011. http://www-03.ibm.com/systems/software/gpfs/.
[40] Stallman, RM, Cygnus Support. Debugging with GDB – The GNU Source-Level Debugger. Free Software Foundation; 1996.
[41] Valgrind Developers. Valgrind User Manual (Release 3.6.0 21). Apache Software Foundation; 2010.
