
3 - Application development – the basics

from Part II - Application development

Published online by Cambridge University Press:  05 March 2014

Henrique C. M. Andrade (J. P. Morgan)
Buğra Gedik (Bilkent University, Ankara)
Deepak S. Turaga (IBM Thomas J. Watson Research Center, New York)

Summary

Overview

In this chapter we introduce the fundamentals of application development using the stream processing paradigm. It is organized as follows. In Section 3.2, we provide an overview of the main characteristics of Stream Processing Applications (SPAs), focusing on how they affect the development process. In Section 3.3, we review the existing stream processing languages, deepening the discussion started in Chapter 2. In Section 3.4, we provide an introduction to the SPL language, which will allow readers to follow the examples provided in the rest of the book. In Section 3.5, we provide a list of commonly used stream processing operators and discuss their roles by looking into their implementation as provided by the SPL standard toolkit.

From this point onwards, we will pair the technical exposition with actual code examples. These examples are provided using the Streams platform and the SPL language. We encourage the readers to try them out and to modify them as they go along.

Characteristics of SPAs

We first look at the characteristics of SPAs and discuss their impact on application development. These characteristics not only influence their design and implementation, but also permeate the design of the programming model and of the application runtime features provided by SPSs.

Data-in-motion analytics

SPAs implement data-in-motion analytics. These analytics ingest data from live, streaming sources and perform their processing as the data flows through the application.
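
To make the idea of data-in-motion processing concrete, here is a minimal sketch in Python rather than in SPL, the language the chapter itself uses. It assumes a hypothetical source, `sensor_source`, that stands in for a live feed, and a hypothetical operator, `running_mean`, that updates its result each time a value arrives instead of waiting for a complete dataset. Both names and the sample values are illustrative only and do not come from the book.

```python
from typing import Iterable, Iterator


def sensor_source() -> Iterator[float]:
    """Hypothetical stand-in for a live, streaming source.

    A real source would block on a socket or message queue; here we
    simply yield a few fixed readings so the sketch is runnable.
    """
    for reading in [10.0, 20.0, 30.0]:
        yield reading


def running_mean(readings: Iterable[float]) -> Iterator[float]:
    """Emit the mean of all readings seen so far, one output per input.

    Only a count and a running total are kept, so the computation
    proceeds as the data flows through and never stores the stream.
    """
    count = 0
    total = 0.0
    for value in readings:
        count += 1
        total += value
        yield total / count


# Each result is available as soon as the corresponding reading arrives.
results = list(running_mean(sensor_source()))
print(results)
```

The generator pipeline mirrors, in a very loose way, a source operator feeding an analytic operator: each stage consumes one item, updates a small amount of state, and immediately emits a result downstream.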

Type: Chapter
Information: Fundamentals of Stream Processing: Application Design, Systems, and Analytics, pp. 77-105
Publisher: Cambridge University Press
Print publication year: 2014


