15 - Scientific workflows for the geosciences: An emerging approach to building integrated data analysis systems

from Part V - Web services and scientific workflows

Published online by Cambridge University Press:  25 October 2011

Ilkay Altintas, University of California, San Diego
Daniel Crawl, University of California, San Diego
Christopher J. Crosby, University of California, San Diego
Peter Cornillon, University of Rhode Island
G. Randy Keller, University of Oklahoma
Chaitanya Baru, University of California, San Diego

Summary

Scientific method and the influence of technology

As data acquisition technologies have grown in number and sophistication, the amount of raw data acquired has increased vastly over the last couple of decades (Berman, 2008). This explosion of scientific data, the growth in scientific knowledge, and the rising number of studies that draw on knowledge from multiple scientific disciplines all amplify the complexity of scientific problems. To answer these “grand challenge” scientific questions, scientists use computational methods that are evolving almost daily. The basic scientific method, however, remains the same for the individual scientist. Scientists still start with a set of questions, then observe phenomena, gather data, develop hypotheses, perform tests, negate or modify hypotheses, repeat the process with various data, and finally arrive at a new set of questions, theories, or laws (http://en.wikipedia.org/wiki/Scientific_method). What has changed is the way the method is carried out: advances in computer science and technology are continuously transforming it. The simplest examples of this transformation are the use of personal computers to record scientific activity and the way scientists publish and search for publications online. More advanced technologies within the scientific process include sensor-based observatories that collect data in real time, supercomputers that run simulations, domain-specific data archives that give access to heterogeneous data, and online interfaces for distributing computational experiments and monitoring resources.

Type: Chapter
Information: Geoinformatics: Cyberinfrastructure for the Solid Earth Sciences, pp. 237–250
Publisher: Cambridge University Press
Print publication year: 2011

References

Abramson, D., Bethwaite, B., Enticott, C., Garic, S., and Peachey, T. (2009). Parameter space exploration using scientific workflows. ICCS 2009, Baton Rouge, LA, USA, May 2009.
Abramson, D., Enticott, C., and Altintas, I. (2008). Nimrod/K: Towards massively parallel dynamic Grid workflows. In Proceedings of Supercomputing 2008 (SC 2008): p. 24.
Altintas, I., Barney, O., and Jaeger-Frank, E. (2006). Provenance collection support in the Kepler scientific workflow system. In Proceedings of International Provenance and Annotation Workshop (IPAW2006), pp. 118–132.
Altintas, I., Jaeger, E., Lin, K., Ludaescher, B., and Memon, A. (2004). A web service composition and deployment framework for scientific workflows. In 2nd International Conference on Web Services (ICWS), San Diego, California, July 2004.
Altintas, I., Lin, A. W., Chen, J., et al. (2010). CAMERA 2.0: A Data-Centric Metagenomics Community Infrastructure Driven by Scientific Workflows. IEEE 2010 Fourth International Workshop on Scientific Workflows, Miami, FL, USA.
Barker, A. and Hemert, J. (2008). Information Scientific Workflow: A Survey and Research Directions. LNCS 4967. Berlin: Springer, pp. 746–753.
Barseghian, D., Altintas, I., and Jones, M. B. (2008). Accessing and using sensor data within the Kepler scientific workflow system. In Proceedings of Environmental Information Management Conference, ed. Gries, C. and Jones, M. B., pp. 26–32.
Barseghian, D., Altintas, I., Jones, M. B., et al. (2010). Workflows and extensions to the Kepler scientific workflow system to support environmental sensor data access and analysis. Ecological Informatics, 5: 42–50.
Berman, F. (2008). Got data? A guide to data preservation in the information age. Communications of the ACM, 51(12): 50–56.
Carter, W. E., Shrestha, R., and Slatton, K. C. (2007). Geodetic laser scanning. Physics Today, 60(12): 41–47.
Carter, W. E., Shrestha, R. L., Tuell, G., Bloomquist, D., and Sartori, M. (2001). Airborne laser swath mapping shines new light on Earth's topography. Eos Transactions AGU, 82: 549–550, 555.
Crawl, D. and Altintas, I. (2008). A provenance-based fault tolerance mechanism for scientific workflows. In Proceedings of International Provenance and Annotation Workshop (IPAW 2008), Salt Lake City, UT, USA, pp. 152–159.
Cuadrado, D. L. (2008). Automated distribution simulation in Ptolemy II. Ph.D. thesis, Aalborg University.
Deelman, E., Blythe, J., Gil, Y., et al. (2004). Pegasus: Mapping scientific workflows onto the Grid. In European Across Grids Conference, pp. 11–20.
Freire, C. T., Silva, S. P., Callahan, E., et al. (2006). Managing Rapidly-Evolving Scientific Workflows. In International Provenance and Annotation Workshop (IPAW), LNCS 4145. Berlin: Springer, pp. 10–18.
Fricke, T. T., Ludaescher, B., Altintas, I., et al. (2004). Integration of Kepler with ROADNet: Visual dataflow design with real-time geophysical data. AGU Fall Meeting, San Francisco, CA, USA, December 13–17, 2004.
Goderis, A., Brooks, C., Altintas, I., Lee, E. A., and Goble, C. A. (2007). Composing different models of computation in Kepler and Ptolemy II. International Conference on Computational Science, 3: 182–190.
Jaeger-Frank, E., Crosby, C. J., Memon, A., et al. (2006). A Three Tier Architecture for LiDAR Interpolation and Analysis, LNCS 3993. Berlin: Springer, pp. 920–927.
Kim, H., Arrowsmith, J. R., Crosby, C. J., et al. (2006). An efficient implementation of a local binning algorithm for digital elevation model generation of LiDAR/ALSM dataset. Eos Transactions AGU, 87(52), Fall Meet. Suppl., Abstract G53C-0921.
Leinfelder, B., Altintas, I., Barseghian, D., et al. (2009). An integrated approach to managing workflow runs and generating reports in Kepler. In Eighth Biennial Ptolemy Miniconference, April 2009.
Ludäscher, B., Altintas, I., Berkley, C., et al. (2006). Scientific workflow management and the Kepler system. Concurrency and Computation: Practice & Experience, 18(10): 1039–1065.
Ludäscher, B., Podhorszki, N., Altintas, I., Bowers, S., and McPhillips, T. M. (2008). From computation models to models of provenance: The RWS approach. Concurrency and Computation: Practice & Experience, 20(5): 507–518.
Mouallem, P., Crawl, D., Altintas, I., Vouk, M., and Yildiz, U. (2010). A Fault-Tolerance Architecture for Kepler-Based Distributed Scientific Workflows. In SSDBM 2010, ed. Gertz, M. and Ludascher, B. LNCS 6187. Berlin: Springer, pp. 452–460.
Nandigam, V., Baru, C., and Crosby, C. J. (2010). Database design for high-resolution LIDAR topography data. In SSDBM 2010, ed. Gertz, M. and Ludascher, B. LNCS 6187. Berlin: Springer, pp. 151–159.
OPeNDAP: Open-source Project for a Network Data Access Protocol, http://opendap.org/, 2010.
Pennington, D. D., Higgins, D., Peterson, A. T., et al. (2007). Ecological niche modeling using the Kepler workflow system. In Workflows for e-Science: Scientific Workflows for Grids, ed. Taylor, I. J., Deelman, E., Gannon, D. B., and Shields, M. New York: Springer, pp. 91–108.
Podhorszki, N., Klasky, S., Liu, Q., et al. (2009). Plasma fusion code coupling using scalable I/O services and scientific workflows. In Proceedings of the 4th Workshop on Workflows in Support of Large-Scale Science (WORKS09) at Supercomputing 2009 (SC2009) Conference. Portland, OR: ACM.
Podhorszki, N., Ludaescher, B., and Klasky, S. (2007). Workflow automation for processing plasma fusion simulation data. In 2nd Workshop on Workflows in Support of Large-Scale Science (WORKS07) at the 16th International Symposium on High-Performance Distributed Computing (HPDC-16 2007), Monterey, CA, USA, 2007.
Prentice, C. S., Crosby, C. J., Whitehill, C. S., et al. (2009). Illuminating northern California's active faults. Eos Transactions AGU, 90(7): 55–56.
Sallenger, A. H., Krabill, W., Swift, R., et al. (2003). Evaluation of airborne scanning lidar for coastal change applications. Journal of Coastal Research, 19(1): 125–133.
Smanchat, S., Indrawan, M., Ling, S., Enticott, C., and Abramson, D. (2009). Scheduling multiple parameter sweep workflow instances on the Grid. IEEE e-Science 2009.
Sudholt, W., Altintas, I., and Baldridge, K. (2006). Scientific workflow infrastructure for computational chemistry on the Grid. International Conference on Computational Science, 3: 69–76.
Taylor, I. (2006). Triana generations. In 2nd International Conference on e-Science and Grid Technologies (e-Science). New York: IEEE Computer Society, p. 143.
Turi, D., Missier, P., Goble, C., Roure, D. D., and Oinn, T. (2007). Taverna workflows: Syntax and semantics. In eScience, Bangalore, India, pp. 441–448.
Vouk, M. A., Altintas, I., Barreto, R., et al. (2007). Automation of network-based scientific workflows. Proceedings of the IFIP WoCo 9 on Grid-based Problem Solving Environments: Implications for Development and Deployment of Numerical Software, IFIP WG 2.5 on Numerical Software, Prescott, AZ, USA. In Grid-Based Problem Solving Environments, ed. Gaffney, P. W. and Pool, J. C. T. IFIP, Vol. 239. Boston: Springer, pp. 35–61.
Wang, J., Altintas, I., Berkley, C., Gilbert, L., and Jones, M. B. (2008). A high-level distributed execution framework for scientific workflows. In Proceedings of Workshop SWBES08: Challenging Issues in Workflow Applications, 4th IEEE International Conference on e-Science (e-Science 2008), New York, pp. 634–639.
Wang, J., Altintas, I., Hosseini, P. R., et al. (2009a). Accelerating parameter sweep workflows by utilizing ad-hoc network computing resources: An ecological example. In Proceedings of IEEE 2009 Third International Workshop on Scientific Workflows (SWF 2009), 2009 Congress on Services (Services 2009), pp. 267–274.
Wang, J., Crawl, D., and Altintas, I. (2009b). Kepler + Hadoop: A general architecture facilitating data-intensive applications in scientific workflow systems. In Proceedings of the 4th Workshop on Workflows in Support of Large-Scale Science (WORKS09) at Supercomputing 2009 (SC2009) Conference. Portland, OR: ACM.
Wang, J., Korambath, P., Kim, S., et al. (2010). Theoretical enzyme design using the Kepler scientific workflows on the Grid. Accepted by 5th Workshop on Computational Chemistry and Its Applications (5th CCA) at International Conference on Computational Science (ICCS 2010), Amsterdam, The Netherlands, 2010.
Yu, J. and Buyya, R. (2005). A taxonomy of scientific workflow systems for Grid computing. In ACM SIGMOD Record, ed. Ludaescher, B. and Goble, C. Special Issue on Scientific Workflows, 34(3).
