Proceedings of the International Astronomical Union: Volume 12 - Astroinformatics

The changing landscape of astrostatistics and astroinformatics
Eric D. Feigelson
Published online by Cambridge University Press: 30 May 2017, pp. 3-9
The history and current status of the cross-disciplinary fields of astrostatistics and astroinformatics are reviewed. Astronomers need a wide range of statistical methods for both data reduction and science analysis. With the proliferation of high-throughput telescopes, efficient large-scale computational methods are also becoming essential. However, astronomers receive only weak training in these fields during their formal education. Interest in the fields is rapidly growing, with conferences organized by scholarly societies, textbooks and tutorial workshops, and research studies pushing the frontiers of methodology. R, the premier language of statistical computing, can provide an important software environment for the incorporation of advanced statistical and computational methodology into the astronomical community.

Supercomputer simulations of structure formation in the Universe
Tomoaki Ishiyama
Published online by Cambridge University Press: 30 May 2017, pp. 10-16
We describe the implementation and performance results of our massively parallel MPI/OpenMP hybrid TreePM code for large-scale cosmological N-body simulations. For domain decomposition, a recursive multi-section algorithm is used, and the sizes of the domains are automatically set so that the total calculation time is the same for all processes. We developed a highly tuned gravity kernel for short-range forces and a novel communication algorithm for long-range forces. For a two-trillion-particle benchmark simulation, the average performance on the full system of the K computer (82,944 nodes, 663,552 cores in total) is 5.8 Pflops, which corresponds to 55% of the peak speed.

From Sky to Earth: Data Science Methodology Transfer
Ashish A. Mahabal, Daniel Crichton, S. G. Djorgovski, Emily Law, John S. Hughes
Published online by Cambridge University Press: 30 May 2017, pp. 17-26
We describe here the parallels between astronomy and earth science datasets, their analyses, and the opportunities for methodology transfer from astroinformatics to geoinformatics. Using the example of hydrology, we emphasize how metadata and ontologies are crucial in such an undertaking. Using the infrastructure being designed for EarthCube, the Virtual Observatory for the earth sciences, we discuss essential steps for better transfer of tools and techniques in the future, e.g. domain adaptation. Finally, we point out that it is never a one-way process and that there is much for astroinformatics to learn from geoinformatics as well.

What will the future of cloud-based astronomical data processing look like?
Andrew W. Green, Elizabeth Mannering, Lloyd Harischandra, Minh Vuong, Simon O’Toole, Katrina Sealey, Andrew M. Hopkins
Published online by Cambridge University Press: 30 May 2017, pp. 27-31
Astronomy is rapidly approaching an impasse: very large datasets require remote or cloud-based parallel processing, yet many astronomers still try to download the data and develop serial code locally. Astronomers understand the need for change, but the hurdles remain high. We are developing a data archive designed from the ground up to simplify and encourage cloud-based parallel processing. While the volume of data we host remains modest by some standards, it is still large enough that download and processing times are measured in days and even weeks. We plan to implement a Python-based, notebook-like interface that automatically parallelises execution. Our goal is to provide an interface sufficiently familiar and user-friendly that it encourages astronomers to run their analysis on our system in the cloud: astroinformatics as a service. We describe how our system addresses the approaching impasse in astronomy using the SAMI Galaxy Survey as an example.

Multi-wavelength studies of the statistical properties of active galaxies using Big Data
A. M. Mickaelian, H. V. Abrahamyan, M. V. Gyulzadyan, G. A. Mikayelyan, G. M. Paronyan
Published online by Cambridge University Press: 30 May 2017, pp. 32-38
Statistical studies of active galaxies (both AGN and Starburst) using large multi-wavelength data are presented, including new studies of Markarian galaxies, a large sample of IR galaxies, variable radio sources, and a large homogeneous sample of X-ray selected AGN. The Markarian survey (the First Byurakan Survey) was digitized and the DFBS database was created; it is the biggest spectroscopic database by the number of objects involved (~20 million). This database provides both 2D images and 1D spectra. We have carried out a number of projects aimed at revealing active galaxies among optical, X-ray, IR and radio sources and studying them at multiple wavelengths. Thousands of X-ray sources were identified from ROSAT, including many AGN (52% of all identified sources). IRAS PSC/FSC sources were studied using accurate positions from WISE, and a large extragalactic sample was created for a further search for AGN. The fraction of active galaxies among IR-selected galaxies was estimated as 24%. Variable radio sources at 1.4 GHz were revealed by cross-correlation of the NVSS and FIRST catalogues using the method we introduced for optical variability. Radio-X-ray sources were revealed from NVSS and ROSAT for the detection of new active galaxies. We describe Big Data in astronomy, which provides new possibilities for statistical research on active galaxies and other objects.

Learn from every mistake! Hierarchical information combination in astronomy
Maria Süveges, Sotiria Fotopoulou, Jean Coupon, Stéphane Paltani, Laurent Eyer, Lorenzo Rimoldini
Published online by Cambridge University Press: 30 May 2017, pp. 39-45
Throughout the processing and analysis of survey data, a ubiquitous issue nowadays is that we are spoilt for choice when we need to select a methodology for some of its steps. The alternative methods usually fail and excel in different data regions, and have various advantages and drawbacks, so a combination that unites the strengths of all while suppressing the weaknesses is desirable. We propose to use a two-level hierarchy of learners. Its first level consists of training and applying the possible base methods on the first part of a known set. At the second level, we feed the output probability distributions from all base methods to a second learner trained on the remaining known objects. Using classification of variable stars and photometric redshift estimation as examples, we show that the hierarchical combination achieves a general improvement over averaging-type combination methods, corrects systematics present in all base methods, and is easy to train and apply. It is thus a promising tool in the astronomical “Big Data” era.

Exploring the Parameter Space of Compact Binary Population Synthesis
Jim W. Barrett, Ilya Mandel, Coenraad J. Neijssel, Simon Stevenson, Alejandro Vigna-Gómez
Published online by Cambridge University Press: 30 May 2017, pp. 46-50
As we enter the era of gravitational wave astronomy, we are beginning to collect observations which will enable us to explore aspects of astrophysics of massive stellar binaries which were previously beyond reach. In this paper we describe COMPAS (Compact Object Mergers: Population Astrophysics and Statistics), a new platform to allow us to deepen our understanding of isolated binary evolution and the formation of gravitational-wave sources. We describe the computational challenges associated with their exploration, and present preliminary results on overcoming them using Gaussian process regression as a simulation emulation technique.

Effects of mergers on non-parametric morphologies
Lucas A. Bignone, Patricia B. Tissera, Emanuel Sillero, Susana E. Pedrosa, Leonardo J. Pellizza, Diego G. Lambas
Published online by Cambridge University Press: 30 May 2017, pp. 51-54
We study the effects of mergers on non-parametric morphologies of galaxies. We compute the Gini index, M20, asymmetry and concentration statistics for z = 0 galaxies in the Illustris simulation and compare the non-parametric morphologies of major mergers, minor mergers, close pairs, distant pairs and unperturbed galaxies. We determine the effectiveness of observational methods based on these statistics for selecting merging galaxies.
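As an illustration of one of the statistics named in this abstract, the following is a minimal sketch of a Gini index computed over pixel fluxes. It assumes the rank-weighted form commonly used in galaxy morphology work, G = Σ (2i − n − 1)|X_i| / (|X̄| n (n − 1)) over fluxes sorted in increasing order; the abstract does not state which definition the authors adopt, and the function name `gini` is hypothetical.

```python
def gini(fluxes):
    """Gini index of a list of pixel fluxes.

    Returns 0 when the flux is spread evenly over all pixels and 1 when
    all of the flux sits in a single pixel. This is an assumed, commonly
    used definition, not necessarily the one used in the paper.
    """
    values = sorted(abs(f) for f in fluxes)  # rank pixels by |flux|
    n = len(values)
    if n < 2:
        raise ValueError("need at least two pixels")
    mean = sum(values) / n
    if mean == 0:
        raise ValueError("total flux is zero")
    # Rank-weighted sum: faint pixels get negative weights, bright ones positive.
    weighted = sum((2 * i - n - 1) * v for i, v in enumerate(values, start=1))
    return weighted / (mean * n * (n - 1))


if __name__ == "__main__":
    print(gini([1.0, 1.0, 1.0, 1.0]))  # evenly spread flux
    print(gini([0.0, 0.0, 0.0, 5.0]))  # flux concentrated in one pixel
```

In practice the fluxes would come from the pixels assigned to a galaxy by a segmentation map, which is outside the scope of this sketch.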

ACS/WFC Pixel History, Bringing the Pixels Back to Science
David Borncamp, Norman Grogin, Matthew Bourque, Sara Ogaz
Published online by Cambridge University Press: 30 May 2017, pp. 55-58
Excess thermal energy within a Charge-Coupled Device (CCD) results in excess electrical current that is trapped within the lattice structure of the electronics. This excess signal from the CCD itself can be present through multiple exposures, which will have an adverse effect on its science performance unless it is corrected for. The traditional way to correct for this extra charge is to take occasional long-exposure images with the camera shutter closed. These images, generally referred to as “dark” images, allow for the measurement of thermal-electron contamination at each pixel of the CCD. This so-called “dark current” can then be subtracted from the science images by re-scaling to the science exposure times. Pixels that have signal above a certain value are traditionally marked as “hot” and flagged in the data quality array. Many users will discard these pixels as being bad. However, these pixels may not be bad in the sense that they cannot be reliably dark-subtracted; if these pixels are shown to be stable over a given anneal period, the charge can be properly subtracted and the extra Poisson noise from this dark current can be taken into account and put into the error arrays.
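The per-pixel procedure described in this abstract (rescale the dark current to the science exposure time, subtract it, and carry its Poisson noise into the error) can be sketched as follows. This is a simplified illustration under stated assumptions, not the ACS/WFC pipeline: the function names, the stability criterion and its 10% threshold are all hypothetical, and the uncertainty of the dark-rate measurement itself is ignored.

```python
import math
import statistics


def dark_subtract(science_counts, dark_rate, exposure_time, read_noise=0.0):
    """Dark-subtract one pixel and return (corrected counts, 1-sigma error).

    science_counts: electrons measured in the science exposure
    dark_rate:      dark current in electrons per second, from dark frames
    exposure_time:  science exposure time in seconds
    read_noise:     read noise in electrons (assumed 0 if not given)
    """
    dark_counts = dark_rate * exposure_time      # rescale to science exposure
    corrected = science_counts - dark_counts
    # Poisson variance of source plus dark electrons, plus read noise.
    variance = max(corrected, 0.0) + dark_counts + read_noise ** 2
    return corrected, math.sqrt(variance)


def is_stable(dark_rates, max_fractional_scatter=0.1):
    """Hypothetical stability test over the dark frames of one anneal period:
    the pixel counts as stable if the scatter of its dark rate is a small
    fraction of its mean dark rate."""
    mean = statistics.mean(dark_rates)
    if mean <= 0:
        return False
    return statistics.pstdev(dark_rates) / mean <= max_fractional_scatter


if __name__ == "__main__":
    rates = [0.50, 0.51, 0.49]                   # e-/s from three dark frames
    if is_stable(rates):
        print(dark_subtract(1000.0, statistics.mean(rates), 400.0))
```

A hot pixel that passes such a test keeps a usable value with a larger error bar, rather than being discarded outright.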

Investigation of Spatially Unresolved Magnetic Field Outside Sunspots Using Hinode/SOT Observations
Olga Botygina, Mykola Gordovskyy, Vsevolod Lozitsky
Published online by Cambridge University Press: 30 May 2017, pp. 59-62
The structure of photospheric magnetic fields outside sunspots is investigated in three active regions using Hinode/Solar Optical Telescope (SOT) observations. We analyze the Zeeman effect in the Fe I 6301.5 and Fe I 6302.5 lines and determine the observed magnetic field value Beff for each of them. We find that the line ratio Beff(6301)/Beff(6302) is close to 1.3 in the range Beff < 0.2 kG, and close to 1.0 for 0.8 kG < Beff < 1.2 kG. We find that the observed magnetic field is formed by flux tubes with magnetic field strengths of 1.3 − 2.3 kG, even in places with weak observed magnetic field fluxes. We also estimate the diameters of the smallest magnetic flux tubes to be 15 − 20 km.

An Effective Method for Modeling Two-dimensional Sky Background of LAMOST
Hasitieer Haerken, Fuqing Duan, Jiannan Zhang, Ping Guo
Published online by Cambridge University Press: 30 May 2017, pp. 63-66
Each CCD of LAMOST accommodates 250 spectra, of which about 40 are used to observe the sky background during real observations. The problem we solve is how to estimate the unknown sky background hidden in the 210 observed celestial spectra using the 40 known sky spectra. In order to model the sky background, a pre-observation is usually performed with all fibers observing the sky background. We use the 250 observed skylight spectra as training data, where those observed by the 40 fibers are considered as a base vector set. The Locality-constrained Linear Coding (LLC) technique is used to represent the skylight spectra observed by the 210 fibers with the base vector set. We also segment each spectrum into small parts and establish a local sky background model for each part. Experimental results validate the proposed method and show that the local model is better than the global model.

Application of Compressive Sensing to Gravitational Microlensing Experiments
Asmita Korde-Patel, Richard K. Barry, Tinoosh Mohsenin
Published online by Cambridge University Press: 30 May 2017, pp. 67-70
Compressive Sensing is an emerging technology for simultaneous data acquisition and compression. It enables a significant reduction in data bandwidth and transmission power, and hence can greatly benefit space-flight instruments. We apply this process to detect exoplanets via gravitational microlensing. We experiment with various impact parameters that describe microlensing curves to determine the effectiveness of Compressive Sensing and the uncertainty it introduces. Finally, we describe implications for space-flight missions.

The Euclid Data Processing Challenges
Pierre Dubath, Nikolaos Apostolakos, Andrea Bonchi, Andrey Belikov, Massimo Brescia, Stefano Cavuoti, Peter Capak, Jean Coupon, Christophe Dabin, Hubert Degaudenzi, Shantanu Desai, Florian Dubath, Adriano Fontana, Sotiria Fotopoulou, Marco Frailis, Audrey Galametz, John Hoar, Mark Holliman, Ben Hoyle, Patrick Hudelot, Olivier Ilbert, Martin Kuemmel, Martin Melchior, Yannick Mellier, Joe Mohr, Nicolas Morisset, Stéphane Paltani, Roser Pello, Stefano Pilo, Gianluca Polenta, Maurice Poncet, Roberto Saglia, Mara Salvato, Marc Sauvage, Marc Schefer, Santiago Serrano, Marco Soldati, Andrea Tramacere, Rees Williams, Andrea Zacchei
Published online by Cambridge University Press: 30 May 2017, pp. 73-82
Euclid is a Europe-led cosmology space mission dedicated to a visible and near infrared survey of the entire extra-galactic sky. Its purpose is to deepen our knowledge of the dark content of our Universe. After an overview of the Euclid mission and science, this contribution describes how the community is getting organized to face the data analysis challenges, both in software development and in operational data processing matters. It ends with a more specific account of some of the main contributions of the Swiss Science Data Center (SDC-CH).

The European perspective for LSST
Emmanuel Gangler
Published online by Cambridge University Press: 30 May 2017, pp. 83-92
LSST is a next-generation telescope that will produce an unprecedented data flow. The project goal is to deliver data products such as images and catalogs, thus enabling scientific analysis for a wide community of users. As a large-scale survey, LSST will provide data complementary to those of other facilities in a wide range of scientific domains, including data from ESA or ESO. European countries have invested in LSST since 2007, in the construction of the camera as well as in the computing effort. The latter will be instrumental in designing the next step: how to distribute LSST data to Europe. Astroinformatics challenges for LSST indeed include not only the analysis of LSST big data, but also the practical efficiency of the data access.

Everything we’d like to do with LSST data, but we don’t know (yet) how
Željko Ivezić, Andrew J. Connolly, Mario Jurić
Published online by Cambridge University Press: 30 May 2017, pp. 93-102
The Large Synoptic Survey Telescope (LSST), the next-generation optical imaging survey sited at Cerro Pachon in Chile, will provide an unprecedented database of astronomical measurements. The LSST design, with an 8.4 m (6.7 m effective) primary mirror, a 9.6 sq. deg. field of view, and a 3.2 Gigapixel camera, will allow about 10,000 sq. deg. of sky to be covered twice per night, every three to four nights on average, with a typical 5-sigma depth for point sources of r=24.5 (AB). With over 800 observations in the ugrizy bands over a 10-year period, these data will enable a deep stack reaching r=27.5 (about 5 magnitudes deeper than SDSS) and faint time-domain astronomy. The measured properties of newly discovered and known astrometric and photometric transients will be publicly reported within 60 sec after observation. The vast database of about 30 trillion observations of 40 billion objects will be mined for the unexpected and used for precision experiments in astrophysics. In addition to a brief introduction to LSST, we discuss a number of astro-statistical challenges that need to be overcome to extract maximum information and science results from the LSST dataset.

Astroinformatics Challenges from Next-generation Radio Continuum Surveys
Ray P. Norris
Published online by Cambridge University Press: 30 May 2017, pp. 103-113
The tens of millions of radio sources to be detected with next-generation surveys pose new challenges, quite apart from the obvious ones of processing speed and data volumes. For example, existing algorithms are inadequate for source extraction or for cross-matching radio and optical/IR sources, and a new generation of algorithms is needed, using machine learning and other techniques. The large numbers of sources enable new ways of testing astrophysical models, using a variety of “large-n astronomy” techniques such as statistical redshifts. Furthermore, while unexpected discoveries account for some of the most significant advances in astronomy, it will be difficult to discover the unexpected in large volumes of data unless specific software is developed to mine the data for it.

The Hubble Source Catalog
Stephen H. Lubow
Published online by Cambridge University Press: 30 May 2017, pp. 114-117
The Hubble Source Catalog (HSC) is designed to enhance the science obtained from the Hubble Space Telescope by combining the tens of thousands of visit-based source lists in the Hubble Legacy Archive (HLA) across filters and detectors into a single master catalog. The catalog contains data from the major Hubble imaging instruments: Wide Field Planetary Camera 2 (WFPC2), Advanced Camera for Surveys (ACS), and Wide Field Camera 3 (WFC3). It is based on cross-matching and astrometry algorithms developed by Budavari & Lubow (2012). We recently released Version 2, which is three times the size of Version 1 and includes some new features. The catalog can be accessed through a variety of interfaces (see http://archive.stsci.edu/hst/hsc/). The HSC provides descriptions of astronomical objects involving multiple wavelengths and epochs. High relative positional accuracy of objects is achieved across the Hubble images, often with sub-pixel precision of a few milliarcseconds.

Pan-STARRS1 as pilot-survey for panoptic time-domain science
Nina Hernitschek, Hans-Walter Rix, Branimir Sesar, Edward F. Schlafly
Published online by Cambridge University Press: 30 May 2017, pp. 118-121
To examine the possibilities and challenges of doing science with multi-band, non-simultaneous data from upcoming surveys like LSST, the Pan-STARRS1 (PS1) 3π survey can be used as a pilot survey. This is especially important for exploring the possibilities in the detection and classification of variable sources within the first years of LSST’s 10-year baseline. We have explored the capabilities of PS1 3π for carrying out time-domain science in a variety of applications. We have used structure function fitting as well as period fitting to search for and classify high-latitude as well as low-latitude variable sources, in particular RR Lyrae, Cepheids and QSOs.

AlertSim - Serbian Contribution to the LSST
Darko Jevremović, Veljko Vujčić, Vladimir A. Srećković, Jovan Aleksić, Sanja Erkapić, Nenad Milovanović
Published online by Cambridge University Press: 30 May 2017, pp. 122-125
We present a simulator of alerts for the Large Synoptic Survey Telescope (LSST) developed by the Belgrade group. This simulator will be used in testing the functionality of external event brokers/Complex Event Processing (CEP) engines. It is based on the current LSST Simulation framework and allows for different classes of objects to be ‘alerted’. A Web service based on our simulator has been prototyped and can be accessed by developers of brokers/CEP engines.

Prototype-based Models for the Supervised Learning of Classification Schemes
Michael Biehl, Barbara Hammer, Thomas Villmann
Published online by Cambridge University Press: 30 May 2017, pp. 129-138
An introduction is given to the use of prototype-based models in supervised machine learning. The main concept of the framework is to represent previously observed data in terms of so-called prototypes, which reflect typical properties of the data. Together with a suitable, discriminative distance or dissimilarity measure, prototypes can be used for the classification of complex, possibly high-dimensional data. We illustrate the framework in terms of the popular Learning Vector Quantization (LVQ). Most frequently, standard Euclidean distance is employed as a distance measure. We discuss how LVQ can be equipped with more general dissimilarities. Moreover, we introduce relevance learning as a tool for the data-driven optimization of parameterized distances.
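The prototype idea in this abstract can be illustrated with a minimal sketch of the basic LVQ1 update rule using squared Euclidean distance: the prototype nearest to a training sample is pulled toward it if their labels agree and pushed away otherwise. The toy data, initial prototypes, learning rate and function names below are assumptions made for illustration; the generalized dissimilarities and relevance learning discussed in the paper are not covered.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(prototypes, x):
    """Index of the prototype closest to x."""
    return min(range(len(prototypes)),
               key=lambda k: squared_distance(prototypes[k][0], x))


def train_lvq1(samples, prototypes, learning_rate=0.1, epochs=20):
    """Basic LVQ1 training.

    samples:    list of (vector, label) pairs
    prototypes: list of [vector, label] lists, updated in place
    """
    for _ in range(epochs):
        for x, label in samples:
            k = nearest(prototypes, x)
            w, w_label = prototypes[k]
            # Attract the winner if its label is correct, repel it otherwise.
            sign = 1.0 if w_label == label else -1.0
            prototypes[k][0] = [wi + sign * learning_rate * (xi - wi)
                                for wi, xi in zip(w, x)]
    return prototypes


def classify(prototypes, x):
    """Assign x the label of its nearest prototype."""
    return prototypes[nearest(prototypes, x)][1]


# Toy two-class data set (hypothetical values).
samples = [([0.0, 0.1], "A"), ([0.2, 0.0], "A"), ([0.1, 0.2], "A"),
           ([1.0, 0.9], "B"), ([0.8, 1.0], "B"), ([0.9, 0.8], "B")]
prototypes = [[[0.4, 0.4], "A"], [[0.6, 0.6], "B"]]
train_lvq1(samples, prototypes)

if __name__ == "__main__":
    print(classify(prototypes, [0.1, 0.1]), classify(prototypes, [0.9, 0.9]))
```

After training, each prototype sits near the bulk of its own class, so classification reduces to a nearest-prototype lookup.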
