        Building Models for Extended Radio Sources: Implications for Epoch of Reionisation Science

We test the hypothesis that limitations in the sky model used to calibrate an interferometric radio telescope, where the model contains extended radio sources, will generate bias in the Epoch of Reionisation power spectrum. The information contained in a calibration model about the spatial and spectral structure of an extended source is incomplete because a radio telescope cannot sample all Fourier components. Application of an incomplete sky model to calibration of Epoch of Reionisation data will imprint residual error in the data, which propagates forward to the Epoch of Reionisation power spectrum. This limited information is studied in the context of current and future planned instruments and surveys at Epoch of Reionisation frequencies, such as the Murchison Widefield Array (MWA), Giant Metrewave Radio Telescope and the Square Kilometre Array (SKA1-Low). For the MWA Epoch of Reionisation experiment, we find that both the additional short baseline uv-coverage of the compact Epoch of Reionisation array, and the additional long baselines provided by TGSS and planned MWA expansions, are required to obtain sufficient information on all relevant scales. For SKA1-Low, arrays with maximum baselines of 49 km and 65 km yield comparable performance at 50 MHz and 150 MHz, while 39 km, 14 km, and 4 km arrays yield degraded performance.


A sky model for an interferometer telescope is used for calibration, source deconvolution, and source subtraction. The sky model can be obtained: (1) externally, from prior observations with existing telescopes at the same frequency; (2) internally, via measurement with the telescope itself; or (3) with a combination of these. For existing experiments, the current suite of sky surveys provides the input sky models for calibration and source subtraction. These can be augmented with further surveys from the same telescopes, or from upgraded versions of them. Complete and accurate sky models are crucial for calibrating data and for subtracting unwanted sources from the dataset. This is particularly important for Epoch of Reionisation (EoR) experiments, which aim to extract a weak signal from bright foreground contamination (Jacobs et al. 2016; Carroll et al. 2016; Line et al. 2017; Barry et al. 2016).

For the southern sky, the current suite of low-frequency sky surveys used for building a sky model includes: the 74 MHz Very Large Array Low Frequency Sky Survey redux (Lane et al. 2012), the MWA Commissioning Survey (MWACS, Hurley-Walker et al. 2014), MWA GLEAM (Wayth et al. 2015; Hurley-Walker et al. 2017), and GMRT TGSS (Intema et al. 2017). Cross-matching tools, such as the Positional Update and Matching Algorithm (PUMA, Line et al. 2017), combine these spatially and spectrally to provide the calibration sky model. Procopio et al. (2017) explored the impact of adding GMRT TGSS information to the calibration and source subtraction model for MWA EoR in the EoR1 field, finding that the additional double-source and extended source information was important for reducing bias. An extension of that work explores the direct impact of imprecisely-modelled extended sources on the MWA EoR experiment, and we undertake that work here.

Source modelling is typically performed for physical insight into the source itself (e.g., spectral structure of radio lobes to understand their energetics) and often includes multi-wavelength information to constrain physical models. Such studies typically rely on image-plane maps (e.g., Perley, Dreher, & Cowan 1984; Salter et al. 1989; Castelletti et al. 2007; Braun 2013; McKinley et al. 2015; Procopio et al. 2017, and references therein). Perley et al. (1984), for example, study the multi-frequency spatial structure of Cygnus A. The image data are represented on a discretised grid at a set of frequencies, and this forms a representation of the underlying information, some of which has been lost. This is a commensurability problem, whereby continuous data are represented discretely. Braun (2013), for example, then uses the spatial power in these frequency slices to estimate the structural properties of the source. A subset of the full data has therefore been used for the analysis.

Image reconstruction and analysis tools are designed with minimal information loss considerations, but these are imperfect. Performing analysis in the measurement plane offers the greatest ability to preserve information, but can be computationally and algorithmically challenging compared with the image plane. In this work, we consider the full information content of the data, taking this as the best possible outcome (most optimistic), while appreciating that using a segmented image-plane representation will likely degrade the result. In Section 2, we explore the impact of discretisation more formally.

The SKA aims to be the world’s foremost radio telescope, and to make major scientific advances across a range of programs. The EoR and Cosmic Dawn (CD) experiment (Koopmans et al. 2015) is envisaged to be one of the most challenging, and is one of the SKA High Priority Science Objectives. Prior to its construction and commissioning, existing and past facilities are providing the sky models for the current generation of southern hemisphere EoR experiments, such as the MWA (Tingay et al. 2013), the Precision Array for Probing the Epoch of Reionization (PAPER; Parsons et al. 2010), and the Hydrogen Epoch of Reionization Array (HERA; DeBoer et al. 2017). Compared with the SKA, current low frequency sky surveys have poorer sensitivity and spatial resolution, limiting the ability of these surveys to accurately measure the spatial and spectral structure of complex sources.

For SKA1-Low, with its long baselines and good snapshot uv-coverage, we expect that SKA1-Low itself will form the primary sky model. Therefore, the sky model will be well-sampled on modes measured by SKA, and not well-measured on scales that are not. We note that, in general, the application to calibration of an incomplete sky model formed exclusively by the same instrument being calibrated, will likely produce ‘re-substitution bias’, where the performance of a calibration procedure is over-estimated (similar to the re-use of a training set for the real dataset in machine learning).

In this paper, we discuss the impact of the current and future suite of sky surveys at EoR frequencies to enable EoR and CD science, with particular reference to the EoR and CD power spectrum. In the current era, we study the availability of sufficient information from existing and imminent sky surveys to measure the values for the parameters describing the spatial and spectral structure of extended sources. For the future, we discuss the implications for the planned SKA1-Low array configuration to execute its ambitious EoR/CD program. We use the information content of the data as the basis for quantitatively assessing the performance of the instrument under a set of defined designs. We begin with the original MWA Phase I design, subsequently add TGSS and the additional baselines of the upgraded MWA, and show whether the EoR science experiment is biased by the incomplete information available to correctly represent extended sources. Attention is then turned to the future SKA1-Low array, and the more challenging EoR/CD experiments proposed. We use these results to inform the smallest array maximum baseline required to execute these experiments.


Calibration of a radio interferometer requires estimation of the unknown complex gain parameters of each station. It typically relies on the fitting of data to a model of the sky signal plus instrument, allowing freedom in the complex gain parameters to perform the least-squares fit. Successful calibration therefore requires good knowledge of the received sky signal and the instrument (station locations, sky response, etc.). This is partially true for purely redundant arrays (e.g., PAPER, HERA), where sky information is used for initial calibration estimates and breaking degeneracies (e.g., Omnical; Ali et al. 2015).

In this work, the Fisher Information is used to quantify the ability of a given array configuration to estimate the sky model parameters for a generalised extended source, in the presence of a realistic background of point sources. We use the residual model uncertainties as a measure of the error (uncertainty) in the model, and propagate these uncertainties into visibilities measured by the telescope for EoR science. We further propagate these uncertainties into the EoR power spectrum, yielding a measure of the error (power bias) due to the incomplete source model. In studying different SKA array configurations, we focus on removal of outer stations to reduce the maximum baseline, but do not re-locate stations to the array core. For the generalised extended source, we are careful to model structure on all scales of relevance for the SKA, in order to robustly and generally assess calibration performance (scales from the size of the PSF to the FOV). We apply the same model throughout, for the SKA and also the MWA and TGSS-derived results. This is to ensure consistency, and to test the ability of precursor instruments to form a sky model of relevance for the future SKA. For the MWA and TGSS, the existence of power on scales much finer than those available to their baselines, will test whether unmeasured power on small scales impacts the estimation of longer wavemodes, of relevance to the EoR.

2.1. Approach

We use the Fisher Information, and then the Cramer-Rao Bound (CRB; Kay 1993), to quantify the information available in our calibration data to estimate the values of the spectral and spatial parameters describing a generalised extended source, embedded within measurement noise and a realistic sky of extragalactic point sources. The calibration model is obtained from the sky survey used to construct the sky model.

The Fisher Information computes the amount of information a given dataset (with a particular probability distribution function of noise, here modelled to include radiometric and background sources) contains about the values of a model parameter, for a pre-defined source model. In general, measurements that vary rapidly with a varying parameter value have the ability to estimate its value precisely. In contrast, no change of the expected measurement with a varying parameter value means that there is no information available to estimate that parameter. The CRB takes into account degeneracies between model parameters and correlations in the data, and represents the estimation performance of an ideal estimation algorithm.

For complex-valued data, embedded within generalised Gaussian noise with covariance, ${\bm C}$ , and an expected signal vector $\vec{\mu }=\vec{\mu }(\vec{\theta })$ with parameters $\vec{\theta }$ , the Fisher Information Matrix has the following elements:

(1) $$\begin{equation} [I]_{ab} = \left( \frac{\partial \vec{\mu }}{\partial {\theta }_a} \right)^\dagger \, {\bm C}^{-1} \, \left( \frac{\partial \vec{\mu }}{\partial {\theta }_b} \right), \end{equation}$$

where a and b index two elements of the parameter vector. In general, the data covariance can also be a function of unknown parameters, but here we assume we have full knowledge of the per-visibility properties of the radiometric noise:

(2) $$\begin{equation} {_{th}{\bf C}}(u,v;\nu ,\nu ^\prime ) = \left(\frac{2k{\rm T_{sys}}}{{\rm A_{eff}}}\right)^2\frac{1}{{\Delta \nu \Delta {t}}} \delta (\nu -\nu ^\prime ) \,\,{\rm Jy}^2, \end{equation}$$

and the point source covariance (Trott et al. 2016) as a function of spectral channels ν, ν′ and Fourier mode $\vec{u}=(u,v)$:

(3) $$\begin{eqnarray} _{fg}{\bf C}(\vec{u};\nu ,\nu ^\prime ) &=& \frac{\alpha }{3-\beta }\frac{S_{\rm max}^{3-\beta }}{S_0^{-\beta }}\frac{\pi {c^2}\epsilon ^2}{D^2} \frac{1}{\nu ^2 + \nu ^{\prime {2}}}\nonumber \\ &\times & \exp {\left( \frac{-|\vec{u}|^2c^2f(\nu )^2\epsilon ^2}{4(\nu ^2 + \nu ^{\prime {2}})D^2} \right)} \, {\rm Jy}^2, \end{eqnarray}$$

where ε = 0.42 converts an Airy disk to a Gaussian characteristic width, D is the station diameter, and f(ν) = (ν − ν′)/ν0. The point source model is represented by a broken power-law, characterised by parameters α, β, such that in a sky area $d{\bf l}$:

(4) $$\begin{equation} \langle {N(S,S+dS)}\rangle (\nu ) = \frac{dN}{dS}(\nu )\,dS\,d{\bf l} \end{equation}$$

(5) $$\begin{equation} = \alpha \left( \frac{\nu }{\nu _0} \right)^{\gamma } \left( \frac{S_{\rm Jy}}{S_0}\right)^{-\beta }\,dS\,d{\bf l}. \end{equation}$$

We use values of α = 4100 Jy$^{-1}$ sr$^{-1}$, β = 1.59, and γ = −0.8 at 150 MHz (Intema et al. 2011). We assume that our sky model is formed from 15 min (1 h) of data for GMRT (MWA), reducing the radiometric noise component with respect to the confusion, and for spectral resolution commensurate with the EoR experiment (100 kHz). (The GMRT TGSS survey used 15 min per field, split over 3–5 pointings.) Observations are assumed to be of a field at declination −27°, centred on RA = 0.
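
As a rough illustration of Equations 2 and 3, the two noise covariance terms can be evaluated numerically. The sketch below is not taken from the paper: the function names, the normalisation S_0 = 1 Jy, and the example system temperature, effective area, station diameter, and S_max are assumptions chosen only so that the example runs.

```python
import numpy as np

K_B = 1.380649e-23      # Boltzmann constant, J/K
C_LIGHT = 299792458.0   # speed of light, m/s
JY = 1.0e-26            # 1 Jy in W m^-2 Hz^-1


def thermal_variance(t_sys, a_eff, d_nu, d_t):
    """Equation 2: per-visibility radiometric noise variance in Jy^2."""
    sefd_jy = 2.0 * K_B * t_sys / a_eff / JY
    return sefd_jy**2 / (d_nu * d_t)


def point_source_covariance(u_abs, nu, nu_p, diameter, s_max,
                            alpha=4100.0, beta=1.59, s0=1.0,
                            nu0=150e6, eps=0.42):
    """Equation 3: covariance (Jy^2) between channels nu and nu_p at |u|.

    s0 = 1 Jy is an assumed normalisation; it is not stated in the text.
    """
    f = (nu - nu_p) / nu0
    nu_sq = nu**2 + nu_p**2
    amp = alpha / (3.0 - beta) * s_max**(3.0 - beta) / s0**(-beta)
    beam = np.pi * C_LIGHT**2 * eps**2 / diameter**2 / nu_sq
    decay = np.exp(-u_abs**2 * C_LIGHT**2 * f**2 * eps**2
                   / (4.0 * nu_sq * diameter**2))
    return amp * beam * decay


# Hypothetical instrument values, for illustration only.
var_th = thermal_variance(t_sys=200.0, a_eff=20.0, d_nu=100e3, d_t=3600.0)
var_fg = point_source_covariance(u_abs=50.0, nu=150e6, nu_p=150.1e6,
                                 diameter=4.0, s_max=1.0)
```

The total data covariance entering Equation 1 would then be the sum of the two terms, with the thermal term contributing only where ν = ν′.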

The CRB yields the uncertainties, and correlations, for each parameter value. If we assume that these uncertainties are then embedded within the sky model for that extended source, we can propagate these errors into the science data measured by the telescope (the measured visibility dataset). This is achieved using a standard Jacobian and the Fisher Information, I, such that

(6) $$\begin{equation} {\bf C}_V(u,v;\nu ) = {\bm J}^\dagger \, {\bm I}^{-1} \, {\bm J}, \end{equation}$$

where ${\bm J}$ is the matrix of derivatives, with respect to the model parameters, of the visibility measured at u, v, and channel ν. We further propagate from measured visibilities to the EoR power spectrum:

(7) $$\begin{equation} \Delta {P}(k_\bot ,k_\parallel ) = \left( \mathcal {F}_\nu ^\dagger \, \mathcal {W}^\dagger {\bf C}_V \mathcal {W}\, \mathcal {F}_\nu \right) \, \delta (k_\parallel -k_\parallel ^\prime ,k_\bot ^2-u^2-v^2), \end{equation}$$

where $\mathcal {F}_\nu$ is the Fourier Transform operator along the spectral direction, $\mathcal {W}$ is a spectral taper function that aims to reduce spectral leakage, and the delta-function extracts the variance estimates (power) from the covariance matrix as well as identifying the perpendicular k modes with the $L^2$-norm of the angular Fourier modes ($k_\bot ^2 = u^2 + v^2$). Herein, we employ a Blackman taper.

Therefore, we can assess the error introduced into our science data by an incomplete extended source model produced from a given array configuration.
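
The chain from Equation 1 to Equation 7 can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the real part of Equation 1 is taken because the model parameters are real, the Fourier transform is an unnormalised discrete transform, and the conversion to cosmological units is omitted.

```python
import numpy as np


def fisher_matrix(dmu, cov):
    """Equation 1. dmu has shape (n_data, n_params): column a holds
    d(mu)/d(theta_a). The real part is kept because the parameters are real."""
    return np.real(dmu.conj().T @ np.linalg.solve(cov, dmu))


def visibility_covariance(jac, fisher):
    """Equation 6. jac has shape (n_params, n_vis): row a holds the derivative
    of the science visibilities with respect to theta_a."""
    crb = np.linalg.inv(fisher)  # Cramer-Rao bound on the parameter covariance
    return jac.conj().T @ crb @ jac


def power_error(cov_v):
    """Equation 7 for one (u, v) cell: apply a Blackman taper, Fourier
    transform along frequency, and keep the diagonal (variance per mode)."""
    n_chan = cov_v.shape[0]
    taper = np.diag(np.blackman(n_chan))
    dft = np.fft.fft(np.eye(n_chan))  # DFT matrix, F[j, k] = exp(-2 pi i j k / n)
    wf = taper @ dft
    return np.real(np.diag(wf.conj().T @ cov_v @ wf))


# Toy example: 3 parameters constrained by 40 calibration data points,
# then propagated to 16 spectral channels of a single (u, v) cell.
rng = np.random.default_rng(0)
dmu = rng.normal(size=(40, 3)) + 1j * rng.normal(size=(40, 3))
fisher = fisher_matrix(dmu, np.eye(40))
jac = rng.normal(size=(3, 16)) + 1j * rng.normal(size=(3, 16))
delta_p = power_error(visibility_covariance(jac, fisher))
```

Using `np.linalg.solve` avoids forming the inverse of the data covariance explicitly, which matters when the covariance is large or poorly conditioned.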

Fourier Transform of the data to the image plane, and discretisation into surface brightness pixels, transforms the visibility covariance matrix (equation 6) into a degraded image covariance, such that

(8) $$\begin{equation} {\bf C}_I(l_i,m_i;\nu ) = \bm{\mathcal {D}_i}^\dagger \bm{\mathcal {F}}^\dagger \bm{J}^\dagger \, \bm{I}^{-1} \, \bm{J} \bm{\mathcal {F}} \bm{\mathcal {D}_i}, \end{equation}$$

where $\bm{\mathcal {D}_i}$ and $\bm{\mathcal {F}}$ denote the discretisation operator (continuous-to-discrete) and spatial Fourier operator (discrete-to-continuous), respectively, and the Jacobians act on the data used to estimate the source properties. Discretisation sampling that is inadequate compared with the scales and shapes of the underlying source components will degrade the quality of the estimation, and this can be quantified by studying the mean-squared-error (MSE) between the actual underlying source structure and the discretised representation. This can be seen most simply by considering a Fourier Transform back to the visibility plane (to estimate scale sizes), where a discrete-to-discrete transform couples pixel properties into the data covariance matrix. At this point, both information loss and bias in the estimates are possible.

2.2. Generalised extended source model

We want to form the most general extended source, in order to provide a fair basis for comparison. We aim to build a model with multi-scale structure that encases the spatial modes accessible to the Baseline Design SKA1-Low, corresponding to scales from the synthesized beam (10 arcsec at 150 MHz) to a fraction of the field-of-view (0.08°). We build a model with a complete set of angular scales, spaced evenly in the range k = [570–17 000] rad$^{-1}$, and construct a source based on the summation over a series of Gaussians, each with five parameters: central location $l_i$, $m_i$ (rad), peak brightness $S_i$ (Jy/beam for SKA-Low beam at 150 MHz), characteristic scale $\sigma _i$ (rad), and spectral index $\gamma _i$. This corresponds to 57 angular scales (sampled evenly at a spacing of half of the lowest k-mode), yielding 285 total parameters. The parameters for each scale are generated via a Gaussian-distributed random sampling, as described in Table 1, and fixed thereafter for each array configuration. The Fourier-space expectation of the signal (μ in Equation 1) is given by

(9) $$\begin{eqnarray} \mu (\vec{u},\nu ;\vec{\theta }) &=& \sqrt{2\pi } \displaystyle \sum _{i=1}^{57} \, a_i \sigma _i^2 \left(\frac{\nu }{\nu _0}\right)^{\gamma _i} \nonumber \\ &\times & \exp {(-2\pi {i}\vec{u}\cdot \vec{l}_i)} \, \exp {(-2|u|^2\pi ^2\sigma _i^2)}, \end{eqnarray}$$

where the two exponentials encode the Fourier kernel, and the multi-scale Gaussian, respectively, and $\vec{\theta }=[a,l,m,\sigma ,\gamma ]$ .
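
Equation 9 translates directly into a short function. The sketch below evaluates the model visibility at a single (u, v, ν) for an arbitrary number of Gaussian components; the function and argument names are hypothetical, and the example parameter values are placeholders rather than the Table 1 values.

```python
import numpy as np


def extended_source_visibility(u, v, nu, a, l0, m0, sigma, gamma, nu0=150e6):
    """Equation 9: sum of Gaussian components evaluated at one (u, v, nu).

    a, l0, m0, sigma, gamma are 1-D arrays with one entry per component
    (amplitude, position in rad, characteristic scale in rad, spectral index).
    """
    spectrum = (nu / nu0) ** gamma
    phase = np.exp(-2j * np.pi * (u * l0 + v * m0))                  # Fourier kernel
    envelope = np.exp(-2.0 * np.pi**2 * (u**2 + v**2) * sigma**2)    # Gaussian in uv
    return np.sqrt(2.0 * np.pi) * np.sum(a * sigma**2 * spectrum * phase * envelope)


# Placeholder two-component source, for illustration only.
params = dict(a=np.array([1.0, 0.5]),
              l0=np.array([0.0, 1e-4]),
              m0=np.array([0.0, -1e-4]),
              sigma=np.array([1e-3, 1e-4]),
              gamma=np.array([-0.8, -0.7]))
vis = extended_source_visibility(100.0, 50.0, 150e6, **params)
```

The derivatives needed for Equation 1 could be obtained analytically from this expression, or approximated by finite differences of the function with respect to each parameter.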

Table 1. Parameter values as a function of scale, $k_i$. $\mathcal {N}(\mu ,{\rm var})$ denotes a Gaussian-distributed random number with mean μ and standard deviation $\sqrt{{\rm var}}$.

Figure 1 shows an image of the complete model. Note that although a base source peak brightness of 1 Jy/beam has been chosen for the source, the amplitude here is not relevant to the final error introduced into the visibilities. This is because brighter sources can be estimated more precisely, and weaker sources less precisely, with a linear scaling with peak brightness. The propagation of error back into the visibilities also scales linearly with peak brightness, and these scalings cancel (Trott, Wayth, & Tingay 2012). Nonetheless, the peak brightness refers to an SKA-Low beam at 150 MHz, and remains consistent for all models. Therefore, each extended source with these angular scales being estimated for the sky model will contribute this error.

Figure 1. Image of the extended source produced using the model described (lowest spectral channel).

2.3. Extended sources within the field

The procedure described above yields the power bias from a single extended source in the sky model for a given field. If we now consider the total number of extended sources in a given field, we can estimate the total power bias in the EoR power spectrum to be $N_{\rm src}\Delta {P}(k_\bot ,k_\parallel )$, where $N_{\rm src}$ is the number in the field. This implies that the extended sources considered here all have the same power spectrum of spatial and spectral structure, although each individual source will have a different realisation of the parameter values. This ensures we are considering a consistent set of sources.

There is no complete low-frequency census of extended sources, due to the limited sensitivity and spatial resolution under study in this work. Results from the high-resolution FIRST (1.4 GHz) (Becker et al. 1994) and TGSS catalogues (Intema et al. 2017) can be used to estimate the number of sources per unit area of a given angular scale and flux density. Procopio et al. (2017) showed that the brightest sources were most important to model because these tended to be closer and therefore more extended.

We use the study of Windhorst, Mathis, & Neuschaefer (1990) at 1.4 GHz to estimate the 150 MHz and 50 MHz distributions, assuming a spectral index of −0.8 between the bands and no structural evolution. They find that the fraction of sources larger than size ψ (arcsec) at flux density, S (mJy), can be described by

(10) $$\begin{equation} h(\psi ) = \exp {[-\ln {2}(\psi /\psi _m)^{0.62}]}, \end{equation}$$

where $\psi _m = 2.0\,S_{1.4}^{0.3}$ arcsec is the median size at 1.4 GHz. We scale the flux densities to 150 MHz to find the corresponding median size for low frequencies. Figure 2 shows this fraction as a function of flux density and size. Strictly this distribution is applicable for flux densities below 1 Jy: the extended, close radio galaxies (e.g., Fornax A, Centaurus A, Pictor A), which have much larger flux densities, have their own individual spatial and spectral distributions. Assuming a point source number count distribution, a field-of-view and this fraction, the number of sources larger than a given angular size and greater than some flux density limit can be approximated as

(11) $$\begin{equation} N(>\psi ,S_0) = \displaystyle \int _{S_0}^\infty \frac{dN}{dS}\,dS\,\Omega \, h(\psi ) = \displaystyle \int _{S_0}^\infty \alpha S^{-\beta }\,dS\, \Omega \, h(\psi ), \end{equation}$$

where $\alpha = 4000(\nu /150)^{-0.8}$ Jy$^{-1}$ sr$^{-1}$ and β = {1.59 (S < 1 Jy), 2.5 (S > 1 Jy)} parametrise the number count distribution (Intema et al. 2011), and Ω is the field-of-view (steradians). Figure 3 shows the associated contour plots for the MWA 150 MHz, SKA 150 MHz, and SKA 50 MHz experiments. At the flux density confusion limits of these experiments, we would expect ~1 source in each of the experiments, of scale larger than ~1 arcmin. Most of the weaker sources have angular extents of tens of arcseconds, and these have fewer critical parameters for estimation. Given these estimates, we focus on the bright, distributed nearby radio galaxies in this work, and consider a single source of importance in the field.
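
The size distribution of Equation 10 and its median-size relation can be evaluated with a few lines of code. The sketch below uses hypothetical function names and assumes, as in the text, a spectral index of −0.8 to scale flux densities between 150 MHz and 1.4 GHz, with flux densities in mJy and sizes in arcsec.

```python
import numpy as np


def flux_at_1400(s_mjy, nu_mhz=150.0, spectral_index=-0.8):
    """Scale a flux density (mJy) at nu_mhz to 1.4 GHz with a power law."""
    return s_mjy * (1400.0 / nu_mhz) ** spectral_index


def median_size_arcsec(s_1400_mjy):
    """Median angular size psi_m = 2.0 * S_1.4^0.3 (arcsec, S in mJy)."""
    return 2.0 * s_1400_mjy ** 0.3


def fraction_larger(psi_arcsec, s_1400_mjy):
    """Equation 10: fraction of sources larger than psi at flux density S."""
    psi_m = median_size_arcsec(s_1400_mjy)
    return np.exp(-np.log(2.0) * (psi_arcsec / psi_m) ** 0.62)


# Fraction of sources with S(150 MHz) = 1 Jy that exceed 1 arcmin in extent.
s_1400 = flux_at_1400(1000.0)
frac = fraction_larger(60.0, s_1400)
```

By construction the fraction equals one half at the median size, which provides a quick sanity check on any implementation.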

Figure 2. Fraction of sources with angular extent greater than a given size, as a function of source flux density, for 150 MHz (adapted from Windhorst et al. 1990).

Figure 3. Number of sources with angular extent and flux density greater than given values for the MWA 150 MHz (left), SKA 150 MHz (centre), and SKA 50 MHz (right) experiments.

2.4. Telescope arrays

2.4.1. Murchison widefield array

The first three years of MWA EoR observations form the basis for all published limits to date, and use the Phase I MWA configuration with 128 tiles. The GLEAM survey uses the same array for its observations. The GMRT TGSS survey is a re-processing of the GMRT 150 MHz sky survey with new methods for instrument and ionospheric calibration (Intema et al. 2017). The additional sensitivity of the GMRT, and the longer baselines, provide higher spatial resolution, but the small number of dishes (30) limits the additional uv-coverage. The upgraded MWA will provide additional surface brightness sensitivity and spatial resolution with its 256 tiles. The zenith-pointed uv-coverage for these arrays is displayed in Figure 4 (left). GMRT clearly adds information at high spatial resolution, but even with rotation synthesis, the coverage is sparse and the larger angular scales are not significantly improved.

Figure 4. (Left) uv coverage for a zenith snapshot pointing for the three MWA arrays: MWA Phase I (blue), +GMRT (green), MWA Phase III + GMRT (red). (Right) uv coverage for a zenith snapshot pointing for the four SKA1 arrays considered: black (max. baseline 65 km), red (49 km), green (39 km), blue (14 km), (λ = 1 m).

GMRT has a different system temperature, latitude, and dish sensitivity (effective area) compared with the MWA, and these are taken into account in the analysis. Earth Rotation synthesis is considered when computing the uv-coverage, assuming an EoR field-of-interest located at declination −27° (MWA latitude). The GMRT TGSS survey parameters (15 min integration time per field) are used as the basis for the sky model formed from these different hybrid arrays.

2.4.2. SKA1-Low configurations

We trial four array configurations, which are all subsets of the Baseline Distribution. Table 2 describes the number of stations, and maximum baseline, for each. Figure 4 (right) shows the zenith snapshot uv-coverage of each at 150 MHz. We also considered a 4 km baseline array, which effectively corresponds to an array with EoR-science scales. Additionally, we consider a variation on the Baseline Design, whereby the inner clusters of six stations are unpacked for improved instantaneous imaging performance (Jones et al. 2016).

Table 2. The four arrays considered.

The station effective areas, and sky temperature as a function of frequency, are taken from the SKA1-Low system description (L0 Requirements). We compute the EoR/CD power spectrum error at two prime frequencies of interest (150 MHz, z = 8.6; 50 MHz, z = 27), and compare to typical expected 21 cm cosmological power spectra, from 21cmFAST simulations (Mesinger, Furlanetto, & Cen 2011). Standard conversions are undertaken from Jy$^2$ Hz$^2$ to mK$^2$ $h^{-3}$ Mpc$^3$ (Morales & Hewitt 2004).


We form the power biases due to extended sources requiring estimation for formation of the sky model, and compare these with expected 21 cm power spectra obtained from 21cmFASTv2 (reionisation via faint galaxies).

3.1. MWA

The MWA EoR experiment aims to detect the 21 cm signal through a power spectrum at redshifts z = 6.5–9.0 (ν = 137–197 MHz). We take a nominal lower frequency of 150 MHz and estimate the power bias due to the three hybrid arrays. Figure 5 displays the signal-to-power bias ratio (SNR).

Figure 5. Signal-to-noise (contrast) ratios of a typical 21 cm cosmological signal to the power error introduced by one extended source in the field, for the three MWA-based hybrid array configurations considered, and a 10 MHz bandwidth experiment centred at 150 MHz (z = 8.6).

Addition of the small scales from TGSS does not provide a substantial improvement in performance on EoR scales, and this hybrid array and the original MWA 128-tile array display contrast ratios of order 0.1 in the lowest portion of the EoR window. Extending the array to 256 tiles and including TGSS increases the uv-coverage on all scales, leading to an improvement and contrast ratios exceeding unity across the EoR window.

The improvement in parameter estimation can be explored to understand the relative importance of uv-coverage and maximum baseline in measuring the source parameters. Figure 6 displays results for the five source parameters as a function of angular scale of the feature (σ). Each plot shows the ratio of estimation performance (square-root of CRB) for the extended MWA (256 tiles + TGSS) relative to the original MWA128 and MWA128 + TGSS. Also displayed are histograms of the u (solid) and v (dashed) distributions for each array.

Figure 6. (Top, middle, bottom left) Ratio of estimation performance (precision) for extended MWA (256 tiles + TGSS) relative to the original MWA128 (red) and MWA128 + TGSS (blue), as a function of scale of source feature. (Bottom right) Histograms of baseline distributions in u (solid) and v (dashed) directions for MWA256 + TGSS (red), MWA128 + TGSS (blue), and MWA128 (green).

It is clear that the additional long north-south baselines for the MWA256 + TGSS offer improved performance on small scales for estimating the m-position of each source feature. It is also clear that the increased number of short baselines from the MWA256 hexagonal arrays improves estimation of large scale features. Addition of the TGSS long baselines aids the precision. The clear and intuitive conclusion is that for sources with information on a range of angular scales, both long baselines and good short baseline coverage are required in both dimensions. Particularly for EoR scales, excellent short baseline uv-coverage is crucial.

3.2. SKA1-Low

Figures 7 and 8 display the signal-to-noise ratios (power contrast ratios) for the four arrays considered. The dashed and solid black lines denote the first sidelobe and horizon limits for the expected leakage of foreground power (the ‘wedge’). At 150 MHz, all four arrays yield SNR > 10 across most of the EoR Window (area outside of the horizon wedge). The Blackman taper performs well to suppress foreground leakage, but at the expense of a broader DC term, and residual error leaks into the EoR window at $k_\bot \simeq 0.02$, $k_\parallel \simeq 0.1$ Mpc$^{-1}$ for all arrays. The relative strength of the 21 cm signal to the sky temperature yields acceptable performance for the 49 km and 65 km arrays, while degradation is evident for 39 km and 14 km. Notably, the 49 km and 65 km arrays yield comparable results (power ratio ≃1.004).

Figure 7. Signal-to-noise (contrast) ratios of a typical 21 cm cosmological signal to the power error introduced by one extended source in the field, for the four array configurations considered, and a 10 MHz bandwidth experiment centred at 150 MHz (z = 8.6).

Figure 8. Signal-to-noise (contrast) ratios of a typical 21 cm cosmological signal to the power error introduced by one extended source in the field, for the four array configurations considered, and a 10 MHz bandwidth experiment centred at 50 MHz (z = 27).

At 50 MHz, the system temperature is higher, foregrounds are brighter, and the signal is weaker. Therefore, the performance is degraded for all arrays, relative to 150 MHz. Both the 14 km and 39 km arrays yield a low SNR detection across a large region of the EoR window, while 49 km and 65 km yield good performance (SNR > $10^3$). Again, the 49 km and 65 km arrays yield acceptable performance (high contrast ratios).

The final metric of interest for 21 cm studies, where the cosmological signal is expected to be isotropic, is the spherically-averaged (1D) power spectrum. To remove the bulk of the foreground extended source bias, we consider only line-of-sight modes with k > 0.1 (smaller spatial scales). This cut reduces the foreground contribution to the spherical average, but it also discards spatial modes. Figure 9 displays these 1D profiles, obtained directly from the power on each measured baseline at each frequency (i.e., from the original 3D distribution rather than from the 2D power spectrum). The same conclusions can be drawn about the relative merits of each array at the lowest frequency (50 MHz). The different structural properties stem from the interaction of the Blackman–Nuttall spectral taper with each array’s spectral covariance profile. Fundamentally, it is the progression in k-mode of the additional leaked power as the arrays become more compact that is of concern. We also provide results for a 39 km maximum baseline using the proposed v7 array of Jones et al. (2016) with unpacked clusters (green dashed). In this case, the addition of uv information on intermediate scales provides some improvement relative to the Baseline Design, and can recover information lost in cutting the maximum baseline from 49 km to 39 km.
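The spherical averaging with a line-of-sight cut can be sketched as follows. This is a minimal illustration, not the estimator used to produce Figure 9: the function name `spherical_average`, the logarithmic binning, the unweighted mean, and the toy grid are all assumptions. Cells of a 3D power cube are discarded below a line-of-sight threshold and the remainder are averaged in shells of constant |k|.

```python
import numpy as np


def spherical_average(power, kx, ky, k_par, k_par_min=0.1, n_bins=8):
    """Average a 3D power cube in shells of |k|, keeping only |k_par| > k_par_min.

    power has shape (len(kx), len(ky), len(k_par)). Returns the bin centres,
    the mean power per bin (NaN where a bin is empty) and the cell counts.
    """
    kxg, kyg, kzg = np.meshgrid(kx, ky, k_par, indexing="ij")
    k_mag = np.sqrt(kxg ** 2 + kyg ** 2 + kzg ** 2)
    keep = np.abs(kzg) > k_par_min          # drop foreground-dominated modes
    k_kept, p_kept = k_mag[keep], power[keep]
    # Logarithmic bin edges, padded slightly so no kept cell falls outside.
    edges = np.logspace(np.log10(0.999 * k_kept.min()),
                        np.log10(1.001 * k_kept.max()), n_bins + 1)
    counts, _ = np.histogram(k_kept, bins=edges)
    sums, _ = np.histogram(k_kept, bins=edges, weights=p_kept)
    with np.errstate(invalid="ignore", divide="ignore"):
        mean = sums / counts
    centres = np.sqrt(edges[:-1] * edges[1:])
    return centres, mean, counts


# Toy cube: flat 'signal' power everywhere, with bright contamination
# confined to the low-k_par cells that the cut removes.
kx = ky = np.linspace(-0.05, 0.05, 11)
k_par = np.linspace(0.0, 1.0, 21)
cube = np.full((kx.size, ky.size, k_par.size), 2.0)
cube[:, :, k_par <= 0.1] = 1.0e6
centres, mean, counts = spherical_average(cube, kx, ky, k_par)
print(np.round(mean[counts > 0], 3))
```

In the toy cube the contamination never reaches the 1D average, at the price of losing every cell below the cut, which mirrors the loss of spatial modes noted above.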

Figure 9. (Left) 150 MHz spherically-averaged power spectrum and simulated 21cmFAST 21 cm power for comparison (dashed). 14 km (blue), 39 km (green), 49 km (red), 65 km (black). Note that the black, red and green are overlapping. (Right) Same but for 50 MHz, including an alternative array configuration with 39 km baselines (green, dashed).

It is important to understand which differences between the arrays drive the differences in performance. Intuitively, sources that are extended, but have power only on angular scales smaller than those relevant for the EoR, will not contribute power bias to the EoR parameter space. Similarly, sources that have power on large angular scales, but no power on small scales, contribute to EoR baselines, but the 39 km, 49 km, and 65 km arrays all sample these scales equally. We therefore hypothesise that it is sources with both large- and small-scale structure that couple the inability to measure small scales into the estimation of the larger scales. To test this hypothesis, we perform the same analysis at 50 MHz with two additional sources: one with power only above 1 arcmin scales (broad source), and one with power only on scales of 10 arcsec–36 arcsec (compact source). For these two sources, the 39 km and 65 km arrays yield comparable power bias in the EoR power spectrum (power ratio approximately unity). This supports the hypothesis that multi-scale sources are the ones whose estimation precision differs between arrays.
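The angular scales involved can be made concrete with a back-of-envelope sketch. It assumes a circular Gaussian component, whose normalised visibility on a baseline of length u wavelengths is exp[−π² θ² u² / (4 ln 2)] for FWHM θ, and it uses λ/b as the nominal resolution of a baseline of length b. The function names and the choice of 10 arcsec and 60 arcsec components are illustrative only and do not reproduce the source models used in the analysis.

```python
import numpy as np

C_M_S = 299792458.0                    # speed of light in m/s
ARCSEC = np.pi / (180.0 * 3600.0)      # one arcsecond in radians


def resolution_arcsec(baseline_m, freq_mhz):
    """Nominal angular resolution lambda / b of a baseline, in arcsec."""
    wavelength = C_M_S / (freq_mhz * 1.0e6)
    return wavelength / baseline_m / ARCSEC


def gaussian_visibility(baseline_m, fwhm_arcsec, freq_mhz):
    """Normalised visibility amplitude of a circular Gaussian component."""
    u = baseline_m * freq_mhz * 1.0e6 / C_M_S      # baseline in wavelengths
    theta = fwhm_arcsec * ARCSEC
    return np.exp(-(np.pi * theta * u) ** 2 / (4.0 * np.log(2.0)))


for b_km in (39.0, 65.0):
    b_m = b_km * 1.0e3
    print(f"{b_km:4.0f} km at 50 MHz: resolution "
          f"{resolution_arcsec(b_m, 50.0):5.1f} arcsec, "
          f"V(10 arcsec) = {gaussian_visibility(b_m, 10.0, 50.0):.2f}, "
          f"V(60 arcsec) = {gaussian_visibility(b_m, 60.0, 50.0):.1e}")
```

At 50 MHz the nominal resolution is about 32 arcsec for a 39 km baseline and about 19 arcsec for 65 km. An arcminute-scale component is fully resolved out well before either maximum baseline, so both arrays constrain it with the same shorter spacings, whereas a 10 arcsec component is constrained noticeably better by the longer array.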

3.3. Discussion and interpretation

The key message of this work is that a sky model generated from a given array, and subsequently used as ‘truth’ (the reference model for a source for calibration and subtraction), has inherent errors due to the limitations of that array, and that those errors will propagate forward into science data products. Here, we considered a generalised multi-scale Gaussian source (extended source), and used mock SKA1-Low and existing and upcoming MWA + TGSS array configurations to estimate its model parameters, in the presence of a realistic sky with structured sidelobes from other sky sources. We applied these errors to model EoR data, and quantified the bias in power in the EoR 2D and 1D power spectra due to these incomplete source models. Compared with previous work in this area, the use of a generalised extended source model, the inclusion of other confusing sources in the field (classical and sidelobe confusion), and the direct equating of modes measured by a survey to input calibration model ‘truth’ parameters expand on existing studies.

For the MWA EoR experiment, we find that both the additional short baseline uv-coverage, and the additional MWA256 + TGSS long baselines, are required to obtain sufficient information on all relevant scales, for a high signal-to-noise ratio detection.

For SKA1-Low, we find that arrays with maximum baselines of 49 km and 65 km yield comparable performance at 50 MHz and 150 MHz, while 39 km, 14 km, and 4 km arrays yield degraded performance. This is particularly true at CD (low) frequencies, where SKA1 is aiming to be transformational and new. We therefore conclude that 49 km maximum baselines are sufficient to form the sky and calibration model for EoR/CD science, but 39 km baselines, corresponding to removal of two clusters from each spiral arm, yield degraded results and threaten high-redshift Cosmic Dawn science. This is not the case for an array with the same longest baseline but more intermediate scales (i.e., the v7 design). Such alterations of the inner array may be used to alleviate some of the degradation caused by removing the outer stations. We additionally find that it is multi-scale sources, with power on both large and small scales, that yield different power bias for the 39 km and 65 km arrays. It is for these sources that the sky model estimation depends on array design.

4. Conclusions

The broad design of SKA1-Low is relatively fixed, with an expectation of an aperture array interferometer of ~130 000 dipoles collected into ~500 stations, spanning tens of kilometres, and with a densely-packed core of 50–60% of the collecting area within 2–3 km (Dewdney 2016). This broad model provides the exceptional surface brightness sensitivity and wide frequency coverage needed to address the exciting science goals of the observatory. The specific details of station location and maximum array baseline are under discussion. To contribute to that discussion, here we study the impact of incomplete sky models of extended sources on EoR and CD science for MWA and SKA. The recommendations for the design of SKA1-Low are (1) an unpacking of inner station clusters; (2) a minimum longest baseline of 50 km. Combining these recommendations may improve performance at an overall reduced cost.

Acknowledgements

CMT thanks Robert Braun and Ben McKinley for useful discussions. The Centre for All-Sky Astrophysics (CAASTRO) is an Australian Research Council Centre of Excellence, funded by Grant CE110001020. The Centre for All-Sky Astrophysics in 3D (ASTRO 3D) is an Australian Research Council Centre of Excellence, funded by Grant CE170100013. This research has made use of NASA’s Astrophysics Data System. CMT is supported under the Australian Research Council’s Discovery Early Career Researcher funding scheme (project number DE140100316). This work was supported by resources provided by the Pawsey Supercomputing Centre with funding from the Australian Government and the Government of Western Australia. We acknowledge the International Centre for Radio Astronomy Research (ICRAR), a Joint Venture of Curtin University and The University of Western Australia, funded by the Western Australian State government.

References

Ali, Z. S., et al. 2015, ApJ, 809, 61, doi:10.1088/0004-637X/809/1/61, bibcode:2015ApJ...809...61A
Barry, N., Hazelton, B., Sullivan, I., Morales, M. F., & Pober, J. C. 2016, MNRAS, 461, 3135, doi:10.1093/mnras/stw1380, bibcode:2016MNRAS.461.3135B
Becker, R. H., White, R. L., & Helfand, D. J. 1994, in ASP Conf. Ser. Vol. 61, Astronomical Data Analysis Software and Systems III, eds. Crabtree, D. R., Hanisch, R. J., & Barnes, J. (San Francisco: ASP), 165
Braun, R. 2013, A&A, 551, A91, doi:10.1051/0004-6361/201220257, bibcode:2013A&A...551A..91B
Carroll, P. A., et al. 2016, MNRAS, 461, 4151, doi:10.1093/mnras/stw1599, bibcode:2016MNRAS.461.4151C
Castelletti, G., Dubner, G., Brogan, C., & Kassim, N. E. 2007, A&A, 471, 537, doi:10.1051/0004-6361:20077062, bibcode:2007A&A...471..537C
DeBoer, D. R., et al. 2017, PASP, 129, 045001, doi:10.1088/1538-3873/129/974/045001, bibcode:2017PASP..129d5001D
Dewdney, P. 2016, SKA Memo Series, SKA-TEL-SKO-0000002
Hurley-Walker, N., et al. 2014, PASA, 31, e045, doi:10.1017/pasa.2014.40, bibcode:2014PASA...31...45H
Hurley-Walker, N., et al. 2017, MNRAS, 464, 1146, doi:10.1093/mnras/stw2337, bibcode:2017MNRAS.464.1146H
Intema, H. T., Jagannathan, P., Mooley, K. P., & Frail, D. A. 2017, A&A, 598, A78, doi:10.1051/0004-6361/201628536, bibcode:2017A&A...598A..78I
Intema, H. T., van Weeren, R. J., Röttgering, H. J. A., & Lal, D. V. 2011, A&A, 535, A38, doi:10.1051/0004-6361/201014253, bibcode:2011A&A...535A..38I
Jacobs, D. C., et al. 2016, ApJ, 825, 114, doi:10.3847/0004-637X/825/2/114, bibcode:2016ApJ...825..114J
Jones, M., Mort, B., Dulwich, F., Wayth, R., Abeywickrema, S., & Bolton, R. 2016, SKA Engineering Change Proposal,
Kay, S. M. 1993, Fundamentals of Statistical Signal Processing: Estimation Theory (Upper Saddle River: Prentice-Hall)
Koopmans, L., et al. 2015, Advancing Astrophysics with the Square Kilometre Array (AASKA14), 1, bibcode:2015aska.confE...1K
Lane, W. M., Cotton, W. D., Helmboldt, J. F., & Kassim, N. E. 2012, RaSc, 47, RS0K04, doi:10.1029/2011RS004941, bibcode:2012RaSc...47.0K04L
Line, J. L. B., Webster, R. L., Pindor, B., Mitchell, D. A., & Trott, C. M. 2017, PASA, 34, e003, doi:10.1017/pasa.2016.58, bibcode:2017PASA...34....3L
McKinley, B., et al. 2015, MNRAS, 446, 3478, doi:10.1093/mnras/stu2310, bibcode:2015MNRAS.446.3478M
Mesinger, A., Furlanetto, S., & Cen, R. 2011, MNRAS, 411, 955, doi:10.1111/j.1365-2966.2010.17731.x, bibcode:2011MNRAS.411..955M
Morales, M. F., & Hewitt, J. 2004, ApJ, 615, 7, doi:10.1086/424437, bibcode:2004ApJ...615....7M
Parsons, A. R., et al. 2010, AJ, 139, 1468
Perley, R. A., Dreher, J. W., & Cowan, J. J. 1984, ApJ, 285, L35, doi:10.1086/184360, bibcode:1984ApJ...285L..35P
Procopio, P., et al. 2017, PASA, 34, 33, bibcode:2017PASA...34...33P
Salter, C. J., et al. 1989, A&A, 220, 42, bibcode:1989A&A...220...42S
Tingay, S. J., et al. 2013, PASA, 30, 7, doi:10.1017/pasa.2012.007, bibcode:2013PASA...30....7T
Trott, C. M., Wayth, R. B., & Tingay, S. J. 2012, ApJ, 757, 101, doi:10.1088/0004-637X/757/1/101, bibcode:2012ApJ...757..101T
Trott, C. M., et al. 2016, ApJ, 818, 139, doi:10.3847/0004-637X/818/2/139, bibcode:2016ApJ...818..139T
Wayth, R. B., et al. 2015, PASA, 32, 25, doi:10.1017/pasa.2015.26, bibcode:2015PASA...32...25W
Windhorst, R., Mathis, D., & Neuschaefer, L. 1990, in ASP Conf. Ser. Vol. 10, Evolution of the Universe of Galaxies, ed. Kron, R. G. (San Francisco: ASP), 389