
Antarctic sea ice: a self-organizing map-based perspective

Published online by Cambridge University Press:  14 September 2017

David B. Reusch
Affiliation:
EMS Earth and Environmental Systems Institute, The Pennsylvania State University, University Park, PA 16802-7501, USA E-mail dbr@geosc.psu.edu
Richard B. Alley
Affiliation:
Department of Geosciences and EMS Earth and Environmental Systems Institute, The Pennsylvania State University, University Park, PA 16802-7501, USA

Abstract

Self-organizing maps (SOMs) provide a powerful, non-linear technique to optimally summarize a complex geophysical dataset using a user-selected number of ‘icons’ or SOM states, allowing rapid identification of preferred patterns, predictability of transitions, rates of transitions, and hysteresis in cycles. The use of SOMs is demonstrated here through application to a 24 year dataset (1973–96) of monthly Antarctic sea-ice edge positions. Variability in sea-ice extent, concentration and other physical characteristics is an important component of the Earth’s dynamic climate system, particularly in the Southern Hemisphere where annual changes in sea-ice extent (temporarily) double the size of the Antarctic cryosphere. SOM-based patterns concisely capture the spatial and temporal variability in these data, including the annual progression of expansion and retreat, a general eastward propagation of anomalies during the winter, and sub-annual variability in the rate of change in extent at different times of the year (e.g. retreat in January is faster than in November). There is also often a general seasonal hysteresis, i.e. monthly anomalies during cooling follow a different spatial path than during warming.

Type
Research Article
Copyright
Copyright © The Author(s) 2017

Introduction

Many techniques are useful for extracting patterns from a large geophysical dataset, such as the monthly-mean anomalies of the position of the Antarctic sea-ice edge considered here. It may prove useful and informative, for example, to note the strength of the loading of the data from a particular month on the first two principal components of the dataset. It may also be instructive to note that the data from a given month are very similar to those of some earlier, well-known month (e.g. ‘this looks like the ice that trapped Endurance in the Weddell Sea ice pack in January 1915’). In this hypothetical example, ‘January 1915’ serves as an icon, to which other data fields can be compared. Such icon-based classification schemes can work in parallel with pattern-extraction tools such as principal components analysis (PCA). However, a particular data field (such as ‘January 1915’) may contain unique features that reduce its utility as a more general icon. To overcome this difficulty, the technique of self-organizing map (SOM) analysis (Kohonen, 2001) provides an objective way to optimally extract a user-specified number of icons or SOM states from an input dataset.

In short, a SOM analysis produces a small number (e.g. 30) of states representing variability in the input data, in this case sea-ice distribution. Each data point (i.e. each monthly measurement of sea-ice extent) corresponds with (or maps to) exactly one of these 30 states (the one to which it is closest). The 30 states are chosen in some optimal way (based on the theory of neural networks) to represent the original data in a generalized sense. The states are arranged as a rectangular grid, with spacing between adjacent states related to their similarity (a result of the method by which they are extracted from the data). By applying SOMs to sea-ice edge data, we develop an average sea-ice cycle, based on the monthly SOM states, and information about variability, both of which are described in more detail below.

Self-Organizing Maps and Climate Analysis

SOMs (Kohonen, 2001) are an analysis tool from the field of artificial neural networks that uses so-called unsupervised training to find salient features of an input dataset without prior specification (or knowledge) of the ‘correct’ output. Once trained, the input data may be broken up into distinct classes (as defined by the network itself) using the derived features (i.e. classification), or it may be that the patterns found by the network are of most interest. SOMs thus support unsupervised classification of large, multivariate geophysical datasets through creation of a spatially organized set of generalized patterns of variability which, collectively, represents the probability density function (PDF) of the input data (e.g. Hewitson and Crane, 2002; Reusch and others, 2005b). In short, a SOM analysis produces a discrete, non-linear classification of the continuum input data.

SOM analysis returns the icons or SOM states in a grid or ‘map’, with similar states placed near each other, and the most extreme states at the corners. Often the states at the ends of one diagonal are similar to the positive and negative phases of the first principal component of the input data, with the second principal component correspondingly at the ends of the other diagonal; however, this is not required. In the analysis here, for example, the extended sea ice of winter and the reduced sea ice of summer are placed at opposite ends of one diagonal of the map, important asymmetries in this seasonal pattern are highlighted on the other diagonal, and intermediate states capture more of the rich behavior of the system. Because each input dataset maps uniquely to one SOM state, SOM analysis easily allows characterization of time-trends in frequency of occurrence, preferred transitions that may point toward predictability, and hysteresis of preferred patterns.

SOMs provide an alternative to more traditional linear techniques (e.g. PCA) that is more robust (e.g. able to interpolate into areas of the input space not present in the available training input), less complex and less subjective, while also accommodating non-linear relationships in the data (Reusch and others, 2005a).

SOMs are also a completely independent uniformitarian analysis pathway and thus provide independent results for comparison with more traditional techniques. SOM-based analysis thus complements linear techniques without replacing them. SOMs also provide a powerful visualization approach for studying structure in large, complex datasets. In the case of atmospheric circulation data, for example, the patterns capture the full range of synoptic conditions while also treating the data as a continuum, unlike, for example, cluster analysis.

Sea Ice

Sea ice has long been known to be a very important component of the global climate system, and is expected to be a sensitive indicator of a warming climate (Houghton and others, 2001). Changes in sea-ice properties (e.g. extent, thickness, annual cycle) have complex, often bidirectional relationships with many other physical (e.g. Kwok and Comiso, 2002; Wu and others, 2004), chemical (e.g. Wolff and others, 2003) and biological (e.g. Thomas and Dieckmann, 2002) systems in the polar regions, such as ocean–atmosphere heat fluxes, freshwater advection, deep water formation and a wealth of ecological networks. Human activities are also often subject to the vagaries of sea-ice anomalies (e.g. Shackleton, 1920; Turner and others, 2002). Recent changes in Arctic sea-ice extent and thickness (in particular, strong downward trends) have been more dramatic (ACIA, 2004), but there is also evidence for changes in the Antarctic. Whether regional changes have only regional causes or are also tied to larger phenomena such as the Antarctic Oscillation (AAO) or El Niño–Southern Oscillation (ENSO) remains uncertain (e.g. Liu and others, 2004). Long-term trends have also been difficult to identify, with both weak increases (in the 1970s) and decreases (after 1979) being identified (e.g. Houghton and others, 2001; Cavalieri and others, 2003). In both cases, the shortness of the Antarctic sea-ice record has been an important factor. Unfortunately, satellite remote sensing is the only way to quickly and reliably capture comprehensive, synoptic data on most sea-ice characteristics, such as extent and concentration. Thus, improved understanding will, in part, require our patience as continued satellite observations build on the existing record. While awaiting this longer record, analysis of the existing data using new tools can still be instructive. The new tools used may help supplement existing understanding of sea ice in the climate system.

Data

Sea-ice data were obtained from the Australian Antarctic Data Centre (Simmonds and Jacka, 1995; T.H. Jacka, http://aadc-maps.aad.gov.au/aadc/metadata/metadata_redirect.cfm?md=AMD/AU/sea_ice_extent_gis). For each month between 1973 and 1996, the latitude of the northern edge of the sea ice is recorded for each 10˚ of longitude, for 0–350˚. Prior to analysis, data were standardized at each longitude by subtracting the full-record mean and dividing by the full-record standard deviation. Sea-ice edge anomalies are plotted as boxes covering ±5˚ east–west of the center longitude and a north–south extent based on the full-record mean and the anomaly from the mean at each longitude (black for positive anomalies and gray for negative anomalies, as in Fig. 1).
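The standardization step described above can be sketched in a few lines. This is a minimal illustration, not the authors' processing code: the array name `edge_lat`, the synthetic values, and the choice of the population standard deviation (NumPy's default, `ddof=0`) are all assumptions, since the text does not state which estimator was used.

```python
import numpy as np

def standardize_by_longitude(edge_lat):
    """Standardize each longitude column over the full record.

    edge_lat: array of shape (n_months, n_longitudes) holding the latitude
    of the ice edge. Returns the standardized anomalies plus the per-longitude
    mean and standard deviation needed to undo the transform.
    """
    mean = edge_lat.mean(axis=0)
    std = edge_lat.std(axis=0)  # population std (ddof=0); an assumption
    return (edge_lat - mean) / std, mean, std

# Synthetic stand-in for the real data: 24 years x 12 months, 36 longitudes.
rng = np.random.default_rng(0)
edge_lat = -65.0 + 5.0 * rng.standard_normal((288, 36))

anomalies, mean, std = standardize_by_longitude(edge_lat)
```

Keeping `mean` and `std` makes it possible to convert any standardized pattern back to latitudes for plotting.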

Fig. 1. Generalized patterns from a 6 × 5 SOM analysis. Sea-ice edge values are drawn as anomalies from the full-record mean for each 10˚ longitude band. Black (light gray) areas indicate an edge farther north (south) than the climatological mean, with the area between the edge and the mean (the anomaly) filled in. East–west extent is centered on the longitude of the data point.

Methodology: Self-Organizing Maps

SOM usage begins by creating a set of generalized patterns from the input data, an iterative process known as training. Mathematically, a SOM is composed of a finite set of nodes, organized as a grid (usually rectangular, sometimes hexagonal), with each node having an associated reference vector representing the node’s generalized pattern. Reference vectors have the same dimensionality as the original data. Because the pattern set is relatively small (and finite), complexity is reduced to working with the set of reference vectors instead of the (usually much larger) original dataset. The generalized patterns (or states) are a projection of the multidimensional input data onto the two-dimensional (2-D) array of reference vectors. The size of the grid (number of states) directly influences the amount of generalization: smaller (larger) node arrays have fewer (more) available states to characterize the n-dimensional data space, so the final patterns developed during training will tend to do more (less) generalization of the input. (Grid size is thus a first-order experimental parameter.)

In practice, a SOM is usually referred to by its grid dimensions (e.g. a 4 × 3 SOM, which has 12 nodes). SOM patterns may be identified by an (x,y) coordinate pair or a sequence number within the two-dimensional array (counting left-to-right, top-to-bottom). Coordinate pairs identify patterns consistently across different grid sizes while sequence numbers have notational simplicity.
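The two labeling schemes are easy to convert between. The sketch below assumes zero-based (x, y) coordinates with x as the column and y as the row, one-based sequence numbers, and that the first number in a label such as "6 × 5" is the number of columns; none of these conventions is stated in the text, and the function names are hypothetical.

```python
def seq_to_xy(seq, n_cols):
    """Convert a 1-based sequence number (left-to-right, top-to-bottom)
    to a 0-based (x, y) pair, where x is the column and y is the row."""
    row, col = divmod(seq - 1, n_cols)
    return col, row

def xy_to_seq(x, y, n_cols):
    """Inverse of seq_to_xy: 0-based (x, y) to 1-based sequence number."""
    return y * n_cols + x + 1

# For a grid with 6 columns and 5 rows, the first and last nodes are:
first = seq_to_xy(1, n_cols=6)   # upper left corner
last = seq_to_xy(30, n_cols=6)   # lower right corner
```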

Because relative distances between nodes in data space are actually variable (as a function of the information content and distribution of the raw data), Sammon maps (Sammon, 1969) are often used to visualize these relative distances. Sammon mapping, as used here, projects the multidimensional reference vectors for each generalized SOM pattern into a 2-D space. This aids visualization of inter-pattern relationships by allowing the SOM nodes to be plotted according to relative neighbor-to-neighbor similarity, rather than on the simpler, but less informative, regularly spaced grid normally used. However, the regularly spaced grid is adequate for the purposes of displaying relative frequencies and related attributes of each node.
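Sammon's method chooses the 2-D layout by minimizing a stress measure that compares pairwise distances before and after projection, weighting small original distances more heavily. The sketch below only evaluates that stress for a given layout; it does not perform the iterative minimization, and the function name and the toy points are assumptions for illustration.

```python
import numpy as np

def sammon_stress(high, low):
    """Sammon stress between a high-dimensional point set and a
    low-dimensional layout of the same points (rows correspond).

    Assumes all points in `high` are distinct, so no original
    pairwise distance is zero.
    """
    iu = np.triu_indices(len(high), k=1)  # each unordered pair once
    d_high = np.linalg.norm(high[:, None] - high[None, :], axis=2)[iu]
    d_low = np.linalg.norm(low[:, None] - low[None, :], axis=2)[iu]
    return float(((d_high - d_low) ** 2 / d_high).sum() / d_high.sum())

# Three toy "reference vectors" in 3-D and a naive projection that
# simply drops the third coordinate.
high = np.array([[0.0, 0.0, 0.0],
                 [1.0, 0.0, 0.0],
                 [0.0, 1.0, 1.0]])
flat = high[:, :2]

stress_naive = sammon_stress(high, flat)    # distorted distances
stress_perfect = sammon_stress(high, high)  # distances preserved exactly
```

A stress of zero means every pairwise distance is preserved; larger values indicate more distortion in the projected map.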

A key step in SOM training, and in later analyses using the generalized patterns, is the mapping of input data samples to the closest matching reference vector (usually based on the Euclidean distance between the input and the reference vectors). During training, this step identifies which reference vectors (and neighbors) are to be updated. After training, mapping is used to classify the input data (since input records with common patterns will map to the same or nearby SOM nodes) and is widely used to study, for example, frequency-of-occurrence characteristics of different subsets of the data (based on time or other criteria, such as state of the Southern Oscillation Index (SOI)), also known as frequency mapping.
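As a rough sketch of this mapping step, the code below assigns each input record to its closest reference vector by Euclidean distance and then counts how often each node is occupied. The function names, the random reference vectors and the row-major node ordering are assumptions made for illustration; a trained SOM would supply the real reference vectors.

```python
import numpy as np

def best_matching_units(data, ref_vectors):
    """Index of the closest reference vector for every input record.

    data: (n_records, n_dims); ref_vectors: (n_nodes, n_dims).
    """
    diff = data[:, None, :] - ref_vectors[None, :, :]
    dist2 = (diff ** 2).sum(axis=2)  # squared Euclidean distances
    return dist2.argmin(axis=1)

def frequency_map(bmu, n_nodes, n_cols):
    """Occupancy counts per node, laid out on the SOM grid (row-major)."""
    counts = np.bincount(bmu, minlength=n_nodes)
    return counts.reshape(-1, n_cols)

# Toy example: 30 random reference vectors (6 columns x 5 rows) and four
# records that are lightly perturbed copies of nodes 0, 0, 29 and 5.
rng = np.random.default_rng(1)
ref_vectors = rng.standard_normal((30, 36))
data = ref_vectors[[0, 0, 29, 5]] + 0.01 * rng.standard_normal((4, 36))

bmu = best_matching_units(data, ref_vectors)
freq = frequency_map(bmu, n_nodes=30, n_cols=6)
```

Restricting `data` to a subset (for example, all Septembers) before calling `frequency_map` gives the frequency map for that subset.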

Details on SOM training are readily available in the literature (e.g. Hewitson and Crane, 2002; Reusch and others, 2005b). (However, because the application of this technique is still relatively new to the Earth sciences, we recommend reviewing multiple sources for their perspectives on best practices.) Briefly, an iterative and unsupervised process is used to adjust the reference vectors representing each SOM node (or state) based on distances (differences) between best-matching reference vectors and each input record. Starting from either a random initialization (i.e. evenly distributed across the input data space) or an initialization based on the first two eigenvectors of the training data, the reference vectors will, when trained, each represent a distinct portion, or sub-space, of the multidimensional input space. (Unlike summation of PCA components, the sum of these reference vectors does not reconstruct the original input data.) Because neighbors of the best match are also updated during training (but to a lesser degree), adjacent nodes have the strongest similarity, with similarity decreasing with increasing distance away from any given node. For mathematical stability during the learning process, the SOM grid is normally asymmetric rather than being square (e.g. 5 × 4 or 5 × 3, not 5 × 5 or 4 × 4). Ideally, the grid dimensions match well with the shape of the 2-D projection of the input dataset’s probability density function (Kohonen, 2001), but this is not always known in advance and is not normally a significant issue.
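The training loop outlined above can be illustrated with a generic online SOM. This is a simplified sketch, not the SOM-PAK implementation: the Gaussian neighborhood, the linear decay of learning rate and radius, the initialization from randomly chosen input records, and all parameter names and defaults are assumptions chosen for brevity.

```python
import numpy as np

def train_som(data, n_cols, n_rows, n_iter=500, lr0=0.5, seed=0):
    """Train a rectangular SOM and return its reference vectors,
    shape (n_cols * n_rows, n_dims), in row-major node order."""
    rng = np.random.default_rng(seed)
    n_nodes = n_cols * n_rows

    # Grid position (x, y) of every node, used for neighborhood distances.
    gy, gx = np.divmod(np.arange(n_nodes), n_cols)
    grid = np.column_stack([gx, gy]).astype(float)

    # Assumed initialization: copy randomly chosen input records.
    ref = data[rng.choice(len(data), n_nodes, replace=True)].astype(float)

    radius0 = max(n_cols, n_rows) / 2.0
    for t in range(n_iter):
        frac = 1.0 - t / n_iter              # decays from 1 toward 0
        lr = lr0 * frac                      # learning rate
        radius = max(radius0 * frac, 0.5)    # neighborhood width

        x = data[rng.integers(len(data))]    # one random input record
        bmu = np.argmin(((ref - x) ** 2).sum(axis=1))

        # Gaussian neighborhood: the best match moves most, grid
        # neighbors move less, distant nodes barely move.
        grid_d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
        h = np.exp(-grid_d2 / (2.0 * radius ** 2))
        ref += lr * h[:, None] * (x - ref)
    return ref

# Synthetic input: 200 records with 36 values each.
rng = np.random.default_rng(42)
data = rng.standard_normal((200, 36))
ref_vectors = train_som(data, n_cols=6, n_rows=5)
```

Because every update moves a reference vector part of the way toward an input record, the trained vectors stay within the range spanned by the data.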

The freely available SOM-PAK software (Kohonen, 2001) has been used here as in previous work (e.g. Reusch and others, 2005b).

Overview of SOM Results

Generalized patterns of a 6 × 5 SOM of the full sea-ice edge record are shown in Figure 1 as anomalies from the climatological full-record (1973–96) mean. Variability in the record is captured by changes in the magnitude (overall extent) and spatial characteristics (longitude of maxima/minima) of the patterns. The lefthand column encompasses longitudinal variations among patterns with roughly comparable overall spatial extent (changes near the Antarctic Peninsula may be easiest to see). The second column shows similar characteristics for not-quite-maximum extent patterns. Variability in the minimal sea-ice edge position is focused more on the patterns near the upper right corner. Climatologically ‘average’ patterns are found primarily in column four.

Although not the focus of this paper, a generalized, SOM-based reconstruction of the sea-ice edge can readily be created from the best matching SOM pattern for each month in the observations. Statistical comparisons of this new (smoothed) time series with the observational time series indicate that the reconstruction captures all the main features of the original data.
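A reconstruction of this kind amounts to replacing each month's observation with its best-matching pattern and undoing the standardization. The sketch below assumes the best-match indices, reference vectors and per-longitude mean and standard deviation are already available; the toy values, the function names and the use of root-mean-square error as the comparison statistic are assumptions, since the text does not specify which statistics were used.

```python
import numpy as np

def reconstruct(bmu, ref_vectors, mean, std):
    """Replace each record by its best-matching SOM pattern, converted
    from standardized units back to the original units."""
    return ref_vectors[bmu] * std + mean

def rmse(a, b):
    """Root-mean-square difference between two equally shaped arrays."""
    return float(np.sqrt(np.mean((a - b) ** 2)))

# Toy setup: 3 patterns over 4 longitudes, 5 months of best matches.
ref_vectors = np.array([[-1.0, -1.0, -1.0, -1.0],
                        [0.0, 0.0, 0.0, 0.0],
                        [1.0, 1.0, 1.0, 1.0]])
mean = np.array([-65.0, -66.0, -64.0, -63.0])
std = np.array([3.0, 2.0, 4.0, 3.0])
bmu = np.array([0, 1, 2, 1, 0])

recon = reconstruct(bmu, ref_vectors, mean, std)

# Pretend observations: the reconstruction plus small noise.
rng = np.random.default_rng(3)
obs = recon + 0.01 * rng.standard_normal(recon.shape)
error = rmse(obs, recon)
```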

The Annual Cycle in Sea-Ice Edge

As an example of the utility of SOMs in studying climate data, we present in this section an analysis of the annual cycle in sea-ice edge based on the SOM patterns of Figure 1. A common approach to studying dataset variability using SOMs is shown in Figure 2: summarizing frequency of pattern occurrence for data subsets using a simplified SOM grid, i.e. making frequency maps. Subsets may be temporal (e.g. by month) or based on some other index relevant to the data (e.g. high/low values of the SOI). Here we study the annual cycle by creating frequency maps with data from representative months in each season (March, June, September, December). Because only the full-record mean was removed, rather than the monthly means, the SOM patterns still reflect the variability of the annual cycle. In the simplified SOM grids (Fig. 2, lefthand column), SOM nodes are indicated by small squares (instead of the actual patterns). Numbers within the squares indicate pattern frequency, or how often that SOM state was ‘occupied’ in a particular month, during the 24 years of data coverage. (For each monthly sea-ice edge position in the observations, the best matching SOM pattern is deemed to be occupied for that month.) The most common pattern is indicated by a shaded box in the lefthand column, with the corresponding generalized SOM pattern shown in the middle column.

Fig. 2. Selected frequency maps with representative SOM patterns and observations for each season. Lefthand column shows pattern counts (frequencies) for data from four months of the year on a simplified SOM grid (each box corresponds to a generalized pattern). The highest-frequency (most common) pattern has a gray background and an arrow to the actual pattern (middle column) for that grid position. The righthand column shows representative observational data for which the leading SOM pattern in each month is the best match.

The righthand column of Figure 2 shows a representative observation that matches best with each of the SOM patterns in the middle column. Because the SOM creates generalized patterns, the matches are not exact for any given observation (and are not expected to be). Instead, these examples show that the SOM patterns are quite reasonable (and in some sense optimal) representations of the original data.

The frequency maps in Figure 2 show that each month has one (sometimes two) dominant SOM state. For example, 15 of the 24 Decembers exhibit the pattern with sea ice more extensive than the full-record mean position in the Ross and Weddell Seas, and less extensive elsewhere. Furthermore, most of the remaining Decembers exhibit similar patterns (Fig. 1), with only one giving a notably different SOM state. The frequency maps also show the range of monthly variability in the record. For example, March has the fewest (four) patterns of the four example months, suggesting reduced variability at this time of minimal sea-ice extent. June and September tend to be more variable (i.e. more patterns are seen in the 24 year record), with September variability tending to be more regional, i.e. the longitude of maximum extent varies more between patterns than the overall spatial extent. This approach can then be extended to all months, to explore the annual cycle of sea-ice variability.
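Extending the frequency maps to all months reduces to tallying best-matching nodes by calendar month. The following sketch assumes a record that starts in January and a sequence of best-match indices, one per month; the sequence used here is synthetic and is constructed only so that each month has an obvious dominant state.

```python
import numpy as np

def monthly_frequency(bmu, n_nodes, start_month=0):
    """Count node occupancy per calendar month.

    bmu: best-matching node index for each consecutive month.
    start_month: calendar month of the first record (0 = January).
    Returns an array of shape (12, n_nodes).
    """
    months = (np.arange(len(bmu)) + start_month) % 12
    counts = np.zeros((12, n_nodes), dtype=int)
    np.add.at(counts, (months, bmu), 1)  # accumulates repeated index pairs
    return counts

# Synthetic 24-year record: in the first 15 years month m maps to node 2m,
# in the remaining 9 years it maps to the neighboring node 2m + 1.
years = np.repeat(np.arange(24), 12)
months = np.tile(np.arange(12), 24)
bmu = np.where(years < 15, 2 * months, 2 * months + 1)

counts = monthly_frequency(bmu, n_nodes=30)
dominant = counts.argmax(axis=1)       # most common node for each month
n_patterns = (counts > 0).sum(axis=1)  # number of distinct states per month
```

The number of distinct states occupied in a month (`n_patterns`) gives a simple measure of that month's variability.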

As mentioned above, relative distances between nodes in data space are actually variable, and it is more informative to view the annual cycle of monthly patterns in a so-called ‘SOM space’, i.e. in a 2-D, topology-preserving space defined by a Sammon mapping (Sammon, 1969) of the SOM’s reference vectors. Figure 3a shows the Sammon map of the 6 × 5 SOM. The dominant feature of this mapping is the greatly elongated left/right dimension, which reflects the large differences between minimal and maximal sea-ice edge patterns. Furthermore, the longest dimension of the grid is between the upper left and upper right corners, again reflecting the strong differences between September (upper left, Fig. 2) and March (upper right) sea-ice conditions. The Sammon map also emphasizes the similarities among patterns at the left and right sides of the grid as well as the particularly tight cluster of patterns at the upper right. The latter can be labeled as ‘March’ using the frequency maps of Figure 2, further showing the reduced variability in this month.

Fig. 3. The annual cycle of sea-ice edge. (a) The Sammon map of 6 × 5 SOM patterns showing relative locations of SOM patterns using a 2-D, topology-preserving projection of the reference vectors following Sammon (1969). (b) A generalized annual cycle as defined by the most common pattern(s) for each month, connected in temporal order and plotted on Sammon map coordinates. Dominant patterns for selected months (Fig. 2) are also shown.

Additionally, combining the monthly frequency maps (e.g. Fig. 2) with the Sammon map (Fig. 3a) brings together the temporal and SOM-space characteristics of the sea-ice edge dataset (Fig. 3b). In this version of the annual cycle, the differences between the warm (upper right) and cold (lower left) seasons stand out clearly as opposite, well-separated areas of the SOM grid. Because this format incorporates both time (month-to-month) and distance (in arbitrary units based on similarity), a velocity in SOM space can be applied to each monthly transition. For example, retreat transitions during spring (September–October, October–November) are slower than transitions in the peak retreat season (November–December, December–January). Expansion also begins relatively slowly (arrows moving from the upper right corner), accelerates through May and June as a larger spatial area becomes active, and finally slows as the active area decreases and northernmost areas begin to retreat.
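The notion of a velocity in SOM space can be made concrete as the distance between the dominant patterns of consecutive months, per one-month time step. In the sketch below the Sammon coordinates are invented (twelve points evenly spaced on a circle) purely to show the calculation; real coordinates would come from the Sammon map of the trained SOM, and the function name is hypothetical.

```python
import numpy as np

def som_space_speeds(coords):
    """Distance travelled in SOM space for each monthly transition.

    coords: (12, 2) array of Sammon-map coordinates of the dominant
    pattern for January..December. Element m of the result is the
    distance from month m to month m + 1, with December wrapping
    around to January. Units are the arbitrary Sammon-map units.
    """
    next_coords = np.roll(coords, -1, axis=0)
    return np.linalg.norm(next_coords - coords, axis=1)

# Invented coordinates: twelve points evenly spaced on a unit circle,
# so every monthly transition covers the same distance.
angles = 2.0 * np.pi * np.arange(12) / 12.0
coords = np.column_stack([np.cos(angles), np.sin(angles)])

speeds = som_space_speeds(coords)
```

With real data the speeds would differ from month to month, which is what allows fast and slow phases of expansion and retreat to be compared.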

A comparison of the month-to-month differences in the dominant monthly patterns helps clarify the spatial patterns of change of the sea-ice edge during a generalized annual cycle (Fig. 4). Expansion (March–August) begins in the Ross and Weddell Sea regions in early fall and occurs at all longitudes by early winter, with the largest changes in the peninsula region. Mid- to late winter sees only small changes. Sea-ice retreat (September–February) begins around the eastern Antarctic Peninsula (August) while expansion is still occurring in the Amundsen and Bellingshausen Seas (ABS) and around much of East Antarctica. Retreat through the rest of the spring and summer continues in the eastern ABS and Wilkes Land regions, followed by the western ABS, Ross Sea and the rest of East Antarctica until the minimum is reached everywhere in late summer.

Fig. 4. Cartoon of the annual cycle in sea-ice edge.

Concluding Remarks

Much progress in glaciology has come from improved ability to ‘see’ what is going on, using, for example, ice-penetrating radar or seismic techniques to collect data on the glacier bed, or satellite imagery to characterize the ice surface. Most techniques of data analysis extract features of the dataset without allowing the user to ‘see’ the full patterns. SOM analysis optimally simplifies a large and complex dataset into a few patterns that a user can see. The study of frequency of occurrence, preferred transitions, rates of change, hysteresis and other features is then made much easier.

In this exploration of area anomalies of Antarctic sea-ice coverage, we find that March patterns (and December patterns, with one outlier) are highly reproducible, but that greater variability is exhibited in June and September. Sea-ice growth and shrinkage exhibit strong hysteresis, with a general eastward propagation of anomalies. These results are not entirely surprising, but we hope that they illustrate the insights that can be gained from such analyses.

Acknowledgements

This work is supported by US National Science Foundation grant ATM 04-25592 to D.B. Reusch and R.B. Alley. We also acknowledge partial support from the G. Comer Foundation and the Center for Remote Sensing of Ice Sheets (NSF-0424589). Sea-ice edge data used within this paper were obtained from the Australian Antarctic Data Centre (IDN Node AMD/AU), a part of the Australian Antarctic Division (Commonwealth of Australia). The data are described in the metadata record ‘Antarctic CRC and Australian Antarctic Division Climate Dataset – Northern extent of Antarctic sea ice’ (T.H. Jacka, http://aadc-db.aad.gov.au/Keyword-Search/Home.do?Portal=amd_au&MetadataType=0). We are also grateful to the US National Center for Atmospheric Research (NCAR) Visualization and Enabling Technologies Section, Scientific Computing Division, for their tireless support of NCL (the NCAR Command Language).

References

Arctic Climate Impact Assessment (ACIA). 2004. Arctic climate impact assessment: scientific report. Cambridge, etc., Cambridge University Press.
Cavalieri, D.J., Parkinson, C.L. and Vinnikov, K.Y. 2003. 30 year satellite record reveals contrasting Arctic and Antarctic decadal sea ice variability. Geophys. Res. Lett., 30(18), 1970. (10.1029/2003GL018031.)
Hewitson, B.C. and Crane, R.G. 2002. Self-organizing maps: applications to synoptic climatology. Climate Res., 22(1), 13–26.
Houghton, J.T. and 7 others, eds. 2001. Climate change 2001: the scientific basis. Contribution of Working Group I to the Third Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge, etc., Cambridge University Press.
Kohonen, T. 2001. Self-organizing maps. Third edition. Berlin, etc., Springer-Verlag. (Springer Series in Information Sciences 30.)
Kwok, R. and Comiso, J.C. 2002. Southern Ocean climate and sea ice anomalies associated with the Southern Oscillation. J. Climate, 15(5), 487–501.
Liu, J., Curry, J.A. and Martinson, D.G. 2004. Interpretation of recent Antarctic sea ice variability. Geophys. Res. Lett., 31(2), L02205. (10.1029/2003GL018732.)
Reusch, D.B., Alley, R.B. and Hewitson, B.C. 2005a. Relative performance of self-organizing maps and principal component analysis in pattern extraction from synthetic climatological data. Polar Geogr., 29(3), 188–212.
Reusch, D.B., Hewitson, B.C. and Alley, R.B. 2005b. Towards ice-core-based synoptic reconstructions of West Antarctic climate with artificial neural networks. Int. J. Climatol., 25(5), 581–610.
Sammon, J.W. 1969. A non-linear mapping for data structure analysis. IEEE Trans. Computers, C-18(5), 401–409.
Shackleton, E.H. 1920. South: the story of Shackleton’s last expedition 1914–1917. New York, Macmillan.
Simmonds, I. and Jacka, T.H. 1995. Relationships between the interannual variability of Antarctic sea ice and the Southern Oscillation. J. Climate, 8(3), 637–647.
Thomas, D.N. and Dieckmann, G.S. 2002. Antarctic sea ice – a habitat for extremophiles. Science, 295(5555), 641–644.
Turner, J., Harangozo, S.A., Marshall, G.J., King, J.C. and Colwell, S.R. 2002. Anomalous atmospheric circulation over the Weddell Sea, Antarctica during the Austral summer of 2001/02 resulting in extreme sea ice conditions. Geophys. Res. Lett., 29(24), 2160. (10.1029/2002GL015565.)
Wolff, E.W., Rankin, A.M. and Röthlisberger, R. 2003. An ice core indicator of Antarctic sea ice production? Geophys. Res. Lett., 30(22), 2158. (10.1029/2003GL018454.)
Wu, B., Wang, J. and Walsh, J. 2004. Possible feedback of winter sea ice in the Greenland and Barents Seas on the local atmosphere. Mon. Weather Rev., 132(7), 1868–1876.