The magnitude and frequency of avalanches must be known, estimated or guessed at when land-use planning decisions are made concerning risk mapping or plans for facilities in avalanche-prone terrain. Magnitude is important to determine the destructive potential or vulnerability (e.g. McClung and Schaerer, 1981, 1993) and the frequency determines the exceedance probability of event occurrence at a location. Frequency is a very important component of risk: it can vary over about four orders of magnitude, from several per year to one every several hundred years. Because risk increases in direct proportion to the frequency, the risk can vary over about four orders of magnitude from frequency variations alone.
In order to specify properly the frequency component of risk in a probabilistic sense, we must know the frequency distribution of events. In this paper, we present data from an extensive data set for 43 avalanche paths at Rogers’ Pass, British Columbia, with 24 years of records for paths which show frequency in the range 3 – 21 events per year. From the data, we suggest that the Poisson distribution is the suitable probability mass function to describe avalanche frequency at Rogers’ Pass. Given a Poisson distribution to describe the frequency, it is then possible to estimate the encounter probability or frequency component of risk for a given avalanche path.
In addition to avalanche-frequency data, our study includes terrain variables and estimates of 30 year maximum water-equivalent precipitation (starting-zone estimate) for the paths in question. Given these supporting data, we perform a multivariate correlation analysis to determine which variables are statistically significant in predicting frequency (the response variable). The analysis shows that path roughness, 30 year maximum water equivalent, location (east or west of Rogers’ Pass summit), wind exposure and run-out zone elevation and inclination are all significant in a multivariate sense. Inclination and elevation of the starting zone also have important single-variable correlations. Further details of this work have been summarized in Smith (1995).
Risk for natural hazards may be thought of in terms of consequences, chance of occurrence and exposure in time and space. In this paper, we define risk at a given location in a probabilistic sense in terms of the product of three parameters: Ph is the probability of event occurrence (frequency), Pe is the exposure probability (the fraction of time or space an object is exposed to the hazardous events) and Pv is the vulnerability (the fraction of damage expected for an event of a given size). The parameter Pv depends on the fragility of the threatened object, whether a person, structure or vehicle, as well as the magnitude of the event. All three parameters (Ph, Pe, Pv) range between 0 and 1 and are treated as if they are probabilities.
We define the specific risk, Ps, as the risk at a given location for events of a given magnitude (and event frequency) as:
$$P_s = P_h P_e P_v \qquad (1)$$
From our definitions in Equation (1), Pv depends explicitly on event magnitude, whereas Ph is defined as the probability of occurrence of all events that reach or exceed a given location (of given magnitude). For avalanches, Pe may also depend on event magnitude at a location since larger events may have greater spatial extent in the run-out zone. Ph contains the event frequency and is defined as the exceedance probability, or 1/T, where T is the return period of events of given magnitude. The probability of occurrence is usually calculated from a probability density function or probability mass function derived from assumptions and data about the event frequency. The total risk, Pt, is defined as the sum (or integral) of the specific risk over all event magnitudes in question.
If it is assumed that Pe is approximately independent of magnitude, for example at the centre line of an avalanche path, then the total risk at a location is
$$P_t = P_e P_h \langle P_v \rangle \qquad (2)$$
where 〈Pv〉 is the expected value of the vulnerability. The parameter 〈Pv〉 is calculated as the weighted sum of the vulnerability, over all frequencies, calculated from the magnitude-frequency distribution of events at the location.
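The relation between specific risk, total risk and the expected vulnerability can be illustrated with a minimal Python sketch. The three magnitude classes, the per-class occurrence probabilities and the exposure probability below are hypothetical values chosen only for illustration; they are not taken from the Rogers’ Pass data, and treating Ph as a per-class probability is an assumption of the sketch.

```python
def specific_risk(p_h, p_e, p_v):
    """Specific risk Ps = Ph * Pe * Pv for one magnitude class."""
    for p in (p_h, p_e, p_v):
        if not 0.0 <= p <= 1.0:
            raise ValueError("all parameters must lie in [0, 1]")
    return p_h * p_e * p_v


def total_risk(classes, p_e):
    """Total risk: sum of specific risk over magnitude classes,
    assuming Pe is the same for every class."""
    return sum(specific_risk(p_h, p_e, p_v) for p_h, p_v in classes)


# Hypothetical (Ph, Pv) pairs for three magnitude classes at one location.
classes = [(0.10, 0.2), (0.03, 0.6), (0.01, 1.0)]
p_e = 0.05  # hypothetical exposure probability

p_t = total_risk(classes, p_e)

# Same quantity written with the frequency-weighted mean vulnerability.
p_h_all = sum(p_h for p_h, _ in classes)
mean_p_v = sum(p_h * p_v for p_h, p_v in classes) / p_h_all
print(p_t, p_e * p_h_all * mean_p_v)
```

Both printed values agree, which shows that summing the specific risk over classes is equivalent to multiplying the exposure, the overall occurrence probability and the frequency-weighted mean vulnerability.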
The encounter probability is the probability of encountering an event at least once at a location, given the residence time or spatial extent of exposure and the event-return period. It depends on estimates of Ph, and is defined from the exceedance probability and the exposure in time or space. It is possible to develop an analytical expression for the encounter probability if the frequency distribution of events is known. Below, we show that the frequency of events at Rogers’ Pass may be assumed to follow a Poisson distribution. As an example of the encounter probability, consider a residence time L at a location with an exceedance probability 1/T from a Poisson distribution. The Poisson distribution may be represented as P(n,λ) where n is the number of events in time period L and λ is the Poisson parameter (L/T). The encounter probability, E, is the sum of all terms in the Poisson distribution for n ≥ 1 and is therefore (LaChapelle, 1966):
$$E = \sum_{n=1}^{\infty} P(n,\lambda) = 1 - P(0,\lambda) = 1 - \exp(-L/T) \qquad (3)$$
The encounter probability may be regarded as the risk of encounter and it is equal to the total risk for the special case when Pv is one for all events which reach or exceed the location.
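For a Poisson process with parameter λ = L/T, the probability of at least one event is 1 − exp(−L/T). The following sketch evaluates this closed form and cross-checks it by summing the Poisson terms for n ≥ 1 directly. The residence time and return period are hypothetical values used only for illustration.

```python
import math


def poisson_pmf(n, lam):
    """Probability of exactly n events for Poisson parameter lam."""
    return math.exp(-lam) * lam ** n / math.factorial(n)


def encounter_probability(residence_time, return_period):
    """E = 1 - exp(-L/T): probability of at least one event in time L."""
    if residence_time < 0 or return_period <= 0:
        raise ValueError("need L >= 0 and T > 0")
    return 1.0 - math.exp(-residence_time / return_period)


# Hypothetical case: 50 year residence time, 100 year return period.
L_years, T_years = 50.0, 100.0
E = encounter_probability(L_years, T_years)

# Cross-check: sum the Poisson terms for n >= 1 (truncated at n = 60,
# beyond which the terms are negligible for this value of lambda).
E_sum = sum(poisson_pmf(n, L_years / T_years) for n in range(1, 61))
print(round(E, 4), round(E_sum, 4))
```

For a high-frequency path (T much shorter than L), the exponential term vanishes and E approaches 1, which is why exposure and vulnerability dominate the risk on such paths.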
Description of Rogers’ Pass and Data
The data set used for this analysis comes from Rogers’ Pass, British Columbia, Canada, within Glacier National Park. The Canadian Parks Service maintains avalanche control on most of the 134 avalanche paths which affect highway and rail traffic over a linear highway distance of 45 km. Avalanches have been a problem in this region since the railroad, built to traverse the Selkirk Ranges, was completed in 1885; two tunnels have since been built under Rogers’ Pass to bypass the most severely affected areas and to ease the grade of the railway. After completion of the Trans-Canada Highway through Rogers’ Pass in 1962, further avalanche-control measures were added (Schleiss, 1989), including avalanche defences and artillery.
At Rogers’ Pass, heavy snowfall and steep terrain are combined, making it an area of high avalanche frequency. The valley on the eastern side of the pass is relatively U-shaped with up to 1050 m of relief; the western side is more V-shaped with relief up to 1800 m.
The Selkirk Ranges occupy the interior wet belt of British Columbia, receiving the second greatest recorded level of precipitation in Canada. Warm, southwesterly storm tracks produce a deep snow-pack with potential for large destructive avalanches. There are, however, periods of cold, stable air which can dominate Rogers’ Pass for long periods. Schleiss (1989) categorized three climatological sub-zones: the west side, the summit region and the east side. The west side is distinguished by relatively heavier snowfall and milder temperatures, while the east side has lighter snowfall and colder temperatures. The summit area is deemed to be transitional between the east and west. For our analysis, we have simplified the climatic zones to two: east and west of the summit.
Although the railroad dates back to 1885, accurate records of avalanche activity did not begin until 1909. Analysis of avalanche activity was not undertaken until 1953 when consideration was given to Rogers’ Pass for the location of the Trans-Canada Highway. During the winter of 1966, the National Research Council (NRC) began records of avalanche occurrence. Some records date back to 1959, although these are the exception. Some paths ceased to be monitored after 1984, with most ending in 1989. The aim in monitoring avalanches was to record all “major” occurrences, particularly those that ran out into the valley. In the late 1970s the five-part Canadian size-classification scheme (McClung and Schaerer, 1981; see Table 1), based on destructive potential, was incorporated into the observations. In general, only avalanches greater than size 2 were recorded. The data set can be considered accurate and it provides the best description of avalanche activity known to us.
For analysis, we split the Rogers’ Pass data into two sections. The first set contains 14 paths with continuous high-quality records from 1967 to 1989. The second set contains 29 paths but in this case there are some missing data. Years with missing data were excluded before the analysis was performed. In general, for the period 1966 – 89 less than 20% of data were missing for all 43 paths.
The data from Rogers’ Pass are characterized by very high frequency of avalanching. The only other study of avalanche frequency that we know of is by Föhn (1975). He demonstrated that the frequency of avalanche events on a single unforested avalanche path followed a Poisson distribution. Föhn had access to historical data from 1550 to 1970. In comparison to Rogers’ Pass avalanche paths, Föhn’s study path may be classed as low frequency (return period 20 – 30 years); the data were averaged over 30 year periods, giving a total of 14 data points. A χ² test at 0.01 significance showed that a Poisson probability mass function fitted the data. Föhn’s was the first study to determine the type of distribution associated with event frequency on a single low-frequency path. In contrast, our study is based on event frequency for 43 high-frequency avalanche paths.
Frequency Analysis — Goodness of Fit
In order to determine how well the frequency data relate to a probability mass function or probability density function, it is appropriate to apply statistical tests of goodness-of-fit. For our study, we used the χ² goodness-of-fit test, by defining groups or bins of equal intervals with a minimum expected frequency of 2 (Roscoe and Bryars, 1971; see also D’Agostino and Stephens (1987) for a further discussion) for the 0.05 significance level. The Appendix provides further discussion on the choice of χ² testing procedure.
The frequency of avalanches may be thought of as a series of discrete, rare, independent events. These conditions match those for a Poisson experiment and so make the Poisson distribution a likely candidate to describe avalanche frequency. The condition of independence may be violated sometimes, because the probability of avalanching may be related to the occurrence of earlier avalanches. An example of the violation of the independence assumption is the recharge rate of snow in the starting zone following major avalanches. Generally, however, we feel that in a high-frequency area like Rogers’ Pass avalanches occur shortly after the deposition of new snow from a storm and the independence assumption is reasonable.
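To make the Poisson assumption concrete, the sketch below estimates the Poisson parameter from a record of annual avalanche counts and computes the expected number of winters with each count. The annual counts are hypothetical and are not from the Rogers’ Pass records; the function names are illustrative.

```python
import math


def poisson_pmf(n, mu):
    """Probability of exactly n events in a year for mean frequency mu."""
    return math.exp(-mu) * mu ** n / math.factorial(n)


def expected_year_counts(mu, n_years, max_events):
    """Expected number of winters with 0..max_events avalanches."""
    return [n_years * poisson_pmf(n, mu) for n in range(max_events + 1)]


# Hypothetical annual avalanche counts for one path (not real data).
counts = [8, 12, 10, 7, 11, 9, 13, 10, 6, 12, 10, 9]

mu_hat = sum(counts) / len(counts)  # the sample mean estimates mu
expected = expected_year_counts(mu_hat, len(counts), max_events=30)

for n, e in enumerate(expected):
    if e >= 0.5:  # print only counts expected in at least half a winter
        print(n, round(e, 2))
```

The expected values can then be compared with the observed histogram of annual counts, which is the comparison the goodness-of-fit test formalizes.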
It should be noted that as avalanche frequency becomes larger the discrete nature of the data becomes less important and the data can be assumed to be continuous. Given that a normal distribution can approximate the Poisson distribution when the Poisson parameter (μ) is greater than 9 (e.g. frequency greater than nine events per year), testing for goodness-of-fit using a normal distribution is appropriate. Use of the normal distribution implies that both upper and lower values, three standard deviations from the mean, are positive (i.e. it is not possible to have negative event frequencies).
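One way to read the threshold of nine events per year is as the condition that the lower three-standard-deviation bound of the approximating normal distribution stays positive. Since the variance of a Poisson distribution equals its mean, the standard deviation is the square root of μ, and the condition reduces to:

```latex
\mu - 3\sigma > 0, \qquad \sigma = \sqrt{\mu}
\;\Longrightarrow\; \sqrt{\mu} > 3
\;\Longrightarrow\; \mu > 9 .
```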
Figure 1 shows a Poisson distribution fitted to one path from Rogers’ Pass. This path shows a satisfactory fit. Figure 2 is an example of a path where the Poisson fit failed the χ² test at the 0.05 significance level. These examples are discussed in the overall results below.
Fig. 1. Avalanche frequency for Cougar Corner 2 (light) fitted to a Poisson distribution (black). This path has a frequency of 10.1 events per year. Results of the χ² test were: degrees of freedom 7, χ² statistic 2.55 and significance level 0.923.
Fig. 2. Avalanche frequency for McDonald Gully 3 fitted to a Poisson and normal distribution. This path has a frequency of 14.3 events per year. Results of X
Results of the χ² tests, for both the complete and incomplete data sets, show that both the Poisson and normal distributions provide a satisfactory fit for most of the 43 avalanche paths. However, in several cases both the Poisson and normal fail the test. Table 2 provides a summary of χ² test results. From Table 2, it is notable that the normal distribution provides a better fit than the Poisson distribution at μ > 14, although at lower values of μ both perform equally well.
Table 2. Frequency table of χ² test passes and fails (at 0.05 significance) using the equal-interval method, by avalanche-path frequency, for normal and Poisson distributions
This picture is further complicated by the use of artillery, which can artificially increase the frequency of avalanche events. Of the 43 paths, seven have received artillery control since the highway was completed, with a further five having received control more recently. Of the former seven, five have mean frequencies greater than nine avalanches per year, with four of these greater than 14. Since only ten paths have means greater than 14 avalanches per year, this is an important effect upon overall path frequency. Not all the paths follow this general trend (e.g. Fig. 2 is an example of a poor fit to a Poisson distribution). The plot in Figure 2 shows a lack of data points in the central area of the distribution, with several notable outliers in the tails of the distribution. Although the average frequency for McDonald Gully 3 (Fig. 2) is 14.3 events per year, the variability, with some years having no avalanches and some with 28 avalanches, causes a fitted Poisson curve to fail the test. The Poisson curve underestimates at the peak and in the tails. The normal, although underestimating around the distribution peak, provides a better estimate in the tails. These differences partially cancel out in the χ² test itself, enabling the normal curve to fit the data better than the Poisson distribution.
The example in Figure 2 demonstrates the problems encountered with small data sample sizes and the influence outliers can have. In our opinion, the Poisson distribution gives a satisfactory fit to the avalanche data and is the most appropriate distribution, since the frequency of avalanches may be thought of as a series of discrete, rare, independent events matching the conditions for a Poisson experiment. For high values of frequency (μ > 14), practitioners may wish to use the normal, as it is able to model the variability in the paths we have analysed. This variability may be partly attributed to the use of artillery, in combination with the effects of outliers. A greater sample size would allow a more definite answer to this question.
If the Poisson distribution is chosen as the most appropriate distribution, it is possible to use μ (mean frequency for each data set) for each path to characterize the path frequency. Variations in μ from path to path may be attributed to differences in terrain parameters and climate. However, the importance of each of these differences in determining frequency is unknown. Below, we present an analysis of terrain features for the paths from Rogers’ Pass to define which parameters significantly influence avalanche frequency.
Relation Between Frequency and Terrain Parameters
Previous work (Schaerer, 1977) carried out on data from Rogers’ Pass has provided a selection of potentially important terrain parameters that have been used in this work. Schaerer (1977) analyzed avalanche frequency (using 9 years of data) on 36 of the 43 avalanche paths analyzed here, collecting data for 16 terrain and climate variables. He then attempted to correlate the avalanche frequencies with the terrain parameters.
A limiting factor in Schaerer’s (1977) work was the small data record: only 9 years of avalanche events. Given the small size of the data set used to calculate the mean frequencies, the effect of outliers on the results could well be significant. In the present study, we were able to use 24 years of records on the 43 paths originally contained in the National Research Council of Canada data set. These produced a mean frequency of 10 avalanches per year with a standard deviation of 4.5 per year. The minimum was 3 avalanches per year and the maximum 21 avalanches per year.
Schaerer (1977) noted that there is a variation in climate depending on location at Rogers’ Pass. Snowfall on the east side of the summit is about 80% of that at an equal elevation on the west side. However, he suggested that paths on the east side have higher starting zones and so may receive snowfall comparable to those on the west side. In our work, we have introduced a location parameter and several elevation parameters to study these climatic effects. Figure 3 shows the location of paths east-west of Rogers’ Pass summit with respect to avalanche frequency. Table 3 shows the terrain parameters and descriptive statistics used in the following regression analysis. Several variables require further explanation:
(1) 30 year maximum water equivalent; this variable is based on snow-depth measurements taken at several stations near Rogers’ Pass. Figure 4 shows the 30 year maximum water equivalent plotted against avalanche frequency for all 43 paths. Over a period of 15 – 20 years, maximum snow-depth and density measurements were taken once a year at six stations increasing in elevation on both the east and west sides of the summit. These data were then converted to water equivalent. The snow-depth measurements were graphed against elevation, giving a clear relationship. The 30 year maximum water equivalent was calculated from the cube-root normal distribution (recommended by the Climate Section, Atmospheric Environment Service, Canada, to stabilize the variance) for the centre of each of the avalanche-path catchments to give an estimate appropriate for a given catchment elevation.
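The cube-root normal calculation can be sketched as follows. The sketch assumes that the "30 year maximum" is the value with an annual non-exceedance probability of 1 − 1/30 and that the cube roots of the annual maxima are normally distributed; both are stated here as assumptions, since the text does not give the exact procedure. The annual maxima are hypothetical values.

```python
from statistics import NormalDist, mean, stdev


def return_level_cube_root_normal(annual_maxima, return_period):
    """T-year value, assuming cube roots of annual maxima are normal."""
    if return_period <= 1:
        raise ValueError("return period must exceed 1 year")
    roots = [x ** (1.0 / 3.0) for x in annual_maxima]
    m, s = mean(roots), stdev(roots)
    # Non-exceedance probability of the T-year event is 1 - 1/T.
    z = NormalDist().inv_cdf(1.0 - 1.0 / return_period)
    # Transform the quantile back from cube-root space.
    return (m + z * s) ** 3


# Hypothetical annual maximum water equivalents (mm) at one elevation.
swe_max = [820.0, 1010.0, 760.0, 1150.0, 930.0, 880.0, 1240.0, 990.0,
           700.0, 1080.0, 950.0, 860.0, 1120.0, 1030.0, 790.0]

w30 = return_level_cube_root_normal(swe_max, 30)
print(round(w30))
```

Working in cube-root space reduces the skew of the annual maxima before the normal quantile is applied, which is the variance-stabilizing role described above.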
(2) Wind exposure; a qualitative index of the magnitude of snowdrifting that can be expected in the avalanche-starting zone. Schaerer (1977) defined the following categories:
1. Starting zone completely sheltered from wind by surrounding dense forest.
2. Starting zone sheltered by an open forest or facing the direction of the prevailing wind.
3. Starting zone an open slope with rolls and other irregularities where local drifts can form.
4. Starting zone on the lee side of a sharp ridge.
5. Starting zone on the lee side of a wide, rounded ridge or open area where large amounts of snow can be moved by wind.
In some cases, intermediate groups may be observed. Figure 5 shows wind exposure plotted against avalanche frequency.
(3) Roughness; expressed as the approximate water equivalent of snow required to cover rocks, shrubs and ledges in starting zones before avalanches will run. Figure 6 shows roughness plotted against avalanche frequency.
(4) Aspect; measured in 16 ordinal units. Figure 7 shows a plot of aspect with avalanche frequency; due to the clustering of northerly and southerly aspects, the data were ranked as either 1 (north) or 2 (south).
(5) Location; straight-line distances in kilometres, east-west from the Rogers’ Pass summit, to where each path crosses the highway (see Fig. 3).
Fig. 3. Location of avalanche paths at Rogers’ Pass with respect to avalanche frequency. Location is calculated as distance (km) east-west of the Rogers’ Pass summit (designated as 0), where the centre of the path crosses the Trans-Canada Highway. Negative location is west of Rogers’ Pass and positive is east.
Table 3. Predictor variables and descriptive statistics of data used in the regression analysis. See text for definition of categorical variables denoted by (*)
Fig. 4. 30 year maximum water equivalent plotted against avalanche frequency. Maximum snow-depth measurements at six different elevations on the east and west sides of Rogers’ Pass are highly correlated with elevation. This relationship is used to calculate the maximum water equivalent for the centre of the catchment for each path and then, using a cube-root normal distribution, the 30 year maximum is calculated.
The descriptive statistics for the avalanche paths in Rogers’ Pass may be compared to the low-frequency avalanche path data of McClung and Mears (1991). The avalanche paths are very steep, with an arc tangent (related to mean path slope) of 33.8° (this can be directly compared to the α angle used by McClung and Mears (1991)). There is also a high mean vertical drop (950 m). Variation in run-out zone elevation, in comparison to starting-zone elevation, has a low range and standard deviation, a result of the relatively small increase in elevation of the highway as it traverses the pass. As previously noted, starting-zone elevations are different on the east and west sides of the pass, accounting for variations in this variable. The aspect of avalanche paths at Rogers’ Pass is predominantly northerly or southerly, a result of the east-west alignment of the pass.
Fig. 5. Wind exposure plotted against avalanche frequency, where the wind exposure is a qualitative index of the magnitude of snowdrifting that can be expected in the avalanche-starting zone.
Fig. 6. Roughness as a function of avalanche frequency, where roughness is the water equivalent of snow required to cover rocks, shrubs and ledges before avalanches will run. A negative correlation is displayed here: as the roughness in the starting zone increases, so the frequency decreases.
Fig. 7. Aspect of avalanche paths, using 16 ordinal units. Note the high clustering of values around south and north. For the regression analysis, each path was assigned a categorical value of either northerly or southerly.
Initially, a Pearson product-moment correlation matrix was compiled for the 14 variables against avalanche frequency (F). Given the number of avalanche paths in the study, the results indicated that the following variables were significantly correlated with avalanche frequency:
Other variables which showed high correlations with avalanche frequency, but were not significant, were starting-zone slope and elevation. A multiple stepwise regression was then performed on the above eight variables, giving a model of the form:
with an r² of 0.57 and a standard error (SE) of 2.93.
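The single-variable screening step that precedes a stepwise regression can be sketched as below: compute the Pearson product-moment correlation between one predictor and avalanche frequency, then form the t statistic used to judge significance. The roughness and frequency values are hypothetical, and the function names are illustrative; this is not the program used in the study.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two samples."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two samples of equal length, n >= 3")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical roughness (m water equivalent) and frequency (events/year).
roughness = [0.2, 0.3, 0.4, 0.5, 0.6, 0.8, 1.0, 1.2]
frequency = [18.0, 15.0, 14.0, 11.0, 12.0, 8.0, 6.0, 4.0]

r = pearson_r(roughness, frequency)
t = t_statistic(r, len(roughness))
print(round(r, 3), round(t, 2))
```

The t statistic is compared with the tabulated critical value for n − 2 degrees of freedom at the chosen significance level; variables that pass are candidates for the stepwise regression.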
In order to remove human-induced effects on avalanche frequency, only avalanche paths that had not received explosive control (25 paths) were included. A Pearson product-moment correlation matrix, compiled for the 14 variables against avalanche frequency for the 25 avalanche paths, indicated that the following variables were significantly correlated with avalanche frequency:
A stepwise regression on the remaining 25 avalanche paths gave a best regression of the form:
with an r² of 0.84 and an SE of 1.68.
As maximum water equivalent is not readily measurable, a stepwise regression performed excluding this variable gave a model of the form:
with an r² of 0.76 and an SE of 2.02.
A plot of the predicted against the observed data showed one significant outlier. Excluding this outlier, a model of the following form was obtained:
with an r² of 0.85 and an SE of 1.59.
Both the full data set and the data set censored with respect to explosive control were partitioned with respect to avalanche-path location east or west of the Rogers’ Pass summit. Generally, variables significantly correlated with avalanche frequency were roughness, maximum water equivalent and wind exposure for both the east and west sides. However, the censored data sets had strong correlations with maximum water equivalent (r² = 0.82). Interestingly, the data sets for the west side had a correlation with starting-zone slope.
Inspection of the regression equations indicates the importance of climate to avalanche frequency. The supply of snow is entered into this study as wind exposure and maximum water equivalent. Both run-out zone elevation and location are related to snow supply and can be considered as normalizing variables. Run-out zone slope, aspect and roughness are the only terrain variables affecting avalanche frequency but, even for these, aspect and roughness are related to snow supply: avalanches are not highly likely until roughness features are covered and aspect may be related to lee and windward faces with respect to prevailing storm directions.
The effect of explosive control on the data is significant. Once avalanche paths that had received explosive control were excluded from the data set, the final regression equation was able to model accurately the mean number of avalanches per year. Significantly correlated variables, in addition to those in the complete data set, were path slope, vertical drop, path length and start-zone elevation. This perhaps suggests that the frequency of naturally occurring avalanches is more strongly related to terrain than for paths where explosive control is used. For these latter paths, climate, particularly snow supply, is the major influence on avalanche frequency. Excluding maximum water equivalent from the regression equation produced a good model fit to the data. This was further improved by removing one significant outlier. This result will possibly allow the specification of terrain and climate variables in order to estimate avalanche frequency within the region. This information could then be used for land management and risk-mapping applications. This model requires the inclusion of maximum water equivalent; however, as noted earlier, this is not easily measurable and so has been excluded from the final model for practical purposes.
In this study we have analyzed avalanche frequency and attempted to account for its variations with respect to terrain and climate. The χ² test performed satisfactorily; we recommend the use of the Poisson distribution for the calculation of the encounter probability for high-frequency avalanche paths. When μ > 14, a normal distribution may be preferred by some, since it has two parameters and therefore more flexibility to fit the data, but the potential for a better fit may be outweighed by the physical condition that avalanches are discrete events.
This choice of distribution has important implications for future application. In the risk mapping of high-frequency avalanche paths, it is now possible to use the Poisson distribution for calculation of the encounter probability or exceedance probability as input to risk or land-management studies.
For the second half of our study, we concentrated on an analysis of terrain and climate with respect to avalanche frequency. For Schaerer’s (1977) study, based on a much smaller data set, he found roughness, wind exposure, fracture-point incline (not used in this study) and incline of track to be significantly correlated with frequency, suggesting that climate variables (roughness and wind exposure), along with terrain parameters (fracture-point incline and track incline), are the most important variables affecting frequency.
This study has presented roughness, maximum water equivalent and location in the final regression model for the full data set. Censoring the data, with respect to explosive control, has presented path length, start-zone elevation, roughness and maximum water equivalent in the final regression model. Maximum water equivalent and roughness are the most important variables, as their correlation coefficients are the highest. As both of these variables are linked to snowfall supply, avalanche frequency appears to be strongly related to climate. Run-out zone elevation, start-zone elevation and location act as normalizing variables, with path slope, run-out zone slope, path length and vertical drop all important terrain parameters. These latter terrain variables are all statistically significant; however, their single-variable correlation coefficients are notably lower than those of the normalizing or climate variables used in this study and they appear to have a secondary effect on path frequency. We recommend the use of the final regression model to predict avalanche frequency regionally. This can then be used to specify the Poisson distribution in the calculation of the encounter probability for high-frequency paths.
We should like to thank P. Schaerer for data used in this analysis, as well as for lending his personal knowledge of Rogers’ Pass. Many thanks to J. Henkelman (Department of Computer Science, University of British Columbia) for writing the χ² testing programme. We are also grateful to the avalanche-control personnel at Glacier National Park for data used and for sharing their personal knowledge of the area. This work was supported by the Natural Science and Engineering Research Council of Canada.
* Present address: Department of Geography, University of Sheffield, Winter Street, Sheffield S10 2TN, U.K.
Cochran, W. G. 1954. Some methods of strengthening the common chi-square tests. Biometrics, 10, 417–451.
D’Agostino, R. and Stephens, M. 1987. Goodness of fit techniques. New York, Dekker.
Föhn, P. M. B. 1975. Statistische Aspekte bei Lawinenereignissen. In Proceedings, Internationales Symposium “Interpraevent 1975”, Innsbruck. Vol. 2, 293–304.
Koehler, K. J. and Larntz, K. 1980. An empirical investigation of goodness-of-fit statistics for sparse multinomials. J. Am. Stat. Assoc., 75, 336–344.
LaChapelle, E. R. 1966. Encounter probabilities for avalanche damage. Alta, UT, U.S.D.A. Forest Service. Wasatch National Forest. Alta Avalanche Study Center. (Miscellaneous Report 10.)
McClung, D. M. and Mears, A. I. 1991. Extreme value prediction of snow avalanche runout. Cold Reg. Sci. Technol., 19(2), 163–175.
McClung, D. M. and Schaerer, P. A. 1981. Snow avalanche size classification. In Canadian Avalanche Committee, ed. Proceedings of Avalanche Workshop, 3–5 November 1980, Vancouver. Ottawa, Ont., National Research Council of Canada. Associate Committee on Geotechnical Research, 12–27. (ACGR Technical Memorandum 133.)
McClung, D. M. and Schaerer, P. A. 1993. The avalanche handbook. Seattle, WA, The Mountaineers.
Roscoe, J. T. and Bryars, J. A. 1971. An investigation of the restraints with respect to sample size commonly imposed on the use of the chi-square statistic. J. Am. Stat. Assoc., 66, 755–759.
Schaerer, P. A. 1977. Analysis of snow avalanche terrain. Can. Geotech. J., 14(3), 281–287.
Schleiss, V. G. 1989. Rogers Pass snow avalanche atlas: Glacier National Park, British Columbia, Canada. Revelstoke, B.C., Environment Canada. Canadian Parks Service.
Smith, M. J. 1995. Frequency and terrain factors for high frequency snow-avalanche paths. (M.Sc. thesis, University of British Columbia.)
Appendix: χ² Testing Procedure
In order to determine how well a particular probability mass function or probability density function represents observational data, it is appropriate to apply statistical tests of goodness-of-fit. For our study we used the χ² goodness-of-fit test. Due to the size of the compiled data set (24 years of data for 43 separate paths), a computer program was written to perform both equal-probability and equal-interval methods of χ² testing.
For the χ² test, the data are split up into groups or bins. The test can be performed by either choosing bins that are the same size (equal interval) or choosing bins that have an equal probability of observing a value within the range of the bin. Tests using bins of equal probability are considered to be more robust; however, when dealing with integer data (or using an integer distribution such as the Poisson) or small sample data sets, bins of equal interval are more appropriate (D’Agostino and Stephens, 1987). For each bin, it is necessary to define a minimum expected frequency. Standard procedures recommend a minimum expected frequency of 5 as necessary to perform a robust χ² test. This is not always possible when using data with small sample sizes. Furthermore, D’Agostino and Stephens (1987) considered a minimum expected frequency of 5 too conservative. Three main alternatives have been suggested:
(1) All expected frequencies at least 1 and 80% at least 5 (Cochran, 1954).
(2) Mean value of the expected frequencies at least 1 for the equal-probability method and at least 2 for the equal-interval method (for the 5% test; Roscoe and Bryars, 1971).
(3) At least three bins (M), at least ten observations (n) and n²/M at least 10 (Koehler and Larntz, 1980).
D’Agostino and Stephens (1987) suggested that the method of Roscoe and Bryars performs robustly and consequently this method has been adopted in this study.
χ² tests were performed employing the equal-interval method, using minimum expected frequencies of 5 and 2, and the equal-probability method, using minimum expected frequencies of 5, 3 and 1. Results demonstrated that the tests were not robust when using the equal-probability method, due to the small sample size. For the equal-interval method, at a minimum expected frequency of 5, results indicated that most paths failed the goodness-of-fit test. All final results employ the equal-interval method with a minimum expected frequency of 2.
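The equal-interval procedure can be illustrated with a minimal sketch that bins annual counts, computes Poisson expected frequencies, pools adjacent bins until each reaches the minimum expected frequency and returns the χ² statistic. The bin width of 2, the pooling rule and the annual counts are assumptions made for illustration; the original testing programme is not reproduced here.

```python
import math


def poisson_pmf(n, mu):
    """Probability of exactly n events for mean frequency mu."""
    return math.exp(-mu) * mu ** n / math.factorial(n)


def chi_square_poisson(counts, width=2, min_expected=2.0):
    """Chi-square statistic for a Poisson fit using equal-interval bins.

    Bins are [0, width), [width, 2*width), ...; the last bin is open-ended.
    Adjacent bins are pooled until each expected frequency reaches
    min_expected (the pooling rule is an assumption of this sketch).
    """
    n_years = len(counts)
    mu = sum(counts) / n_years  # Poisson parameter estimated from the data
    n_bins = max(counts) // width + 1
    observed = [0] * n_bins
    for c in counts:
        observed[c // width] += 1
    expected = []
    for b in range(n_bins):
        p = sum(poisson_pmf(k, mu) for k in range(b * width, (b + 1) * width))
        expected.append(n_years * p)
    # Give the open-ended last bin the remaining upper-tail probability.
    expected[-1] += n_years - sum(expected)

    pooled = []  # each entry is [observed, expected]
    for o, e in zip(observed, expected):
        if pooled and pooled[-1][1] < min_expected:
            pooled[-1][0] += o
            pooled[-1][1] += e
        else:
            pooled.append([o, e])
    # A final bin that is still too small is folded into its neighbour.
    if len(pooled) > 1 and pooled[-1][1] < min_expected:
        o, e = pooled.pop()
        pooled[-1][0] += o
        pooled[-1][1] += e

    chi2 = sum((o - e) ** 2 / e for o, e in pooled)
    dof = len(pooled) - 2  # bins - 1, minus one estimated parameter (mu)
    return chi2, dof, pooled


# Hypothetical 24 year record of annual avalanche counts for one path.
counts = [10, 8, 12, 9, 11, 7, 13, 10, 9, 14, 6, 10,
          11, 8, 12, 10, 9, 15, 7, 11, 10, 13, 8, 9]
chi2, dof, pooled = chi_square_poisson(counts)
print(round(chi2, 2), dof)
```

The statistic is then compared with the tabulated χ² critical value for the returned degrees of freedom at the 0.05 level; a value below the critical value means the Poisson fit is not rejected.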