Evaluating Density Forecasts of Inflation: The Survey of Professional Forecasters

Francis X. Diebold*, Anthony S. Tay# and Kenneth F. Wallis†

*Department of Economics, University of Pennsylvania, 3718 Locust Walk, Philadelphia, PA 19104, USA, and NBER

#Department of Economics and Statistics, National University of Singapore, 10 Kent Ridge Crescent, Singapore 119260

†Department of Economics, University of Warwick, Coventry CV4 7AL, England
This Print: September 11, 1998
Copyright © 1997 F.X. Diebold, A.S. Tay and K.F. Wallis. This paper is available on the World Wide Web at http://www.ssc.upenn.edu/~diebold/ and may be freely reproduced for educational and research purposes, so long as it is not altered, this copyright notice is reproduced with it, and it is not sold for profit.
Abstract: Since 1968, the Survey of Professional Forecasters has asked respondents to provide a complete probability distribution of expected future inflation. We evaluate the adequacy of those density forecasts using the framework of Diebold, Gunther and Tay (1998). The analysis reveals several interesting features of the density forecasts in relation to realized inflation, including several deficiencies of the forecasts. The probability of a large negative inflation shock is generally overestimated, and in more recent years the probability of a large shock of either sign is overestimated. Inflation surprises are serially correlated, although agents eventually adapt. Expectations of low inflation are associated with reduced uncertainty. The results suggest several promising directions for future research.
Acknowledgments: Dean Croushore and Tom Stark provided valuable assistance and advice, and an anonymous referee provided helpful comments. We thank the National Science Foundation, the Sloan Foundation, the University of Pennsylvania Research Foundation, the National University of Singapore and the Economic and Social Research Council for support. Our collaboration was initiated at the 1997 UC San Diego Conference on Financial Econometrics organized by Rob Engle, Clive Granger and Gloria Gonzalez-Rivera.
1. Introduction
Economic decision makers routinely rely on forecasts to assist their decisions. Until recently, most forecasts were provided only in the form of point forecasts, although forecasters sometimes attached measures of uncertainty, such as standard errors or mean absolute errors, to their forecasts. Recently, the trend has been to accompany point forecasts with a more complete description of the uncertainty of the forecasts, such as explicit interval or density forecasts. An interval forecast indicates the likely range of outcomes by specifying the probability that the actual outcome will fall within a stated interval. The probability may be fixed, at say 0.95, and the associated interval may then vary over time, or the interval may be fixed, as a closed or open interval, and the forecast probability presented, as in the statement that “our estimate of the probability that inflation next year will be below 2.5 per cent is p.” A density forecast is stated explicitly as a density or probability distribution. This may be presented analytically, as in “we estimate that next year’s inflation rate is normally distributed around an expected value of two per cent with a standard deviation of one per cent,” or it may be presented numerically, as when a histogram is reported.
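The analytical example above maps directly into a computation: given the stated normal density forecast, the probability that inflation falls below any threshold follows from the normal c.d.f. A minimal sketch (the numbers come from the example in the text; the function name is our own):

```python
from math import erf, sqrt

def normal_cdf(x, mean, sd):
    """C.d.f. of a normal distribution, via the error function."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))

# Density forecast from the text: inflation ~ N(mean = 2, sd = 1), in per cent.
# Probability that inflation next year is below 2.5 per cent:
p_below = normal_cdf(2.5, mean=2.0, sd=1.0)
print(round(p_below, 4))  # -> 0.6915
```

The same density also delivers interval forecasts: for instance, a central 95% interval is mean ± 1.96 sd, here roughly 0.04 to 3.96 per cent.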
Density forecasts were rarely seen until recently but are becoming more common. In finance, practical implementation of recent theoretical developments has dramatically increased the demand for density forecasts; the booming field of financial risk management, for example, is effectively dedicated to providing density forecasts of changes in portfolio value, as revealed by a broad reading of literature such as J.P. Morgan (1996). There is also a growing literature on extracting density forecasts from options prices, which includes Aït-Sahalia and Lo (1998) and Söderlind and Svensson (1997). In macroeconomics, there has
also been increased discussion of density forecasts recently, in response to criticism of the lack of transparency of traditional forecasting practice, and to demands for acknowledgment of forecast uncertainty in order to better inform the discussion of economic policy. Macroeconomic density forecasts are the subject of this article.
In the United States the Survey of Professional Forecasters has, since its introduction in 1968, asked respondents to provide density forecasts of inflation and growth. In the early days of the survey these received little attention, with the notable exception of Zarnowitz and Lambros (1987); more recently the distributions, averaged over respondents, have featured in the public release of survey results. In the United Kingdom the history is much shorter. In November 1995 the National Institute of Economic and Social Research began to augment its long-established macroeconomic point forecasts with estimates of the probability of the government’s inflation target being met and of there being a fall in GDP. This was extended in February 1996 to a complete probability distribution of inflation and growth forecasts. In the same month the Bank of England launched the presentation of an estimated probability distribution of possible outcomes surrounding its conditional projections of inflation. In November 1996 the Treasury’s Panel of Independent Forecasters, following repeated suggestions by one of the present authors, reported its individual members’ density forecasts for growth and inflation, using the same questions as the U.S. Survey of Professional Forecasters. Our success was short-lived, however, as the new Chancellor of the Exchequer dissolved the panel shortly after taking office in May 1997.
The production and publication of any kind of forecast invites a subsequent evaluation of its quality. For point forecasts, there is a large literature on the ex-post
evaluation of ex-ante forecasts, and a range of techniques has been developed, recently surveyed by Wallis (1995) and Diebold and Lopez (1996). The evaluation of interval forecasts has a much newer literature (Christoffersen, 1998), as does the evaluation of density forecasts. In this article we use the methods of Diebold, Gunther and Tay (1998), augmented with resampling procedures, to evaluate the density forecasts of inflation contained in the Survey of Professional Forecasters. Forecasts of inflation are of intrinsic interest, especially in the monetary policy regime of inflation targeting that is common to many OECD economies, and it is also of interest to demonstrate the use of new tools for forecast evaluation and their applicability even in very small samples. As with most of the forecast evaluation literature we pay no attention to the construction of the forecast, and consider only the assessment of its adequacy, after the fact. That is, because little is known about the construction of the density forecasts reported by the survey respondents, we concentrate on the outputs, not the inputs. The density forecast could be based on a formal statistical or econometric model, an ARCH model for a single financial time series or a large-scale macroeconometric model for aggregate macroeconomic variables, for example, or it could be based on more subjective approaches, blending the forecaster’s judgement informally with a model-based forecast or using expert elicitation methods.

The remainder of this article is organized as follows. In section 2 we present a brief description of the Survey of Professional Forecasters, its advantages and disadvantages, leading to our selection of the series of first-quarter current-year mean density forecasts of inflation for evaluation. In section 3 we develop our evaluation methods, based on the series of probability integral transforms of realized inflation with respect to the forecast densities
and the null hypothesis that this is a series of independent uniformly distributed random variables. We present the results in section 4, and we conclude in section 5.

2. The Survey of Professional Forecasters
The Survey of Professional Forecasters (SPF) is the oldest quarterly survey of macroeconomic forecasters in the United States. The survey was begun in 1968 as a joint project by the Business and Economic Statistics Section of the American Statistical Association (ASA) and the National Bureau of Economic Research (NBER) and was originally known as the ASA-NBER survey. Zarnowitz (1969) describes the original objectives of the survey, and Zarnowitz and Braun (1993) provide an assessment of its achievements over its first twenty-two years. In June 1990 the Federal Reserve Bank of Philadelphia, in cooperation with the NBER, assumed responsibility for the survey, at which time it became known as the Survey of Professional Forecasters (see Croushore, 1993).

The survey is mailed four times a year, the day after the first release of the National Income and Product Accounts data for the preceding quarter. Most of the questions ask for point forecasts, for a range of variables and forecast horizons. In addition, however, density forecasts are requested for aggregate output and inflation. The output question was unfortunately switched from nominal to real in the early 1980s, thereby rendering historical evaluation of the output forecasts more difficult, whereas the inflation question has no such defect and provides a more homogeneous sample. Thus we focus on the density forecasts of inflation. Each forecaster is asked to attach a probability to each of a number of intervals, or bins, in which inflation might fall, in the current year and in the next year. The definition of inflation is annual, year over year. The probabilities are averaged over respondents, and for
each bin the SPF reports the mean probability that inflation will fall in that bin, in the current year and in the next year. The report on the survey results that was previously published in the NBER Reporter and the American Statistician did not always refer to the density forecasts, and sometimes combined bins, but means for all the bins in the density forecasts have been included in the Philadelphia Fed’s press release since 1990, and the complete results dating from 1968 are currently available on their Web page (http://www.phil.frb.org/econ/spf/spfpage.html). This mean probability distribution is typically viewed as a representative forecaster and is our own focus of attention. The mean forecast was the only one available to analysts and commentators in real time.
There are a number of complications, including:

(a) The number of respondents over which the mean is taken varies over time, with a low of 14 and a high of 65.

(b) The number of bins and their ranges have changed over time. From 1968:4-1981:2 there were 15 bins, from 1981:3-1991:4 there were 6 bins, and from 1992:1 onward there are 10 bins.

(c) The base year of the price indexes has changed. For surveys on or before 1975:4, the base year is 1958; from 1976:1 to 1985:4 the base year is 1972; and from 1986:1 to 1991:4 the base year is 1982. Beginning in 1992:1, the base year is 1987.

(d) The price index used to define inflation in the survey has changed over time. From 1968:4 to 1991:4 the SPF asked about inflation as assessed via the implicit GNP deflator, and from 1992:1 to 1995:4 it asked about inflation as assessed via the implicit GDP deflator. Presently the SPF asks about inflation as assessed via the chain-weighted GDP price index.

(e) The forecast periods to which the SPF questions refer have changed over time. Prior to 1981:3, the SPF asked about inflation only in the current year, whereas it subsequently asked about inflation in the current year and the following year. Errors occurred in 1985:1, 1986:1 and 1990:1, when the first annual forecast was requested for the previous year and the second forecast for the current year, as opposed to the current and the following year.
Most of the complications (e.g., a, b, c and d) are minor and inconsequential. Complication (e), on the other hand, places very real constraints on what can be done with the data. It is apparent, however, that the series of first-quarter current-year forecasts represents an unbroken sample of annual 3-quarter-ahead inflation density forecasts, with non-overlapping innovations. (If the information set consists only of data up to the final quarter of the preceding year, then this is a conventional annual series of one-step-ahead forecasts; it is likely, however, that information on the current year available in its first few weeks is also used in constructing forecasts.) The sample runs from 1969 to 1996, for a total of 28 annual observations (densities), which form the basis of our examination of inflation density forecast adequacy.
3. Evaluating Inflation Density Forecasts
We evaluate the forecasts using the methodology proposed by Diebold, Gunther and Tay (1998), the essence of which is consideration of the series of probability integral transforms of realized inflation $\{y_t\}_{t=1}^{28}$ with respect to the forecast densities $\{p_t(y_t)\}_{t=1}^{28}$. That is, we consider the series

$$\{z_t\}_{t=1}^{28} = \left\{ \int_{-\infty}^{y_t} p_t(u)\,du \right\}_{t=1}^{28}.$$
Diebold, Gunther and Tay (1998) show that if the density forecasts are optimal (in a sense that they make precise), then $\{z_t\}_{t=1}^{28} \stackrel{iid}{\sim} U(0,1)$. The basic idea is to check whether the realizations $y_t$ come from the forecast densities $p_t(y_t)$ by using the standard statistical result that, for a random sample from a given density, the probability integral transforms of the observations with respect to the density are iid U(0,1), extended to allow for potentially time-varying densities. In a forecasting context, independence corresponds to the usual notion of the efficient use of an information set, which implies the independence of a sequence of one-step-ahead errors. For our inflation density forecasts, an “error” is an incorrect estimate of the probability that inflation will fall within a given bin; a correct estimate of the tail area probability, for example, implies that we observe the same relative frequency of correspondingly extreme forecast errors, in the usual sense of the discrepancy between point forecast and actual outcome for inflation.
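In code, the probability integral transform amounts to evaluating each period's forecast c.d.f. at the realization. A sketch using hypothetical normal forecast densities (the numbers are illustrative, not the SPF data):

```python
from math import erf, sqrt

def make_cdf(mean, sd):
    """C.d.f. of a normal forecast density, via the error function."""
    return lambda y: 0.5 * (1.0 + erf((y - mean) / (sd * sqrt(2.0))))

# Hypothetical time-varying forecast densities (mean, sd) and realizations.
forecasts = [(2.0, 1.0), (3.0, 1.5), (2.5, 0.8)]
realizations = [2.4, 2.1, 3.3]

# z_t = F_t(y_t): the probability integral transform of each realization
# with respect to that period's forecast density.
z = [make_cdf(m, s)(y) for y, (m, s) in zip(realizations, forecasts)]
# If the density forecasts were optimal, z would look like an iid U(0,1) sample.
```

Under optimality each z_t is uniform on the unit interval regardless of how the forecast density changes over time, which is what makes the transform a convenient common scale for evaluation.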
Formal tests of density forecast optimality face the difficulty that the relevant null hypothesis -- iid uniformity of z -- is a joint hypothesis. For example, the classical test of fit based on Kolmogorov’s $D_n$-statistic, the maximum absolute difference between the empirical cumulative distribution function (c.d.f.) and the hypothesized (uniform) c.d.f., rests on an assumption of random sampling. The test is usually referred to as the Kolmogorov-Smirnov test, following Smirnov’s tabulation of the limiting distribution of $D_n$ and introduction of one-sided statistics, while other authors have provided finite-sample tables (see Stuart and Ord, 1991, §30.37). Little is known, however, about the impact on the distribution of $D_n$ of departures from independence; thus test outcomes in either direction may be unreliable whenever the data are not generated by random sampling. More generally, the test is not constructive: if rejection occurs, the test itself provides no guidance as to why.
More revealing methods of exploratory data analysis are therefore needed to supplement formal tests. To assess unconditional uniformity we use the obvious graphical tools, estimates of the density and c.d.f. We estimate the density with a simple histogram, which allows straightforward imposition of the constraint that z has support on the unit interval, in contrast to more sophisticated procedures such as kernel density estimates with the standard kernel functions. To assess whether z is iid, we again use the obvious graphical tool, the correlogram. Because we are interested not only in linear dependence but also in other forms of nonlinear dependence such as conditional heteroskedasticity, we examine both the correlogram of $(z - \bar{z})$ and the correlogram of $(z - \bar{z})^2$.
It is useful to place confidence intervals on the estimated histogram and correlograms, in order to help guide the assessment. There are several complications, however. In order to separate fully the desired U(0,1) and iid properties of z, we would like to construct confidence intervals for histogram bin heights that condition on uniformity but that are robust to dependence of unknown form. Similarly, we would like to construct confidence intervals for the autocorrelations that condition on independence but that are robust to non-uniformity. In addition, the SPF sample size is small, so we would like to use methods tailored to the specific sample size.

Unfortunately, we know of no asymptotic, let alone finite-sample, method for constructing serial-correlation-robust confidence intervals for histogram bin heights under the
U(0,1) hypothesis. Thus we compute histogram bin height intervals under the stronger iid U(0,1) assumption, in which case we can also compute the intervals tailored to the exact SPF sample size, by exploiting the binomial structure. For example, for a 5-bin histogram formed from 28 observations, the number of observations falling in any bin is distributed binomial(28, 1/5) under the iid U(0,1) hypothesis. (This formulation relates to each individual bin height when the other four bins are combined, and the intervals should not be interpreted jointly.)
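The binomial intervals can be computed exactly. A sketch: with five equal-width bins, each bin has probability 1/5 under uniformity, and a central 95% region for a single bin count follows from the Binomial(28, 1/5) c.d.f. (the cut-off convention below is one simple choice among several):

```python
from math import comb

def binom_pmf(k, n, p):
    """Binomial probability mass function."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 28, 1.0 / 5.0  # 28 observations, 5 equal-width bins under iid U(0,1)

# Cumulative distribution of the bin count.
cdf, total = [], 0.0
for k in range(n + 1):
    total += binom_pmf(k, n, p)
    cdf.append(total)

# Central region holding at least 95% probability of Binomial(28, 1/5).
lower = next(k for k in range(n + 1) if cdf[k] > 0.025)
upper = next(k for k in range(n + 1) if cdf[k] >= 0.975)
print(lower, upper)  # -> 2 10
```

So, under iid uniformity, a bin count outside roughly 2 to 10 (out of 28) would fall outside its individual 95% interval.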
To assess significance of the autocorrelations, we construct finite-sample confidence intervals that condition on independence but that are robust to deviations from uniformity by sampling with replacement from the observed z series and building up the distribution of the sample autocorrelations. The sampling scheme preserves the unconditional distribution of z while destroying any serial correlation that might be present.
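This resampling scheme is straightforward to sketch; the series below is illustrative, and the band endpoints depend on the quantile convention chosen:

```python
import random

def sample_autocorr(x, lag):
    """Sample autocorrelation of x at a given lag."""
    n = len(x)
    m = sum(x) / n
    num = sum((x[t] - m) * (x[t - lag] - m) for t in range(lag, n))
    den = sum((v - m) ** 2 for v in x)
    return num / den

def bootstrap_acf_band(z, lag, reps=2000, seed=0):
    """95% band for the lag-k autocorrelation under independence, robust to
    non-uniformity: resampling z with replacement destroys any serial
    correlation while preserving the marginal distribution."""
    rng = random.Random(seed)
    stats = sorted(
        sample_autocorr([rng.choice(z) for _ in z], lag) for _ in range(reps)
    )
    return stats[int(0.025 * reps)], stats[int(0.975 * reps)]

# Illustrative series standing in for the 28 probability integral transforms:
z = [0.1 * (t % 10) + 0.05 for t in range(28)]
lo, hi = bootstrap_acf_band(z, lag=1)
# An observed autocorrelation outside (lo, hi) is significant at the 5% level.
```

The same machinery, applied to the Ljung-Box statistic rather than individual autocorrelations, yields the simulated finite-sample critical values used later in the article.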
Two practical issues arise in the construction of the z series. The first concerns the fact that the forecasts are recorded as discrete probability distributions, not continuous densities, and so we use a piecewise linear approximation to the c.d.f. For example, suppose the forecast probability for y < 4 is 0.4 and the forecast probability for 4 ≤ y < 5 is 0.3. If the realization of y is 4.6, then we compute z as 0.4 + 0.6(0.3) = 0.58. Further, the two end bins are open; they give the probabilities of y falling above or below certain levels. When a realization falls in one of the end bins, to apply the piecewise linear approximation we assume that the end bins have the same width as all the other bins. This occurs for only three observations, and in each case the realized inflation rate is very close to the interior boundary of the end bin.
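A sketch of the piecewise linear approximation, reproducing the worked example (the bin edges are illustrative; the SPF bins differ across subperiods):

```python
def histogram_cdf(bin_edges, bin_probs, y):
    """Piecewise-linear c.d.f. from a binned density forecast, evaluated at y.
    bin_edges has one more element than bin_probs."""
    z = 0.0
    for left, right, p in zip(bin_edges, bin_edges[1:], bin_probs):
        if y >= right:
            z += p  # the whole bin lies below y
        elif y > left:
            z += p * (y - left) / (right - left)  # interpolate within the bin
            break
        else:
            break
    return z

# Worked example from the text: probability 0.4 below 4 (collapsed into one
# bin here for illustration), probability 0.3 on 4 <= y < 5; realization 4.6.
edges = [3.0, 4.0, 5.0, 6.0]
probs = [0.4, 0.3, 0.3]
z = histogram_cdf(edges, probs, 4.6)
print(round(z, 2))  # -> 0.58
```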
The second issue is how to measure realized inflation: whether to use real-time or final-revised data, and for which inflation concept. As regards the use of real-time vs. final-revised data, we take the view that forecasters try to forecast the “true” inflation rates, the best estimates of which are the final revised values. Thus we use the most recently revised values as our series for realized inflation. Regarding the inflation concept, we noted earlier that the price index used to define inflation in the survey has changed over time from the implicit GNP deflator to the implicit GDP deflator to the chain-weighted price index. Accordingly, we measure realized inflation as the final revised value of the inflation concept about which the survey respondents were asked. From 1969 to 1991 we use the percent change in the implicit GNP deflator, from 1992 to 1995 we use the percent change in the implicit GDP deflator, and for 1996 we use the percent change in the chain-weighted price index.
Two previous studies of the SPF inflation density forecasts merit discussion. Zarnowitz and Lambros (1987) use the survey results to draw the important distinction between uncertainty, as indicated by the spread of the probability distribution of possible outcomes, and disagreement, as indicated by the dispersion of respondents’ (point) forecasts: consensus among forecasters need not imply a high degree of confidence about the commonly predicted outcome. Zarnowitz and Lambros find that the variance of the point forecasts tends to understate uncertainty as measured by the variance of the density forecasts. The former varies much more over time than the latter, although the measures of consensus and certainty (or the lack thereof) are positively correlated. Zarnowitz and Lambros also find that expectations of higher inflation are associated with greater uncertainty. Throughout their paper, however, they summarize the individual density forecasts by their means and standard deviations prior to averaging over respondents; thus they use only part of the information in the density forecasts.
McNees and Fine (1996) evaluate the individual inflation density forecasts of a sample of 34 forecasters who responded to the survey on at least 10 occasions. They proceed by calculating the implied 50% and 90% prediction intervals, and test whether the actual coverage -- the proportion of occasions on which the outcome fell within the interval -- corresponds to the claimed coverage, 50% or 90% as appropriate, using the binomial distribution. Again, only part of the information in the density forecasts is used. Moreover, even in the more limited framework of interval forecast evaluation, the McNees-Fine procedure examines only unconditional coverage, whereas in the presence of dynamics it is important to examine conditional coverage, as in Christoffersen (1998). Put differently, in the language of density forecast evaluation, McNees and Fine implicitly assume that z is iid in order to invoke the binomial distribution; they test only whether z is unconditionally U(0,1).

4. Results
We show the basic data on realized inflation and “box-and-whisker” plots representing the density forecasts in Figure 1. The bottom and top of the box are the 25% and 75% points, the interior line is the median, the bottom whisker is the 10% point, and the top whisker is the 90% point. The box-and-whisker plots point to a number of features of the forecasts and their relationship to the realizations. First, comparing forecasts and realizations, similar patterns to those observed by Zarnowitz and Braun (1993, pp. 30-31) in the distribution of individual point forecasts for the period 1968:4-1990:1 can be seen: “in 1973-74, a period of supply shocks and deepening recession, inflation rose sharply and was greatly underestimated ... The same tendency to underpredict also prevailed in 1976-80, although in somewhat weaker form ... In between, during the recovery of 1975-76, inflation decreased markedly and was mostly overestimated. Another, much longer disinflation occurred in 1981-85 ... Here again most forecasters are observed to overpredict inflation ... Finally, in 1986-, inflation ... was generally well predicted ...”, and this has been maintained up to the end of our present sample, when the errors, although persistently of the same sign, are relatively small. There is also evidence of adaptation: although inflation is unexpectedly high when it initially turns high, and unexpectedly low when it initially falls, forecasters do eventually catch up.
Second, the data seem to accord with the claim that the level and uncertainty of inflation are positively correlated, as suggested by Friedman (1977). Although this hypothesis has typically been verified by relating the variability of inflation to its actual level, in a forecasting context the relevant hypothesis is that expectations of high inflation are associated with increased uncertainty, and this is verified for a shorter sample of these data by Zarnowitz and Lambros (1987), using different techniques, as noted above. In Figure 1 the forecasts for 1975 and 1980 immediately catch the eye, with two of the largest values of the interdecile range -- the distance between the whiskers -- corresponding to two of the highest median forecasts. Overall there is a strongly significant positive association between these measures; the coefficient in a regression of the interdecile range on the median forecast has a p-value of 0.0198 (with allowance made for positive residual autocorrelation, discussed below). On the other hand the forecasts for 1986 and 1987 are outliers: these give the two largest values of the interdecile range, at relatively low median forecasts (and yet lower realizations). Perhaps this reflects genuine uncertainty about the impact of the fall in the world price of oil, or simply indicates sampling problems, because the number of survey respondents was falling through the late 1980s, prior to revival of the survey by the Philadelphia Fed.
Third, there has been a gradual tightening of the forecast densities since the late 1980s, perhaps due to a reduction of perceived likely supply and demand shocks, an increase in central bank credibility, a reduction in uncertainty associated with the lower level of inflation, or some combination of these. The distributions nevertheless seem to be still too dispersed, because most of the realizations over this period fall squarely in the middle of the forecast densities.
Next, we compute the z series by integrating the forecast densities up to the realized inflation rate, period by period, and we plot the result in Figure 2, in which large values correspond to unexpectedly high values of realized inflation, and conversely. Even at this simple graphical level, deviations of z from iid uniformity are apparent, as z appears serially correlated. In the first half of the sample, for example, z tends to be mostly above its average, whereas in the second half of the sample it appears that the representative forecaster overestimated the uncertainty of inflation, because most of the values of z cluster around 0.4, and they vary little compared to the first half of the sample. This is the counterpart to the observation in Figure 1 that most of the recent realizations are near the middle of the forecast densities, a result that diverges from Chatfield (1993) and the literature he cites, which often finds that forecasters are overconfident, in that their interval forecasts are too tight, not too wide.
To proceed more systematically, we examine the distributional and autocorrelation properties of z. We show the histogram and empirical c.d.f. of z in Figure 3, together with finite-sample 95% confidence intervals calculated by simulation under the assumption of iid uniformity. The unavoidably wide intervals reflect the small sample size.

The empirical c.d.f. lies within the 95% confidence interval. Kolmogorov’s $D_n$-statistic has a value of 0.2275, which is less than the 5% critical value of 0.24993 given for this sample size by Miller (1956), although little is known about the impact of departures from randomness on the performance of this test, as noted above. In the histogram two bins lie outside their individual 95% confidence intervals. The chi-square goodness-of-fit statistic has a value of 10.21, which exceeds the simulated 5% critical value for this sample size of 9.14 (the corresponding asymptotic chi-square(4) value is 9.49), although the above caveat again applies.
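Both statistics are simple to compute from the z series. A sketch using an illustrative near-uniform sample of 28 points (not the SPF transforms, whose values give the statistics quoted above):

```python
def ks_statistic_uniform(z):
    """Kolmogorov's D_n against the U(0,1) c.d.f."""
    zs = sorted(z)
    n = len(zs)
    d = 0.0
    for i, v in enumerate(zs):
        # compare the empirical c.d.f. just before and at each order statistic
        d = max(d, abs((i + 1) / n - v), abs(v - i / n))
    return d

def chi_square_uniform(z, bins=5):
    """Chi-square goodness-of-fit statistic, equal-width bins on [0, 1)."""
    n = len(z)
    counts = [0] * bins
    for v in z:
        counts[min(int(v * bins), bins - 1)] += 1
    expected = n / bins
    return sum((c - expected) ** 2 / expected for c in counts)

z = [t / 29 for t in range(1, 29)]  # illustrative, near-uniform 28 points
d = ks_statistic_uniform(z)
stat = chi_square_uniform(z)
```

For such a nearly uniform sample both statistics are small; for the SPF transforms they take the values 0.2275 and 10.21 reported above.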
Two features of the data stand out in both panels of Figure 3. First, too few realizations fall in the left tail of the forecast densities to accord with the probability forecasts, resulting in an empirical c.d.f. of z that lies substantially below the 45-degree line in the lower part of its range, and a significantly small leftmost histogram bin. This reflects the fact that many of the inflation surprises in the sample came in the 1970s, when inflation tended to be unexpectedly high; episodes of unexpectedly low inflation are rarer than the survey respondents think. Second, the middle histogram bin is significantly too high and the empirical c.d.f. lies above the 45-degree line in this range, both indicating too many realizations in the middle of the forecast densities, an already-noted phenomenon driven primarily by the events of the late 1980s and 1990s. The observations from the first half of the sample are shaded in the histogram and are seen to be more uniformly distributed, except again for the lowest values, illustrating once more the different characteristics of the two sub-periods.
We show the correlograms of $(z - \bar{z})$ and $(z - \bar{z})^2$ in Figure 4, together with finite-sample 95% confidence intervals for the autocorrelations computed by simulation under the assumption that z is iid but not necessarily U(0,1). The first correlogram clearly indicates serial correlation in z itself. The first sample autocorrelation, in particular, is large and highly statistically significant, and most of the remaining sample autocorrelations are positive and significant as well. A Ljung-Box test on the first five sample autocorrelations of $(z - \bar{z})$ rejects the white noise hypothesis at the 1% level, using simulated finite-sample critical values computed in the same way as for the correlogram confidence intervals.
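The sample autocorrelations and the Ljung-Box statistic can be sketched as follows (the series is an illustrative persistent one, not the SPF transforms; in the article the statistic is compared with simulated finite-sample critical values rather than the asymptotic chi-square ones):

```python
def autocorrs(x, max_lag):
    """Sample autocorrelations of x at lags 1, ..., max_lag."""
    n = len(x)
    m = sum(x) / n
    den = sum((v - m) ** 2 for v in x)
    return [
        sum((x[t] - m) * (x[t - k] - m) for t in range(k, n)) / den
        for k in range(1, max_lag + 1)
    ]

def ljung_box(x, max_lag):
    """Ljung-Box Q statistic on the first max_lag sample autocorrelations."""
    n = len(x)
    r = autocorrs(x, max_lag)
    return n * (n + 2) * sum(r[k] ** 2 / (n - (k + 1)) for k in range(max_lag))

# An illustrative persistent series: blocks of high then low values.
z = [0.5 + (0.4 if (t // 7) % 2 == 0 else -0.4) for t in range(28)]
q = ljung_box(z, max_lag=5)
# q far exceeds even the asymptotic chi-square(5) 5% value of 11.07,
# so the white noise hypothesis would be rejected for this series.
```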
Several explanations come to mind, one being the possibility that forecasters are more adaptive than rational, noted above. The inflation series itself is highly persistent, and the forecast densities might not be expected to change rapidly; hence forecasters might use a more-than-optimal amount of extrapolation. Forecast errors are often autocorrelated due to information lags: if a forecast for time t+1 made at time t is based on an information set dated t-1, then it is in effect a two-step-ahead forecast and so, even if optimal, its errors will exhibit an MA(1) correlation structure. The present forecasts are made at the beginning of the year, at which time forecasters have data on the previous year, albeit liable to revision. Because the forecast relates to the current year it is close to a genuine one-step-ahead forecast, and the impact of data revisions is unlikely to be sufficient to cause substantial autocorrelation in forecast errors. An examination of the autocorrelations of z based on preliminary inflation figures supports the latter claim; a Ljung-Box test on the first five sample autocorrelations, again using simulated critical values, also rejects the iid hypothesis at the 1% level. In any event, the autocorrelations at higher lags in Figure 4 are not suggestive of a moving average structure. It is not clear precisely what kinds of autocorrelation in z might be expected once the density forecasts depart from optimality, but here also there is evidence of too much persistence.
It is also possible that serial correlation in z may be due to the departure or inclusion over time of forecasters who tend to be systematically optimistic or pessimistic. There is no way to check whether this is indeed the case without examining the survey returns of individual respondents, but the problem is likely to be pertinent only if the number of respondents is small. As it turns out, the number of respondents was greater than twenty in all years but four. Furthermore, Figures 2 and 3 suggest that any systematic inclusion of optimistic forecasters would have been in the early years of the sample, but that is the period when the survey enjoyed the greatest number of respondents.
It is interesting to note that although $(z - \bar{z})$ appears serially correlated, there is little evidence of serial correlation in $(z - \bar{z})^2$. Serial correlation in $(z - \bar{z})^2$ would suggest that the inflation density forecasts tend to miss heteroskedasticity in realized inflation. Hence the serial dependence in z appears to be associated with dynamics in the conditional mean of inflation neglected by the density forecasts, not with neglected dynamics in the conditional variance of inflation.

5. Conclusion
Our overall conclusion is that the density forecasts of inflation reported in the Survey of Professional Forecasters are not optimal -- the probability integral transforms of the realizations with respect to the forecast densities are non-uniform and autocorrelated. Formal hypothesis tests more clearly support the autocorrelation part of this joint rejection, because here our resampling procedures produce tests that are robust to non-uniformity. The impact of this autocorrelation on the behavior of goodness-of-fit tests is not known, and our rejection of uniformity rests to a greater extent on descriptive methods. In general the density forecasts overestimate the probability that inflation will fall substantially below the point forecast, because there are too few observations in the left tail of the z density: negative inflation surprises occur less often than these forecasters expect. In the more recent data this tendency extends to both tails of the z density, and surprises of either sign occur less often than expected. In the 1990s the forecasters were more uncertain than they should have been, perhaps because they did not recognize, at least to a sufficient degree, that expectations of lower inflation are associated with lower uncertainty. This conclusion was already documented by Zarnowitz and Lambros (1987), and is endorsed here.
We have treated the mean density forecast as a collective forecast, although the sample over which the mean is taken varies in size and composition over time, and so it would be interesting to repeat the analysis for individual forecasters. One of the original aims of the survey was to keep a comprehensive record of forecasts so that forecast evaluation could be conducted on a “broader, more objective and systematic basis” (Zarnowitz, 1969), and we have clearly benefitted from the archive that has been accumulated. On the other hand a little scrutiny reveals the difficulties in extending our analysis to individual forecasters, again because the survey’s coverage varies, with high turnover of participants; hence only a relatively short series of forecasts is available for most individuals. The number of forecasts might be increased by adding the second, third and fourth quarter forecasts and even, in most of the recent years of the survey, also including forecasts for the following year as well as the current year. However, the pattern of the optimal evolution of density forecasts in such situations is not immediately apparent. For point forecasts, tests of the optimality of a sequence of fixed-event forecasts are based on the independence of successive forecast revisions (Clements, 1997), and the counterpart for density forecasts awaits further research. In the meantime, the evaluation methods for a conventional series of density forecasts employed in the present application are commended for wider use as such series accumulate.
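For point forecasts, the fixed-event idea can be illustrated in a few lines. The forecast values below are invented for illustration: given a sequence of forecasts of the same outcome made at successive dates, rationality implies that successive revisions are uncorrelated, so the slope from regressing each revision on its predecessor should be near zero.

```python
import numpy as np

# Hypothetical fixed-event point forecasts of one annual inflation outcome,
# made in successive quarters as the target date approaches.
f = np.array([3.2, 3.0, 3.1, 2.8, 2.9])
rev = np.diff(f)                       # successive forecast revisions

# Regress each revision on the previous one; under forecast rationality
# the slope should be statistically indistinguishable from zero.
x, y = rev[:-1], rev[1:]
xd, yd = x - x.mean(), y - y.mean()
slope = np.sum(xd * yd) / np.sum(xd ** 2)
```

With a realistic number of revisions one would of course attach a standard error to the slope; the point here is only the structure of the test, whose density-forecast counterpart remains open.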
References

Aït-Sahalia, Y. and Lo, A. (1998), “Nonparametric Estimation of State-Price Densities Implicit in Financial Asset Prices,” Journal of Finance, 53, 499-547.

Chatfield, C. (1993), “Calculating Interval Forecasts,” Journal of Business and Economic Statistics, 11, 121-135.

Christoffersen, P.F. (1998), “Evaluating Interval Forecasts,” International Economic Review, *.

Clements, M.P. (1997), “Evaluating the Rationality of Fixed-event Forecasts,” Journal of Forecasting, 16, 225-239.

Croushore, D. (1993), “The Survey of Professional Forecasters,” Business Review, Federal Reserve Bank of Philadelphia, November/December, 3-15.

Diebold, F.X. and Lopez, J.A. (1996), “Forecast Evaluation and Combination,” in G.S. Maddala and C.R. Rao (eds.), Handbook of Statistics 14: Statistical Methods in Finance, 241-268. Amsterdam: North-Holland.

Diebold, F.X., Gunther, T.A. and Tay, A.S. (1998), “Evaluating Density Forecasts, with Application to Financial Risk Management,” International Economic Review, *.

Friedman, M. (1977), “Nobel Lecture: Inflation and Unemployment,” Journal of Political Economy, 85, 451-472.

McNees, S.K. and Fine, L.K. (1996), “Forecast Uncertainty: Can it be Measured?,” Presented at the Conference on Expectations in Economics, Federal Reserve Bank of Philadelphia, October, 1996.

Miller, L.H. (1956), “Table of Percentage Points of Kolmogorov Statistics,” Journal of the American Statistical Association, 51, 111-121.

J.P. Morgan (1996), “RiskMetrics -- Technical Document,” Fourth Edition, New York.

Söderlind, P. and Svensson, L.E.O. (1997), “New Techniques to Extract Market Expectations from Financial Instruments,” Working Paper 5877, National Bureau of Economic Research, Cambridge, Mass.

Stuart, A. and Ord, J.K. (1991), Kendall’s Advanced Theory of Statistics, 5th ed., Volume 2. London: Edward Arnold.

Wallis, K.F. (1995), “Large-Scale Macroeconometric Modeling,” in M.H. Pesaran and M.R. Wickens (eds), Handbook of Applied Econometrics, 312-355. Oxford: Blackwell.

Zarnowitz, V. (1969), “The New ASA-NBER Survey of Forecasts by Economic Statisticians,” American Statistician, 23, 12-16.

Zarnowitz, V. and Braun, P. (1993), “Twenty-Two Years of the NBER-ASA Quarterly Economic Outlook Surveys: Aspects and Comparisons of Forecasting Performance,” in J.H. Stock and M.W. Watson (eds), Business Cycles, Indicators, and Forecasting (NBER Studies in Business Cycles, Volume 28), 11-84. Chicago: University of Chicago Press.

Zarnowitz, V. and Lambros, L.A. (1987), “Consensus and Uncertainty in Economic Prediction,” Journal of Political Economy, 95, 591-621.
Figure 1
Inflation Forecasts and Realizations

[Figure: density forecasts and realized inflation, 1969-1996; vertical axis in percent, horizontal axis labeled Time.]

Notes: The density forecasts are represented by box-and-whisker plots. The boxes represent the inter-quartile range of the forecasts, and the inner line represents the median; the tails represent the 10th and 90th percentiles. We represent inflation realizations with a separate plotting symbol.
Figure 2
Time Series Plot of z

[Figure: time series of z, 1970-1995; vertical axis from 0.0 to 1.0, horizontal axis labeled Time.]
Figure 3
Histogram and Empirical Cumulative Density Function of z

[Figure: two panels on the unit interval; the top panel is a histogram of z, the bottom panel is the empirical c.d.f. of z.]

Notes to top panel: Dashed lines represent 95% confidence intervals for individual bin heights under the hypothesis that z is iid U(0,1). The shaded region corresponds to the first 14 z observations.

Notes to bottom panel: We superimpose on the empirical c.d.f. a U(0,1) c.d.f., together with 95% confidence intervals under the hypothesis that z is iid U(0,1). See text for details.
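Bin-height bands of the kind described in the top-panel notes can be reproduced under stated assumptions: with m iid U(0,1) observations sorted into k equal-width bins, each bin count is Binomial(m, 1/k), and a normal approximation gives a symmetric 95% interval around the uniform density height of 1. The sample size and bin count below are hypothetical, not the paper's.

```python
import numpy as np

def bin_height_band(m, k, crit=1.96):
    """95% interval for a single histogram bin height (density scale) under
    the hypothesis that the m observations are iid U(0,1) in k equal bins."""
    p = 1.0 / k
    se_count = np.sqrt(m * p * (1.0 - p))  # std. dev. of one bin's count
    half = crit * k * se_count / m         # rescale counts to density heights
    return 1.0 - half, 1.0 + half

lo, hi = bin_height_band(m=29, k=10)  # hypothetical sample size and bin count
```

With a small sample the band is wide, which is why descriptive inspection of the histogram, rather than a formal goodness-of-fit test, carries much of the weight in the rejection of uniformity.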
Figure 4
Sample Autocorrelation Functions of (z - z̄) and (z - z̄)²

Notes: The dashed lines indicate 95% confidence intervals computed under the hypothesis that {z_t}, t = 1, ..., m, is iid. See text for details.