Using Engel’s Law to Estimate CPI Bias
Bruce W. Hamilton*
February 3, 2014
Engel’s Law states that food’s budget share is inversely related to household real income. As Hendrik
S. Houthakker (1987) states, “Of all empirical regularities observed in economic data, Engel’s law is
probably the best established….(but) like most economic laws, it holds only ceteris paribus; prices, among
other things, are assumed constant.” This suggests that movements in food’s budget share might serve as an
indicator of movements in real income. If the movement in real income inferred from movements in food’s
budget share is inconsistent with real-income movements measured directly (nominal income deflated by
the Consumer Price Index (CPI)), then, subject to appropriate caveats, we can use Engel’s law and data on
food’s budget share to estimate the degree to which the CPI is biased. As Leonard I. Nakamura (1996)
notes in his critique of the CPI, food’s budget share declined throughout the “stagnant” 1970s, suggesting
that income growth was more robust during the ‘70s than suggested by the CPI figures. 1
Using the Panel Study of Income Dynamics (PSID) I estimate cross-section Engel curves for food for
1974 through 1991. If this demand function is properly specified, if preferences are stable, and if there are
no systematic errors in the variables, there should be no systematic movement in the Engel curve from one
year to another. In fact, my estimated Engel curves drift consistently to the left; as time passes, any given
food-budget share is associated with ever smaller levels of CPI-deflated income. I attribute this Engel curve
drift to CPI bias; real income has grown more than CPI-deflated income because the CPI has overstated
inflation. The true cost of living index is that index which eliminates secular drift in estimated cross-section
Engel curves.
In spirit, this approach is similar to that of William D. Nordhaus (1996), who creates a CPI
adjusted to make real income growth consistent with consumers’ reported perceptions of their financial
well-being. Nordhaus measures consumers’ perceptions of their well-being by taking the difference
between those who report themselves “better off than last year” and those who report themselves “worse
off” in the University of Michigan Survey Research Center’s survey of consumer behavior.
My approach offers several advantages over Nordhaus. First, my dependent variable is continuous
rather than the trichotomous survey variable (“better off than last year,” “worse off,” “about the same”).
Second, actions do speak louder than words; under my approach a consumer must put his money where his
mouth is, not just respond to a survey. Third, the Alan B. Krueger/Aaron Siskind (1998) criticisms of
Nordhaus (his failure to account for shifts in the income distribution and life-cycle effects) are very easily
handled with cross-section data and other regressors.
My approach also has some kinship with the extensive work of Dale W. Jorgenson and Daniel T.
Slesnick (1997), who estimate a complete system of demand functions, and then use information on prices
and the demand-system coefficients to back out annual costs and standards of living. They aggregate
individual-good prices into 5 broad-category translog price indices, which constitute the price data for their
demand system. For purposes of estimating CPI bias, the main difficulty with their approach is that they
must assume that the individual-good prices are reported correctly, and that they properly capture the array
of available goods. Whereas their method elegantly handles substitution bias, it has no way of handling
such problems as new-product bias or unmeasured quality improvement.2
Most research on CPI bias takes the brute-force approach of examining the various components of
the index for specific instances of bias, and attempting to determine the magnitude of these biases (see
Michael J. Boskin, Ellen R. Dulberger, Robert J. Gordon, Zvi Griliches and Jorgenson (1996), Brent R.
Moulton (1996), Matthew D. Shapiro and David W. Wilcox (1996)). One problem with this approach is
that it is impossible to know when we are “finished;” the most we can ever say is that we have corrected all
of the biases that we have found. Also for many important adjustments it is difficult to quantify the required
adjustment; much of the discussion, even in a study as careful as the Boskin report, takes the form of (highly
educated) guesstimates.
Whatever its own weaknesses, my estimation method has several virtues to recommend it. First, in
approach it is completely different from the approach of the Boskin commission and related work. If this
approach approximately corroborates the quantitative estimates of other methods, it adds to their strength.
Note in particular that my method makes no direct use of the CPI itself, except that I rely on the CPI
estimate of movement in the relative price of food. I estimate a cost of living index from scratch, without
building it up from the extant CPI estimate.
Second, this method is not labor intensive. Given a database such as the PSID, only a relatively
modest amount of econometric work is required to estimate a true cost of living index over a period of many
years.
Third, and related to the second, with this method it is straightforward to estimate movement in a
true cost of living index for any population group for which adequate data exist. Simply by varying sample
selection criteria, it is possible to estimate inflation rates for different races, age groups, geographic areas,
and so on.
Fourth, if the method proves feasible, it might provide an efficient method for estimating the cost
of living in other countries, including developing countries where panel data sets might be superior to the
types of data sets typically underlying price indices.
To get a feel for the usefulness of food as an indicator of well-being, turn to Figures 1 and 2. Data
for both figures are national aggregates from the National Income and Product Accounts (NIPA). 3 Treating
food’s share as an indicator for standard of living, one would conclude that there was a period of stagnation
from the late 1960s through the mid-1970s, but that for the 20 years since then real income growth has been
about what it was in the prior two decades.
Figure 2 shows the components of this ratio. Per capita food consumption (food expenditure
deflated by the food component of the CPI)4 rose about 25 percent, almost linearly, from 1950 through
1972. After drifting down, then up and then down again, per-capita food consumption in 1991 was only 6.7
percent above its 1972 level, despite a 27 percent rise in per-capita CPI-deflated DPI from 1974 to 1991
and an 8 percent decline in the relative CPI of food. Figure 2 appears to be internally inconsistent, and also
inconsistent with figure 1 (which suggests that real income growth after 1975 may have been comparable to
the growth pattern from 1950 to 1970). Nevertheless, one could imagine a variety of explanations for the
apparent inconsistency between these figures. It could have been caused by an unmeasured decline in the
relative price of food, the widening of the income distribution, the decline in family size, increasing female
labor force participation or a change in the volatility of income, for example.
Figure 1
Food Budget Share
[Plot not reproduced. Vertical axis: budget share, 0.12 to 0.30. Horizontal axis: year, 1950 to 1994. Series: NIPA, PSID.]
Figure 2
Per Capita Income and Food Expenditure
(1968 = 1)
[Plot not reproduced. Vertical axis: index, 0.6 to 1.6. Horizontal axis: year, 1950 to 1994. Series: Food/CPI(f), Income/CPI.]
The first purpose of this paper is to utilize the PSID to see whether the anomalies of figures 1 and 2
can be attributed to some non-CPI cause such as demographics or changes in the distribution of income.
The second purpose is to offer a more refined estimate of CPI bias.
I estimate a demand function for food at home for 1974 through 1991 5, with a set of year dummies
to allow for possible intertemporal movement in these curves. Using a standard measure of real income
(total family income after federal taxes, the PSID’s best continuously available approximation of disposable
income) deflated by the CPI, this demand function has shown consistent drift over the sample period; I
attribute this drift to unmeasured growth in real income, and in turn I attribute the mismeasurement of
income to CPI bias.
In a nutshell, the results are as follows: On average, in 1974 my PSID sample spent 16.64 percent
of its income on at-home food. By 1991 this share had fallen to 12.04 percent. Measured per-household
real income grew 16 percent over this time span, explaining just over 1.5 points of the food share decline.
Decline in the relative CPI of food is sufficient to explain perhaps as much as half a percentage point of
decline in food’s share. Other regressors account for less than 0.1 point of additional decline; thus about
2.5 points of the food-share decline are left to be explained by CPI bias. I estimate that this bias is about
2.5 percent per year from 1974 through 1981, and slightly under 1 percent per year since then.
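The arithmetic behind this decomposition can be checked in a few lines. The sketch below uses only the rounded figures quoted in the paragraph above (the text gives them as "just over", "perhaps as much as", and "less than"), so the residual is approximate.

```python
# Figures quoted in the text, in percentage points of food's budget share.
share_1974 = 16.64
share_1991 = 12.04

# Rounded contributions as stated in the text (upper or approximate bounds).
explained = {
    "measured real income growth": 1.5,
    "decline in relative CPI of food": 0.5,
    "other regressors": 0.1,
}

total_decline = share_1974 - share_1991
residual = total_decline - sum(explained.values())

print(f"total decline in food share: {total_decline:.2f} points")
print(f"left to be explained:        {residual:.2f} points")
```

The residual comes to about 2.5 points, which is the portion the text attributes to CPI bias.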
In addition to encapsulating the results, this “nutshell sketch” highlights the greatest potential
weakness of the general approach. By default, I ascribe all movement in food’s share, not assigned to the
regressors by the equation, to CPI bias. Any systematically time-varying omitted variables, or any
specification errors whose effects vary over time, appear spuriously as CPI bias. Fortunately, the PSID
offers a rich array of covariates, and there is a long history of successful fitting of food Engel curves to the
Working/Leser specification. Nevertheless, the general caveat should be kept in mind.
I. ESTIMATION APPROACH
In this section I describe the empirical method that I will use to estimate CPI bias. As stated, the
fundamental approach is to infer well-being by observing food’s share.
A. Why Food?
Food is the only consumption item regularly tracked in the PSID, and the PSID is the only micro
data set with good income data which covers a sufficient time span to properly carry out this exercise. But
in fact food was a fortuitous choice; it is perhaps the ideal indicator good for inferring inflation.
First, any indicator good must have an income elasticity substantially different from unity;
otherwise the budget share is insensitive to income and thus to mismeasurement of income.
Second, almost unique among consumer goods, food has no durability whatsoever; expenditure is
virtually identical to consumption. There is no stock/flow issue to confound measurement.
Third, food is a pretty straightforward good. Other candidate indicator goods, such as recreation
(see Dora L. Costa (1997)), involve tricky definitional problems.
Fourth, it is fairly natural to assume, as I will, that food is strongly separable from nonfood in
consumers’ utility functions. With separability comes the result that CPI bias in such prime suspects as
personal computers will not affect food’s budget share through some peculiar (unmodeled) complementarity
or substitutability, but only through the channels anticipated in the model.
Fifth, the Working-Leser/Almost-Ideal-Demand-System (AIDS) functional form has been widely
and successfully used to estimate food demand. Estimation of the true cost of living by my method is quite
straightforward econometrically so long as the dependent variable is food’s budget share and income enters
log-linearly. As we will see below, it is possible but messy to incorporate higher-order terms in income as
regressors. But if the dependent variable is food consumption (or food expenditure) rather than food’s
budget share then the dependent variable itself is infected by the CPI error, and estimation becomes quite
problematic.6
B. Basic Estimating Structure
I begin with the basic demand structure of Working (1943) and Conrad E. V. Leser (1963), i.e., the
single-good demand function which emerges from Angus S. Deaton and John Muellbauer’s (1980) Almost
Ideal Demand System.
Equation 1

$$\omega_{i,j,t} = \phi + \gamma\left(\ln P_{f,j,t} - \ln P_{n,j,t}\right) + \beta\left(\ln Y_{i,j,t} - \ln P_{j,t}\right) + \sum_x \theta_x X_{i,j,t} + \mu_{i,j,t}$$

where $\omega_{i,j,t}$ is food’s share of family income for family i in Standard Metropolitan Statistical Area (SMSA) j
in year t; $P_{f,j,t}$, $P_{n,j,t}$, and $P_{j,t}$ are respectively the true but unobservable price indices of food, nonfood, and all
goods, in SMSA j in year t; $Y_{i,j,t}$ is household i’s nominal income in SMSA j in year t; $X_{i,j,t}$ is a vector of
background regressors, and $\mu_{i,j,t}$ is the residual.
The true cost of living, $P_{j,t}$, is a weighted average of the prices of food and nonfood7:

Equation 2

$$\ln P_{j,t} = \alpha \ln P_{f,j,t} + (1 - \alpha) \ln P_{n,j,t}$$

$P_f$, $P_n$, and $P$ are measured with (CPI-bias) error:

Equation 3

$$\ln P_{j,t} = \ln P_{j,0} + \ln\left(1 + \Pi_{j,t}\right) + \ln\left(1 + E_{j,t}\right)$$

where $P_{j,0}$ is the unobservable true price level in SMSA j in year 0; $\Pi_{j,t}$ is the percent cumulative increase in
the (CPI) measured price (of food, nonfood, or all goods, as indicated by the subscript preceding the j
subscript) from year 0 to t; and $E_{j,t}$ is the year-t percent cumulative measurement error in the cost-of-living
index since year 0 (again, for food, nonfood, or all goods as indicated).
Now, to save the carrying of log notation, I adopt the following notation: Logs of P and Y
respectively are indicated by lower-case p and y, and logs of $(1+\Pi)$ and $(1+E)$, respectively, are indicated
by lower-case $\pi$ and $\varepsilon$, with subscripts f or n, i,j,t as needed. Making these substitutions,

Equation 4

$$\begin{aligned}
\omega_{i,j,t} = {} & \phi + \gamma\left(\pi_{f,j,t} - \pi_{n,j,t}\right) + \beta\left(y_{i,j,t} - \pi_{j,t}\right) + \sum_x \theta_x X_{i,j,t} \\
& + \gamma\left(\varepsilon_{f,t} - \varepsilon_{n,t}\right) - \beta\,\varepsilon_t \\
& + \gamma\left(p_{f,j,0} - p_{n,j,0}\right) - \beta\,p_{j,0} + \mu_{i,j,t}
\end{aligned}$$

where (4) assumes (as I will) that CPI bias does not vary geographically. Now suppose we have a
cross-section/time-series database (the PSID) with micro data on income and food expenditure (as well as other
variables such as family composition which would influence food expenditure), as well as cross-section CPI
for all consumption, for food, and for nonfood, over the entire data period for a sample of SMSA’s.

Equation 5

$$\omega_{i,j,t} = \hat{\phi} + \gamma\left(\pi_{f,j,t} - \pi_{n,j,t}\right) + \beta\left(y_{i,j,t} - \pi_{j,t}\right) + \sum_x \theta_x X_{i,j,t} + \sum_{t=1}^{T} \delta_t D_t + \sum_j \delta_j D_j + \mu_{i,j,t}$$

where $D_t$ is a dummy variable equal to 1 in year t, $D_j$ is a dummy equal to 1 for SMSA j, and $\delta_t$ and $\delta_j$ are
their coefficients. $\hat{\phi}$ is the constant term from equations 1 and 4, plus the coefficients respectively of the
omitted time dummy and the omitted SMSA dummy.
Equation (5) is the basic estimating equation of the paper. It is worth noting that the Bureau of
Labor Statistics (BLS)-constructed CPI plays essentially no role in this estimation method. I rely upon BLS
to estimate (movement in) the relative price of food, so as to identify the price coefficient and account for
any change in food’s share that is attributable to relative price changes. But the overall CPI and its
movement over time play no substantive role whatsoever. Deflation of income by the CPI (the $(y - \pi)$ term)
is merely for convenience; that way, the time dummy coefficient maps into CPI bias rather than the true cost
of living. Were I to estimate equation (4) without deflating income, all of the statistical properties of the
equation would remain.
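To make the mechanics of Equation (5) concrete, the following is a minimal sketch on synthetic data; it is not the paper's PSID estimation. Every number in it (parameter values, sample sizes, and the drift in the log CPI error `TRUE_EPS`) is hypothetical. It assumes food and nonfood are equally biased, so each year dummy equals minus the income coefficient times the log CPI error, and it fits the share equation by ordinary least squares with year and SMSA dummies.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" parameters, chosen only for illustration.
BETA, GAMMA, PHI = -0.10, 0.04, 1.0
N_YEARS, N_SMSA, N_OBS = 6, 5, 30_000
TRUE_EPS = -0.02 * np.arange(N_YEARS)   # assumed log cumulative CPI error by year

year = rng.integers(0, N_YEARS, N_OBS)
smsa = rng.integers(0, N_SMSA, N_OBS)
# Measured relative food price (log), varying by SMSA and year.
rel_price = rng.normal(0.0, 0.05, (N_SMSA, N_YEARS))[smsa, year]
# Log CPI-deflated income.
y_deflated = rng.normal(9.0, 0.5, N_OBS)

# Food share: the unobserved CPI error enters only through -BETA * eps_t.
share = (PHI + GAMMA * rel_price + BETA * y_deflated
         - BETA * TRUE_EPS[year] + rng.normal(0.0, 0.02, N_OBS))

# Design matrix: constant, price term, income term, year dummies, SMSA dummies
# (the first year and the first SMSA are the omitted categories).
year_dummies = (year[:, None] == np.arange(1, N_YEARS)).astype(float)
smsa_dummies = (smsa[:, None] == np.arange(1, N_SMSA)).astype(float)
X = np.column_stack([np.ones(N_OBS), rel_price, y_deflated,
                     year_dummies, smsa_dummies])
coef, *_ = np.linalg.lstsq(X, share, rcond=None)

beta_hat = coef[2]
delta_hat = coef[3:3 + N_YEARS - 1]     # year dummy coefficients
eps_hat = -delta_hat / beta_hat         # implied log CPI error, years 1..T-1

print("estimated income coefficient:", round(float(beta_hat), 4))
print("implied log CPI error by year:", np.round(eps_hat, 3))
```

In this setup the year dummies come out negative, as in the paper's tables, and dividing them by the income coefficient recovers the assumed error path up to sampling noise.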
Identification of CPI bias requires estimation of the price coefficient, $\gamma$, which in turn requires
cross-section variation in the price of food. BLS does not report cross-section data on price levels; SMSA
CPI’s (reported for 25 SMSA’s) are measured only relative to same-city CPI’s in the base year. Thus, with
SMSA-specific CPI data we can calculate relative (measured) inflation rates across cities. So long as there
is sufficient geographic/temporal variation in measured relative inflation rates this may be sufficient to
identify $\gamma$. Interpretation of dummy coefficients is clean and informative:

Equation 6

$$\delta_t = \gamma\left(\varepsilon_{f,t} - \varepsilon_{n,t}\right) - \beta\,\varepsilon_t$$

Next assume that the relative bias as between food and nonfood is constant across all years. The parameter
estimates from (5) identify the CPI bias, $\varepsilon_t$, up to the unknown parameter r:

Equation 7

$$\varepsilon_t = \frac{-\delta_t}{\beta + \dfrac{\gamma(1 - r)}{1 - \alpha(1 - r)}} = \frac{-\delta_t}{\beta} \quad \text{if } r = 1$$

where r is the (unknown) ratio of $\varepsilon_f$ to $\varepsilon_n$ and $\alpha$ is food-price’s share in the cost of living index. More
generally, the final expression in Equation (7) is approximately correct if either $\gamma$ or (1-r) is close to zero. If
r < 1 as seems plausible (food is less badly biased than nonfood), then (7) understates the bias, increasingly
as r falls increasingly below unity.
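The step from Equation 6 to Equation 7 is not spelled out in the text. One way to obtain it is sketched below, writing $\delta_t$ for the year-t dummy coefficient, $\gamma$ and $\beta$ for the price and income coefficients, $\varepsilon_{f,t}$, $\varepsilon_{n,t}$ and $\varepsilon_t$ for the log cumulative CPI errors in food, nonfood and all goods, $\alpha$ for food's weight in the cost of living index, and r for the ratio of the food error to the nonfood error. The sketch assumes that the log errors aggregate with the same weights as the prices in Equation 2; that assumption is suggested by Equation 2 but is not stated explicitly for the error terms.

```latex
\begin{align*}
\varepsilon_t &= \alpha\,\varepsilon_{f,t} + (1-\alpha)\,\varepsilon_{n,t},
  \qquad \varepsilon_{f,t} = r\,\varepsilon_{n,t} \\
\Rightarrow\quad \varepsilon_{n,t} &= \frac{\varepsilon_t}{1-\alpha(1-r)},
  \qquad \varepsilon_{f,t}-\varepsilon_{n,t} = -\frac{(1-r)\,\varepsilon_t}{1-\alpha(1-r)} \\
\delta_t &= \gamma\,(\varepsilon_{f,t}-\varepsilon_{n,t}) - \beta\,\varepsilon_t
  = -\varepsilon_t\left[\beta + \frac{\gamma(1-r)}{1-\alpha(1-r)}\right] \\
\Rightarrow\quad \varepsilon_t &= \frac{-\delta_t}{\beta + \dfrac{\gamma(1-r)}{1-\alpha(1-r)}}
\end{align*}
```

With a negative income coefficient and a positive price coefficient, as estimated in the paper, r < 1 makes the denominator smaller in absolute value, so the simple ratio $-\delta_t/\beta$ understates the size of $\varepsilon_t$.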
I include two measures of income growth, shown in equation (8) below, to explore the rate at
which consumers respond to changes to their income.

Equation 8

$$y_{t,+} = \begin{cases} y_t - y_{t-1} & \text{if } y_t - y_{t-1} > 0 \\ 0 & \text{otherwise} \end{cases}
\qquad ; \qquad
y_{t,-} = \begin{cases} y_t - y_{t-1} & \text{if } y_t - y_{t-1} < 0 \\ 0 & \text{otherwise} \end{cases}$$
where yt is income after taxes, deflated by the CPI. I allow the growth coefficients to be different depending
on whether the growth is positive or negative, to address the possibility that consumers respond differently
to a rise than to a fall in income.
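A minimal sketch of how the two income-growth regressors of equation (8) could be constructed for one household is given below. The function name and the sample series are hypothetical, and the series is treated as log CPI-deflated income, following the lower-case convention adopted earlier.

```python
import numpy as np

def split_income_growth(y):
    """Split year-over-year growth into its positive and negative parts.

    Returns two arrays, one entry per year after the first: the growth when
    it is positive (else 0), and the growth when it is negative (else 0).
    """
    growth = np.diff(np.asarray(y, dtype=float))
    return np.maximum(growth, 0.0), np.minimum(growth, 0.0)

# Hypothetical log real income for one household over four years.
growth_up, growth_down = split_income_growth([9.0, 9.2, 9.1, 9.1])
print(growth_up)    # positive part of each year's change
print(growth_down)  # negative part of each year's change
```

Entering the two parts as separate regressors lets the response to a rise in income differ from the response to a fall.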
II. DATA
A. Creation of Observations
Each PSID record reports current food consumption and descriptive variables, and lagged income.
Thus for each observation, income is taken from the same household one wave later, so as to make all of the
variables contemporaneous.
B. Selection Criteria: Whose True Cost of Living?
Given sufficient observations, my estimation method can be applied to any population group of
interest. If the cost of living and its growth vary across groups, there is no virtue to aggregating groups so
as to obtain a random sample of the nation’s population. Rather, one should select a group of interest, and
then make further selection cuts based on considerations of data quality. This study concentrates on white
two-adult families (with no selection limits on the number of children).
Among white 2-adult families, I eliminate the poverty sample; families receiving Food Stamps or
Aid to Families with Dependent Children; families for whom after-tax income was less than $150 or
top-coded; families for whom taxes were not reported; families in which husband or wife was less than 21
years old; families which reported a change in composition during the past year; and families which
reported spending less than 2 percent, or more than 80 percent, of income on food.
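The exclusion rules above can be written as a single filter. The sketch below is illustrative only: the `Household` fields are hypothetical names, not PSID variable names, and the restriction to white two-adult families is assumed to have been applied upstream.

```python
from dataclasses import dataclass

@dataclass
class Household:
    # All field names are hypothetical; they are not PSID variable names.
    in_poverty_sample: bool
    receives_food_stamps_or_afdc: bool
    after_tax_income: float
    income_top_coded: bool
    taxes_reported: bool
    husband_age: int
    wife_age: int
    composition_changed: bool
    food_share: float  # food at home divided by after-tax income

def keep(h: Household) -> bool:
    """Return True if the household survives every exclusion listed in the text."""
    return (
        not h.in_poverty_sample
        and not h.receives_food_stamps_or_afdc
        and h.after_tax_income >= 150
        and not h.income_top_coded
        and h.taxes_reported
        and h.husband_age >= 21
        and h.wife_age >= 21
        and not h.composition_changed
        and 0.02 <= h.food_share <= 0.80
    )

# A made-up household that passes every rule.
typical = Household(False, False, 10_000.0, False, True, 40, 38, False, 0.15)
print(keep(typical))
```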
C. 25-SMSA and National Samples
I work with two samples. The first is the entire national data set for the PSID, subject to the
restrictions noted above. The virtue of this data set (relative to the second one) is size; there are
approximately 1800 valid observations per year. The second data set includes only observations from
SMSA’s for which BLS reports local Consumer Price Indices. Table 1 gives summary statistics for the
1974 and 1991 components of the sample respectively, for both samples. More detailed summary
statistics, including statistics for intervening years, are reported in the appendix
(www.econ.jhu.edu/People/Hamilton).
Table 1
Summary Statistics of Full and 25-SMSA Sample
1974 and 1991

                                             FULL SAMPLE          25-SMSA SAMPLE
                                             1974      1991       1974           1991
Number                                       1450      1901       409            403
CPI-deflated After-tax Income (1968 = 1)     10411     12120      12008          13784
# Children                                   1.15      1.03       1.05           0.92
County Unemployment                          5.51      5.57       6.43           5.59
Husband’s Hrs Work                           1984      1902       1957           1863
Wife’s Hrs Work                              877       1150       855            1044
Husband’s Education                          12.27     13.21      13.07          13.83
Wife’s Education                             12.00     13.04      12.27          13.43
Husband’s Age                                44.83     45.09      44.10          45.69
Wife’s Age                                   42.01     42.80      41.68          43.28
Fd Share @ Home                              0.1664    0.1204     0.1572         0.1196
Fd Share Restaurant                          0.025     0.031      0.027          0.030
# SMSA’s                                                          23             25
Food Price (std deviation)                                        1.10 (.015)    0.972 (.039)
III. RESULTS
A. Geographic Price Variation
Tables 2 and 3 give the regression results based on the 25-SMSA sample with cross-section price
data. Below I discuss the interpretation of the time dummies, and the income and price coefficients (which
are necessary to map the time dummies into estimates of CPI bias). Discussion of other background
coefficients is postponed until the end of the empirical section.
The income and price coefficients imply approximate income and price elasticities of respectively
+0.33 and -0.65. Not surprisingly, given the very small variation in measured relative price of food, the
price coefficient is not estimated precisely. And the implied price elasticity seems high. To some readers,
the income elasticity looks low; it is much lower, for example, than that estimated by James Tobin (1950),
though it is comparable to those obtained by Deaton and Muellbauer (1980). Three points are in order:
First, in that specification below in which I report results based on a 3-year average of income, the elasticity
rises to about 0.4. Second, my estimated elasticity is for food at home, which is demanded more
inelastically than total food.8 Third, the Tobin estimate cited above is based on a population with much
lower income than my PSID sample. If $\beta$ < 0 (and constant) the elasticity falls as income rises. 9
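The quoted elasticities can be approximated from the Table 2 coefficients. The sketch below is a rough check, not the author's calculation: the evaluation share of 0.15 and the food weight of 0.15 in the overall price index are assumptions made for this example, and the own-price formula is derived here from Equations 1 and 2 rather than quoted from the paper.

```python
def income_elasticity(beta, share):
    """Income elasticity of food demand implied by a share equation in log income."""
    return 1.0 + beta / share

def own_price_elasticity(gamma, beta, share, alpha):
    """Own-price elasticity when the food price also enters the income deflator
    with weight alpha (so the share responds to the food price by gamma - beta*alpha)."""
    return -1.0 + (gamma - beta * alpha) / share

BETA = -0.101    # ln Income coefficient, Table 2
GAMMA = 0.0369   # ln Rel Food Price coefficient, Table 2
SHARE = 0.15     # assumed food share at which elasticities are evaluated
ALPHA = 0.15     # assumed weight of food in the overall price index

eta_income = income_elasticity(BETA, SHARE)
eta_price = own_price_elasticity(GAMMA, BETA, SHARE, ALPHA)
print(f"income elasticity:    {eta_income:.2f}")
print(f"own-price elasticity: {eta_price:.2f}")
```

Under these assumptions the results land close to the +0.33 and -0.65 reported in the text, and the income elasticity falls as the food share falls, which is the sense in which a constant negative income coefficient implies a lower elasticity at higher incomes.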
Table 2
Background Coefficients
25-SMSA Sample
(standard errors in parentheses)

Variable                       Coefficient    Standard Error
Constant                        0.980         (0.0327)
Age of Head                     0.0008        (0.00016)
Age of Wife                     0.0002        (0.00015)
# of Children                   0.0212        (0.00066)
County Unemp                   -0.00002       (0.00034)
Hd ann hrs x 10^3               0.0064        (0.00094)
Wf ann hrs x 10^3              -0.0013        (0.00078)
Head education                 -0.00089       (0.00032)
Wife education                 -0.0011        (0.00036)
ln Income                      -0.101         (0.00188)
ln Rel Food Price               0.0369        (0.0253)
Income Growth +                 0.0022        (0.00188)
Income Growth -                -0.025         (0.00269)
Fdshare-rest                    0.063         (0.01228)

Standard Error of Estimate      0.061
Adj R^2                         0.538
Table 3: Time and SMSA Coefficients
Geographic Sample
(standard errors in parentheses)

YEAR     Coefficient (std error)     Cumul. Bias estimate
1975     -0.0055 (0.0037)            0.053
1976     -0.0086 (0.0037)            0.082
1977     -0.0097 (0.0036)            0.092
1978     -0.0142 (0.0036)            0.131
1979     -0.0169 (0.0036)            0.154
1980     -0.0218 (0.0040)            0.194
1981     -0.0245 (0.0040)            0.215
1982     -0.0241 (0.0047)            0.212
1983     -0.0274 (0.0050)            0.238
1984     -0.0277 (0.0049)            0.240
1985     -0.0279 (0.0052)            0.241
1986     -0.0295 (0.0048)            0.253
1987     -0.0291 (0.0049)            0.250
1988
1989
1990     -0.0303 (0.0049)            0.259
1991     -0.0312 (0.0049)            0.266

SMSA                      Coefficient (std error)
New York                   0.0293 (0.0044)
Miami                      0.0215 (0.0055)
Los Angeles                0.0183 (0.0047)
Buffalo                    0.0182 (0.0135)
San Francisco              0.0178 (0.0092)
Portland, OR               0.0173 (0.0107)
Chicago                    0.0134 (0.0056)
Cincinnati                 0.0130 (0.0102)
Houston                    0.0117 (0.0066)
San Diego                  0.0094 (0.0136)
Washington, D. C.          0.0083 (0.0074)
Philadelphia               0.0080 (0.0080)
Milwaukee                  0.0072 (0.0329)
Detroit                    0.0069 (0.0066)
Boston                     0.0063 (0.0076)
Baltimore                  0.0056 (0.0064)
Denver                     0.0055 (0.0024)
Pittsburgh                 0.0032 (0.0072)
Kansas City                0.0019 (0.0040)
Seattle                    0.0006 (0.0008)
Atlanta                    0      (omitted)
St. Louis                 -0.0020 (0.0024)
Cleveland                 -0.0041 (0.0013)
Dallas                    -0.0110 (0.0062)
Minneapolis St. Paul      -0.0110 (0.0004)
The solid line in figure 3 gives point estimates for cumulative bias, from 1975 through 1991,
assuming the price coefficient takes on its estimated value of 0.0369. In most but not all cases the
difference between successive-year dummy coefficients is significant. Generally two-year differences are
significant. The average annual bias from 1974 through 1981 is 3.54 percent; for the period 1982-1991,
average annual bias is 0.67 percent.
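The cumulative bias column of Table 3 can be recovered from the year dummies and the income coefficient. The sketch below assumes food and nonfood are equally biased, so that the log CPI error is minus the dummy divided by the income coefficient, and it reports the bias as the positive fraction by which the measured price level overstates the true one. The function name is illustrative; the coefficients are taken from Tables 2 and 3 for a selection of years.

```python
import math

BETA = -0.101  # ln Income coefficient, Table 2

# Year: (dummy coefficient, cumulative bias as reported), from Table 3.
TABLE_3 = {
    1975: (-0.0055, 0.053),
    1978: (-0.0142, 0.131),
    1981: (-0.0245, 0.215),
    1987: (-0.0291, 0.250),
    1991: (-0.0312, 0.266),
}

def cumulative_bias(delta_t, beta=BETA):
    """Cumulative CPI overstatement implied by a year dummy coefficient."""
    log_error = -delta_t / beta        # log of (1 + cumulative error); negative here
    return 1.0 - math.exp(log_error)   # expressed as a positive fraction

for year, (delta_t, reported) in TABLE_3.items():
    print(year, round(cumulative_bias(delta_t), 3), reported)
```

For these years the computed values agree with the reported column to within rounding.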
If the true value of $\gamma$ is higher than the estimated value of 0.0369, then part of the decline in food’s
share from 1975 through 1985 which we attribute to CPI bias is really due to the decline in the measured
price of food, and our estimate of CPI bias for this period is biased upward. By contrast, our estimate of
bias since 1985 is biased downward. The second series in Figure 3 shows cumulative bias under the
alternative assumption that $\gamma$ = 0.07 (price elasticity of approximately 0.45). As can be seen, the effect is
quite modest, and the overall story is essentially unchanged: annual bias is around 3 percent from 1975
through 1981 and just under 1 percent since.
Figure 3
Cumulative Bias Estimate
Alternative Price Coefficient ($\gamma$) Values
[Plot not reproduced.]
B. The National Sample: No Price-Variation Data
The limited size of the geographic sample, dictated by the limited coverage of local CPI deflators,
precludes several important lines of inquiry regarding Engel curves and their implications for CPI bias.
Accounting for permanent income (with income lagged three years as one of the instruments) and testing
alternative functional forms both require more observations. Furthermore, if this technique is to be applied
to less numerous population subgroups (blacks or retirees for example) it is not practical to be restricted to
the 25-SMSA subsample.
To explore these questions I use the entire PSID sample (subject to the criteria described above),
and estimate equation (5) and appropriate variants, omitting the relative price term and the SMSA
dummies. I refer to equation (5) but without the price term or the SMSA dummies as equation (5a).
The form without the food price or the SMSA dummies is misspecified. A simple if inelegant
specification test is to run equation (5a) over the geographic sample, and to compare the results with
those of equation (5). When I do this, the adjusted R^2 falls from 0.538 to 0.501. Both the background and
time-dummy coefficients (corrected for omission of time-series variation in the relative measured price of
food, see below) are very close to the equation-(5) specifications.
By omitting the food-price term from (5a), we eliminate not only cross-section variation in the
relative price of food, but time-series variation as well. The year dummy coefficients pick up not only the
CPI bias of equation (6) but also the effect of intertemporal variation in the relative (CPI-measured, but
omitted because of perfect correlation with the year dummies) price of food. Proper interpretation of the
year dummy coefficient is given by equation (9). CPI bias is found by subtracting $\gamma(\pi_{f,t} - \pi_{n,t})$ from the
dummy coefficient, and then dividing by $-\beta$ (assuming, as discussed above, that (1-r) is close to zero).

Equation 9

$$\varepsilon_t = -\,\frac{\delta_t - \gamma\left(\pi_{f,t} - \pi_{n,t}\right)}{\beta}$$
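The correction in equation (9) can be illustrated numerically. In the sketch below the 1991 dummy and the income coefficient come from the current-income column of Table 4; the price coefficient is borrowed from the 25-SMSA regression and the 8 percent fall in the relative CPI of food is taken from the introductory discussion of Figure 2. Both of those choices are assumptions of this example, not inputs the text confirms for the national sample.

```python
import math

def log_cpi_error(delta_t, rel_food_price_change, beta, gamma):
    """Log cumulative CPI error from a year dummy, net of the measured
    change in the log relative price of food."""
    return -(delta_t - gamma * rel_food_price_change) / beta

DELTA_1991 = -0.0374         # 1991 dummy, current-income column of Table 4
BETA = -0.097                # Ln Income coefficient, same column
GAMMA = 0.0369               # assumed: price coefficient from the 25-SMSA sample
REL_PRICE = math.log(0.92)   # assumed: 8 percent fall in relative food CPI

unadjusted = log_cpi_error(DELTA_1991, 0.0, BETA, GAMMA)
adjusted = log_cpi_error(DELTA_1991, REL_PRICE, BETA, GAMMA)
print(f"log error ignoring relative price: {unadjusted:.3f}")
print(f"log error net of relative price:   {adjusted:.3f}")
```

Because the measured relative price of food fell, part of the decline captured by the dummy is a price effect, and the adjusted error is smaller in absolute value than the unadjusted one.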
The results appear in Table 4. The first column reports results for all households in the sample.
Aside from omission of the relative price term and the SMSA dummies (and the expanded sample size) this
is identical to the regression of Tables 2 and 3. In the next two columns I run the same specification
separately over owners and renters. Inasmuch as my income variable makes no allowance for imputed
income from owner-occupied housing, one could argue that my findings are biased. In particular, if such
imputed income rose over time, then it is possible that failure to account for this source of income, rather
than CPI bias, accounts for my findings.
In the final column current income is replaced by the household’s average income over the past
three years. The latter variable is a better approximation to permanent income.
First, note the estimates based respectively on current and 3-year average (CPI-deflated) income.
Most of the background coefficients show little change. The estimated dummy coefficients are somewhat
larger and the estimated CPI biases are also a bit higher. (Cumulative estimated CPI bias from 1974
through 1991 is 19 percent greater under the 3-year-average than under the current-income specification.)
The fit is somewhat worse under the 3-year-average specification.
Table 4
Regression Coefficients
Equation 5a: National Sample
(Standard errors in parentheses)

                      CUR. INCOME,
VARIABLE              ALL OBSN’S            OWNERS                RENTERS               3 YR AVG INC
Constant               0.997    (0.008)      0.912    (0.0080)     1.059   (0.021)       0.906    (0.0070)
Age of Head            0.0005   (0.0007)     0.0005   (0.0001)     0.0007  (0.0002)      0.00023  (0.00004)
Age of Wife            0.0002   (0.00008)    0.0002   (0.00009)    0.0003  (0.00026)     0.00068  (0.00003)
# of Children          0.020    (0.0004)     0.019    (0.0004)     0.0226  (0.0009)      0.022    (0.0004)
County Unemp           0.0004   (0.00013)    0.0003   (0.0001)     0.0006  (0.0004)      0.00051  (0.0001)
Head ann hrs x 10^3    0.0052   (0.00049)    0.0044   (0.0005)     0.0069  (0.0013)      0.003    (0.0006)
Wife ann hrs x 10^3   -0.0017   (0.0004)    -0.002    (0.0004)    -0.002   (0.0011)     -0.0035   (0.0011)
Head education        -0.00004  (0.00015)   -0.00007  (0.0002)    -0.001   (0.0004)     -0.0014   (0.0001)
Wife education        -0.0013   (0.0002)    -0.0012   (0.0002)    -0.0013  (0.0005)      0.00084  (0.0001)
Ln Income             -0.097    (0.00095)   -0.0875   (0.0009)    -0.104   (0.0025)     -0.089    (0.0008)
Inc Growth +           0.0014   (0.00088)    0.0021   (0.0010)     0.001   (0.0023)     -0.026    (0.0016)
Inc Growth -          -0.056    (0.002)     -0.054    (0.0021)    -0.060   (0.0049)     -0.125    (0.0019)
Fdshare-out            0.067    (0.0127)     0.089    (0.01373)    0.056   (0.031)       0.043    (0.0104)
1975                  -0.00561  (0.0022)    -0.0064   (0.0023)    -0.0004  (0.0057)     -0.00404  (0.0021)
1976                  -0.00668  (0.0022)    -0.0083   (0.0023)    -0.0004  (0.0057)     -0.00802  (0.0021)
1977                  -0.00951  (0.0021)    -0.012    (0.0022)    -0.0031  (0.0053)     -0.0121   (0.0021)
1978                  -0.0118   (0.0021)    -0.0124   (0.0022)    -0.012   (0.0054)     -0.0137   (0.0021)
1979                  -0.0131   (0.0021)    -0.013    (0.0022)    -0.018   (0.0055)     -0.0156   (0.0021)
1980                  -0.0210   (0.0021)    -0.0216   (0.0022)    -0.022   (0.0056)     -0.0235   (0.0021)
1981                  -0.0217   (0.0022)    -0.0235   (0.0022)    -0.014   (0.0058)     -0.0235   (0.0021)
1982                  -0.0260   (0.0022)    -0.0256   (0.0023)    -0.0289  (0.0059)     -0.0297   (0.0022)
1983                  -0.0246   (0.0022)    -0.0249   (0.0023)    -0.0259  (0.0058)     -0.0289   (0.0022)
1984                  -0.0291   (0.0021)    -0.0300   (0.0022)    -0.0269  (0.0056)     -0.0345   (0.0021)
1985                  -0.0297   (0.0022)    -0.0285   (0.0022)    -0.0377  (0.0057)     -0.0334   (0.0021)
1986                  -0.0306   (0.0021)    -0.0297   (0.0022)    -0.0340  (0.0055)     -0.0327   (0.0021)
1987                  -0.0316   (0.0021)    -0.0306   (0.0021)    -0.0373  (0.0053)     -0.0359   (0.0020)
1990                  -0.0356   (0.0020)    -0.0347   (0.0022)    -0.0396  (0.0050)     -0.0380   (0.0020)
1991                  -0.0374   (0.0020)    -0.0367   (0.0021)    -0.0402  (0.0053)     -0.0409   (0.0020)

Standard Error
of Estimate            0.058                 0.060                 0.057                 0.060
Adj R^2                0.504                 0.502                 0.539                 0.475
Figure 4 shows point estimates of annual cumulative bias respectively for the whole sample, for
renters and for homeowners. Overall, the renter and homeowner series are in very close agreement. This
strongly suggests that my estimated CPI bias is not driven by errors in variables associated with failure to
incorporate the imputed rental income of homeowners.
The pattern of estimated CPI bias is quite similar to that reported with the 25-SMSA sample. When
using current income as the regressor I estimate that annual CPI bias was 2.75 percent from 1974 through
1981 and 1.51 percent from 1981 to 1991. When using a three-year average, the analogous annual bias
estimates are 2.9 percent from 1974 to 1981 and 1.72 percent from 1981 to 1991.
Figure 4
Cumulative Bias Estimates
Owners, Renters, All
[Plot not reproduced. Vertical axis: bias, 0 to 0.3. Horizontal axis: year, 1975 to 1990. Series: all, Owner, Renter.]
IV. BACKGROUND COEFFICIENTS
In the above regressions, I have included a series of background regressors, some of which are of
considerable interest in their own right. I restrict my discussion to their values in the final regression.
A. Dynamics
Table 5 below illustrates the dynamics of food consumption as income changes. First, I calculate
food consumption (in dollars) at new steady-state income = $11,000 (in 1968 dollars) and at the sample
means of all of the other regressors, setting both income-growth variables equal to zero (this gives the
steady-state value of food’s share, which I solve for food consumption). Then I simulate respectively a 10 percent rise in
income (to the new steady state) and a 10 percent decline in income (to the same new steady state).
Reading down Columns 1 and 2, income was originally $10,000 and food consumption was $1354.
In row 2 income jumps 10 percent (to $11,000); food consumption immediately rises to $1395. Row 3
reveals that the new steady state food consumption is $1395. When income rises consumers immediately
jump to the new steady state, apparently regarding the income increase as permanent. By contrast, when
income falls in Columns 3 and 4 (from $12,220 to $11,000), white consumers make essentially no adjustment in the
first year (food consumption moves only from $1441 to $1440); they appear to regard a fall in income as completely transitory.
Table 5
Food-Consumption Response to 10 Percent Income Change

                      INCOME RISES         INCOME FALLS
                      (1)       (2)        (3)       (4)
                      Food      Income     Food      Income
Old Steady State      1354      10000      1441      12220
Transition            1395      11000      1440      11000
New Steady State      1395      11000      1395      11000
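The asymmetry in Table 5 can be summarized by asking what fraction of the eventual change in food consumption occurs in the transition year. The sketch below does this arithmetic using only the dollar figures reported in the table; the dictionary layout and the helper names `pct_change` and `adjustment_summary` are illustrative assumptions, not part of the paper's estimation.

```python
# (food, income) pairs in dollars, copied from Table 5.
TABLE_5 = {
    "income_rises": {"old": (1354, 10000), "transition": (1395, 11000), "new": (1395, 11000)},
    "income_falls": {"old": (1441, 12220), "transition": (1440, 11000), "new": (1395, 11000)},
}


def pct_change(new, old):
    """Percentage change from old to new."""
    return 100.0 * (new - old) / old


def adjustment_summary(rows):
    """Summarize one scenario: total changes and first-year adjustment share."""
    food_old, income_old = rows["old"]
    food_transition, _ = rows["transition"]
    food_new, income_new = rows["new"]
    return {
        "income_pct": pct_change(income_new, income_old),
        "food_pct": pct_change(food_new, food_old),
        # Share of the eventual food adjustment completed in the first year.
        "first_year_share": (food_transition - food_old) / (food_new - food_old),
    }


summary = {name: adjustment_summary(rows) for name, rows in TABLE_5.items()}
for name, s in summary.items():
    print(name, {k: round(v, 3) for k, v in s.items()})
```

On these figures the first-year share is 1 when income rises (41 of 41 dollars) and about 0.02 when income falls (1 of 46 dollars), which is the asymmetry described in the text.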
B. Eating Out
If home-cooked and restaurant meals are perfect substitutes then the dependent variable should be the
budget share of total food; otherwise the relative price of restaurant meals should enter on the right-hand
side. Neither restaurant food expenditure nor its price seems to be available in high-quality form. The
former is reported by the PSID but only in fairly broad bands and there are many zero entries, and the latter
has the same problems as the food CPI discussed above. I have approached the problem in two ways.
First, despite data-quality concerns I ran the regressions of Table 4 above with total food’s share as the
dependent variable. None of the point estimates was materially affected, but the fit was considerably worse.
I do not report these results. Second, as can be seen, I added the budget share of food away from home as a
right-hand side regressor. Inclusion has no material effect on other coefficients. The coefficient is always
small and positive (!), and significant.
It is implausible that home-cooked and restaurant meals are complements as implied by the coefficient;
perhaps the restaurant variable is picking up a fixed gourmand effect. In any event the effect is very small; a
1-point rise in restaurant-food’s share increases home-food’s share by 0.06 points.
V. ROBUSTNESS CHECKS
I have subjected the results presented above to numerous robustness checks, all of which support
the broad pattern of findings reported here. The robustness checks fall into three categories. First, I have
included other regressors. Second, I have explored alternative functional forms, including a quadratic term
in income, replacing food’s share with its natural logarithm, and interacting income with various of the
other regressors. Third, I have used instrumental variables to attempt to get a better measure of permanent
income. In each such case, the Engel curves drift at essentially the same rate as reported here, and with the
same significance level. As these checks have no material bearing on the fundamental findings of this
paper, I do not present them here. A detailed description of this work is available in the appendix to this
paper, which can be found at www.econ.jhu.edu/People/Hamilton.
VI. CONCLUSION
In this paper I estimate cross-section food Engel curves for white households in the PSID, from
1974 through 1991. If the Engel curves are properly specified, and if there are no systematic errors in the
data, the coefficients of these curves should not move from year to year. But in fact the Engel curves drift
consistently to the left from year to year. The observed decline in food’s budget share is far too large to be
explained by intertemporal movement in CPI-deflated income, or in any of the other included regressors. I
am able to reconcile the cross-section income coefficient with time-series movement in food’s budget share
by invoking CPI bias of approximately 3 percent per year from 1974 through 1981, and of
approximately one percent per year from 1981 through 1991.
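The logic of this reconciliation can be illustrated with a small simulation. This is a sketch under stated assumptions, not the paper's specification: the data are synthetic, the Engel curve is a simple share-on-log-income form with no other regressors, and the parameter values, the function name `simulate_and_recover`, and the conversion `1 - exp(-delta/beta)` from drift to cumulative bias are all illustrative choices. If food's share depends on true real income but the regression uses CPI-deflated income, the year dummies absorb the gap, and dividing each dummy coefficient by the income coefficient recovers the cumulative log overstatement of the price level.

```python
import numpy as np


def simulate_and_recover(n_per_year=2000, n_years=5, beta=-0.10,
                         annual_log_bias=0.03, seed=0):
    """Recover an assumed CPI bias from Engel-curve drift in synthetic data."""
    rng = np.random.default_rng(seed)
    years = np.repeat(np.arange(n_years), n_per_year)

    # True log real income for each synthetic household.
    y_true = rng.normal(9.0, 0.5, size=years.size)

    # Assumption: the CPI overstates the price level by a constant number of
    # log points per year, so CPI-deflated income understates true income.
    y_cpi = y_true - annual_log_bias * years

    # Food's budget share depends on TRUE real income (beta < 0: Engel's Law).
    share = 1.1 + beta * y_true + rng.normal(0.0, 0.01, size=years.size)

    # Regress the share on CPI-deflated log income and year dummies
    # (year 0 is the omitted base year).
    dummies = (years[:, None] == np.arange(1, n_years)[None, :]).astype(float)
    X = np.column_stack([np.ones(years.size), y_cpi, dummies])
    coef, *_ = np.linalg.lstsq(X, share, rcond=None)

    beta_hat, delta_hat = coef[1], coef[2:]
    log_bias = delta_hat / beta_hat          # cumulative log overstatement
    cumulative_bias = 1.0 - np.exp(-log_bias)
    return beta_hat, log_bias, cumulative_bias


beta_hat, log_bias, cumulative = simulate_and_recover()
print("income coefficient:", round(float(beta_hat), 4))
print("cumulative log bias by year:", np.round(log_bias, 3))
print("cumulative bias by year:", np.round(cumulative, 3))
```

In this synthetic setting the year-dummy coefficients are negative and grow in magnitude over time, the analogue of Engel curves drifting to the left.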
The indirect method of estimating CPI bias, developed in this paper, is both a strength and a
weakness. The weakness is that the entire unexplained movement of the Engel curve is attributed to CPI
bias. If other forces have shifted the Engel curve, then some of the attribution to CPI bias is spurious.
Possibilities include taste shifts, systematic (time-related) errors in other variables, omitted variables and
undiscovered errors in functional form. The robustness of the results to the various specification tests
described both in the body of the paper and the appendix gives some comfort on most of these issues.
The strength of the approach is that the true cost of living is inferred directly from consumers’
behavior. From 1974 through 1991, consumers have behaved (based on their food budgets) as if they have
been getting richer at a rate substantially faster than the growth of CPI-deflated disposable income.
References
Anderson, Heather, and Farshid Vahid. “On the Correspondence between Individual and Aggregate Food
Consumption Functions: Evidence from the USA and the Netherlands.” Journal of Applied Econometrics,
September/October 1997, Vol. 12, pp. 477-507.
Banks, James, Richard Blundell and Arthur Lewbel. “Quadratic Engel Curves and Consumer Demand.”
Review of Economics and Statistics, November 1997, Vol. 79, pp. 527-539.
Boskin, Michael J., Ellen R. Dulberger, Robert J. Gordon, Zvi Griliches, Dale Jorgenson. “Final Report of
the Advisory Commission to Study the Consumer Price Index.” Washington, D. C.: United States Senate
Committee on Finance, U. S. Government Printing Office, 1996.
Braithwait, Steven. “The Substitution Bias of the Laspeyres Price Index.” American Economic Review,
March 1980, Vol. 70, pp. 64-77.
Costa, Dora L. “Less of a Luxury: The Rise of Recreation since 1888,” National Bureau of Economic
Research Working Paper #6054, 1997.
Deaton, Angus, and John Muellbauer. “An Almost Ideal Demand System.” American Economic Review,
June 1980, Vol. 70, pp. 312-326.
Houthakker, H. S. “Engel’s Law.” John Eatwell, Murray Milgate and Peter Newman, The New Palgrave
Dictionary of Economics, Vol. 2. London: Macmillan Press, 1987, pp. 143-144.
Jorgenson, Dale, and D. T. Slesnick. “Individual and Social Cost of Living Indexes.” Dale
Jorgenson, Welfare, Volume 2: Measuring Social Welfare. Cambridge, MA: MIT Press, 1997, pp. 39-98.
Krueger, Alan B., and Aaron Siskind. “Assessing Bias in the Consumer Price Index from Survey Data.”
Working Paper #392, Industrial Relations Section, Princeton University, 1997.
Leser, Conrad E. V. “Forms of Engel Functions.” Econometrica, October 1963, Vol. 31, No. 4, pp. 694-703.
Moulton, Brent. “Bias in the Consumer Price Index: What is the Evidence?” Journal of Economic
Perspectives, Fall 1996, Vol. 10, no. 4, pp. 159-77.
Nakamura, Leonard. “Is U. S. Economic Performance Really That Bad?” Working Paper # 95-21/R,
Federal Reserve Bank of Philadelphia, 1996.
Nordhaus, William. “Quality Changes in Price Indexes.” Journal of Economic Perspectives, Winter 1998,
Vol. 12, no. 1, pp. 59-68.
Shapiro, Matthew, and David Wilcox. “Mismeasurement in the Consumer Price Index: An Evaluation.”
Ben Bernanke and Julio Rotemberg, NBER Macroeconomics Annual Cambridge, MA: MIT Press, 1996,
pp. 93-142.
Tobin, James. “A Statistical Demand Function for Food in the U. S. A.” Journal of the Royal Statistical
Society, 1950, Vol 113, Part II, pp. 113-141.
Working, Holbrook. “Statistical Laws of Family Expenditure.” Journal of the American Statistical
Association, March 1943, Vol. 38, pp. 43-56.
*The Johns Hopkins University, Baltimore, MD 21218, and the Park School. I thank Carl Christ, Karen
Dynan, Jennifer Hunt, Larry Ball, Robert Moffitt, and Matthew Shapiro for helpful comments on prior drafts.
1
Using aggregate data, Nakamura estimates CPI bias in the 1970s relative to its bias in prior decades. In
the absence of cross-section data, he is unable to estimate the absolute level of bias, or to check whether his
bias estimate is sensitive to such other variables as changes in female labor force participation. (And indeed,
estimation of bias is of secondary importance to Nakamura.)
2
Jorgenson and Slesnick is similar in approach to Steven D. Braithwait (1980), though more extensive and
detailed.
3
Figure 1 also shows the average value of food’s budget share (at home plus away from home) for the PSID
sample upon which I base my estimation below. Whereas my estimates are based on food at home, here I
show food at home and away from home to show the comparability with the NIPA numbers. Note the
anomalous appearance of the PSID data for 1969-72.
4
Throughout this paper, when referring to “the CPI,” I will mean the CPI-U and its components.
5
The time span is dictated by the data. Food’s budget share in the PSID is erratic and inconsistent with
National Income Accounts data for 1968-72 (see figure 1), and not reported for 1973, 1988, or 1989. When
I began this research the 1992 wave (1991 income data) was the last one available. In addition, after the
1991 wave, PSID stopped reporting federal income tax liability.
6
James Banks, Richard W. Blundell and Arthur Lewbel (1997) fit the Working/Leser/AIDS form, with
higher order terms in income, for several goods. For most goods the quadratic in income is significant, but
for food they do not reject the linear form.
7
Equation 2 cannot be exactly right; it is the cost of living index for the Cobb-Douglas utility function, for
which food’s budget share is constant. However, the specific form of (2) plays no role except in equation
(5) below. In particular, so long as the CPI bias in food and nonfood are similar, the specification error in
(2) has no bearing on the analysis.
8
If 20% of food is at restaurants (about right for 1991), and restaurant-food demand is unit elastic, the
overall elasticity of demand for food rises from 0.4 to 0.52 (or from 0.33 to 0.39).
9
This is the pattern found by Heather M. Anderson and Farshid Vahid (1997), who revisit Tobin’s analysis.
Using his (log-linear) demand function, they estimate that the cross-section income elasticity fell from .59 in
1941 to .38 in 1972.