Standard Errors for Indexes From Complex Samples
Author(s): Leslie Kish
Source: Journal of the American Statistical Association, Vol. 63, No. 322 (Jun., 1968), pp. 512-529
Published by: American Statistical Association
Stable URL: http://www.jstor.org/stable/2284022
Accessed: 25/07/2014 11:52
This content downloaded from 193.0.66.11 on Fri, 25 Jul 2014 11:52:52 AM. All use subject to JSTOR Terms and Conditions.

STANDARD ERRORS FOR INDEXES FROM COMPLEX SAMPLES*

LESLIE KISH
University of Michigan

Methods of computing standard errors were developed and applied for several statistics of importance in economic and social surveys. These statistics are based on ratio means $r = y/x$; $x$ is often the variable sample size of complex samples. Some statistics involve weighted sums $\sum_j w_j r_j$; others concern the relatives $R_1 = r_1/r_0 = (y_1/x_1)/(y_0/x_0)$, the ratio of current (1) to base (0) mean. From these relatives the indexes $I_1 = (1/J)\sum_j R_{1j}$ are constructed, and changes, $I_2 - I_1$, in the index. We develop computing formulas and helpful approximations for standard errors of these statistics, and apply them to a series of data from our Center's Surveys of Consumer Attitudes & Expectations. A large group of empirical results yield evidence on the behavior of these statistics, with wide implications for the design of surveys to measure economic and social indicators.

* Supported by grant GS-777 for "Analytical Statistics for Complex Samples" from the National Science Foundation, and presented at the 1962 Annual Meetings of the American Statistical Association. I am grateful for the collaboration of R. K. Pillai who took complete charge of the mass of complex computations; also for helpful suggestions from W. Scott Maynes and from one of the editors.

1. STANDARD ERRORS FOR MEANS AND THEIR CHANGES

The use of sample surveys for collecting social and economic data is increasing in scope. Important indicators are more and more based on sample surveys. Many of these resemble in the essentials of sample design the Surveys of Consumer Expectations [Katona and Mueller, 1957, Katona 1964, Mueller 1963] that provided both stimulation and data for our investigations. The directors of the survey designed the methods and scales used for collecting the data. They also designed the forms of the indicators derived from the data: the scores (means), the relatives, and the indexes. The methods of measurement, estimation, and index formation are not the subject of this paper; they are the subject of continuing and lively discussion in economic and statistical journals.1

1 Professor Katona kindly added these remarks: Survey questions on buying intentions and on consumer attitudes or expectations have been asked by the Survey Research Center of the University of Michigan since 1946. Studies of decision making and of psychological antecedents of overt behavior (e.g., purchases) were called for because, with increasing affluence, millions of consumers acquired discretion of action and therefore their major expenditures became dependent not only on their ability to buy, but also on their willingness to buy. The purpose of these studies is prediction of future trends of purchases, as well as understanding of the factors responsible for change in the rate of purchases [Katona, 1964, and earlier publications quoted there]. The emphasis in the Center's studies shifted from buying intentions to attitudes and expectations, partly because of considerations of sampling errors and sampling variability presented in this paper, and partly because of a search for an early intercept of the decision-making process: changes in optimism or pessimism, confidence or uncertainty, etc., are assumed to occur prior to the formulation of specific buying plans. The Center's Index of Consumer Sentiment is available since 1952. Since 1963 it is constructed on a quarterly basis from items 1 to 5 inclusive (see Table 1). Items 6 to 15, and many other questions, are continued because of specific problems on which they shed light. The predictive value of the Index was studied by means of regression equations in which, for instance, the Commerce Department series of durable goods purchases, or the number of cars sold, or incurrence of installment debt, served as dependent variables. As independent variables the Index (or change in the Index) was used in conjunction with several other variables. The simplest regression equation, with the Index representing willingness to buy and income representing ability to buy as independent variables (both measured 6 to 9 months earlier than the expenditures on durables), yielded an $R^2$ of .91 for the period 1952-66 [Mueller, 1963, Katona 1967, and Maynes 1968].

Our first aim was empirical: to learn about the magnitudes and sources of the sampling variations affecting these important economic indicators. Our second aim is to present methods and results that should prove useful for similar indicators in other complex samples, which are not uncommon (though presentation of their sampling errors is).

The surveys are based on complex multistage samples, from which are computed means $r$, each the ratio of two random variables $y$ and $x$:

\[ r = \frac{y}{x} = \frac{\sum_{h}^{H} y_h}{\sum_{h}^{H} x_h} = \frac{\sum_{h}^{H}\sum_{a}^{a_h} y_{ha}}{\sum_{h}^{H}\sum_{a}^{a_h} x_{ha}}. \tag{1} \]

Typically $y$ is the sum of some characteristic in the sample and an estimate of the population value $Y$, except for a constant factor such as $f$, the sampling fraction. The random variable $x$ in the denominator is here (and often) the sample size; in general it estimates the population value $X$ in a form similar to the numerator. When $y/x$ is merely the sample mean it could be written simply $\bar{y} = y/n$; the more complex notation conforms to methods for computing variances, described in section 5, which approximate the complexities of the sample design. The sample is selected in $H$ primary strata and $a_h$ denotes the number of replicate (primary) selections from the $h$-th stratum. The quantities $y_{ha}$ and $x_{ha}$ represent sample totals for the $a$-th primary selection in the $h$-th stratum; $y_{ha}$ is the total score for an attitude and $x_{ha}$ the sample size; $y_h$ and $x_h$ are stratum totals. These formulas can be, and are, used generally for combined ratio estimators; the values $y_{ha}$ and $x_{ha}$ are understood to include any needed unequal weighting [Kish and Hess, 1959].

Data from six surveys are represented, each of about $n = 1370$ interviews, and two from each of three years: the base year 1956, and two current (at the time of this research) years, 1959 and 1960. Essentially the same methods of multistage probability area sampling were used in each survey to select dwellings with equal probabilities. In dwellings families were identified (1.04 per dwelling), and a single interview was taken from head or wife, alternately designated (or from head, if without a spouse). About 1320 dwellings came from about 400 segments and these from 66 primary sampling areas, either single counties or Standard Metropolitan Statistical Areas. These are widespread samples, with an average of 20 cases coming from an average of seven segments per area. Dwellings and segments are changed between surveys, but primary areas are not. Of the 66 areas 54 were selected with less than certainty and these contribute 27 strata, each containing two primary selections, to the computation of the variance. The other 12 are the largest metropolitan areas, included with certainty; here segments were the primary sampling units and these were paired into 18 computing strata. Altogether then the computation of the variance is based on $27 + 18 = 45 = H$ strata, with two primary selections in each [Kish and Hess, 1959 and Kish 1965, 7.3].

The 15 variates of Table 1 represent answers obtained in essentially the same form over the years; the scores in column 2 stand for the means of 6 scores obtained on 6 surveys. A score denotes the ratio mean of scales of attitudes of individuals of a survey sample. All 15 variates, as they appear here, denote trichotomies. For example, on the first item the response "Better off" has a value of 200, the negative response "Worse off" has a value of 0, and the neutral response has a value of 100; the neutral responses are mostly "Same," with fewer "Uncertain" and "Don't know," and very few "Unascertained." The score 110 expresses results from about 30 per cent "Better off" against 20 per cent "Worse off," with 50 per cent in the middle group; the deviation from 100 expresses the difference between positive and negative percentages. Of the 8 items (1-8) of attitudes and expectations, seven average more than 100, showing more positive than negative response. The 7 buying variates (9-15) denote buying intentions. On all 15 items positive directions are used to denote optimism, improvement, or intentions to buy.

TABLE 1. STANDARD ERRORS FOR SCORES (r) AND THEIR CHANGES (r2 - r1)

Item                                                                          Scores  se(r)       100 se(r)/r  se(r2-r1)  Fluctuations in 17 Surveys
(1)                                                                           (2)     (3)         (4)          (5)        (6)
 1. Evaluation of financial situation as compared with a year earlier            110  2.32 (.14)         2.11       2.95   6.26
 2. Expected change in financial situation                                       128  1.73 (.10)         1.35       2.37   3.78
 3. Business condition expected over the next 12 months                          156  2.29 (.31)         1.47       2.91  12.93
 4. Business condition expected for the next five years                          122  2.35 (.35)         1.93       2.54  10.83
 5. Good or bad time to buy large household goods                                129  2.74 (.31)         2.12       3.10   8.50
 6. Changing prices expected for next year is to the good or bad                  92  2.37 (.33)         2.58       3.15   9.68
 7. Evaluation of current business conditions compared to those a year ago       109  2.76 (.25)         2.53       3.72  27.37
 8. Expected business condition a year from now as compared with the present     120  1.68 (.35)         1.40       2.26  11.96
 9. Plan to buy house during the next year                                        14  1.39 (.27)         9.93       1.92   1.76
10. Intention to buy automobile during the next 12 months                         33  1.87 (.20)         5.67       2.42   2.94
11. Evaluation of chances of home repair                                          42  2.72 (.32)         6.48       3.70   4.25
12. Evaluation of chances of buying refrigerator                                  14  1.48 (.13)        10.57       2.06   1.60
13. Evaluation of chances of buying T.V.                                          14  1.23 (.22)         8.78       1.61   1.29
14. Evaluation of chances of buying cooking range                                 12  1.17 (.17)         9.75       1.64   1.55
15. Evaluation of chances of buying washing machine                               17  1.38 (.18)         8.12       1.92   1.69

The distributions of the 8 attitudinal variables (1-8) differ greatly from those of the 7 rare buying intentions (9-15); this difference runs through the entire analysis. The attitudinal items have fairly large middle groups, comprising about one-third to one-half; and they tend to be balanced, with both sides of moderate sizes.
On the contrary, the seven buying items represent J-shaped distributions. The "certain or probable" buyers are rather rare, the "slightly" inclined or "uncertain" are even fewer, but nonbuyers are numerous. For example, a score of 14 might typically consist of these proportions: 0.06 probable buyers with scale value 200 (in other words, 6 per cent with a score of 2); 0.02 possible buyers with scale value 100; and 0.92 nonbuyers with scale value 0; thus 12 + 2 + 0 = 14. Note that to estimate the proportion of buyers one should divide these scores and their standard deviations by two; but coefficients of variation and relatives remain unaffected thereby. Thus the five variates (numbered 9, 12-15) with scores near 14 denote buying intentions of about 7 per cent; variate 10 with scores of about 32 represents expected car buying by 16 per cent; and variate 11 with scores around 42 represents expected home repairs by 21 per cent. Because the middle groups are small, the statistical properties of these scores resemble those of the simple proportions of buyers; the latter have often been presented instead of the former in survey results.
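The arithmetic behind a score can be made concrete with a minimal sketch. The function names `score` and `buyer_share` are illustrative assumptions and do not come from the paper; the scale values 200, 100, and 0 and the two sets of proportions are the ones quoted in the text.

```python
def score(p_pos, p_mid, p_neg):
    """Score of a trichotomy: scale values 200, 100, 0 weighted by proportions."""
    if abs(p_pos + p_mid + p_neg - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    return 200 * p_pos + 100 * p_mid + 0 * p_neg


def buyer_share(s):
    """Approximate per cent of buyers implied by a buying-intention score (score / 2)."""
    return s / 2


rare = score(0.06, 0.02, 0.92)      # the rare buying item described in the text
attitude = score(0.30, 0.50, 0.20)  # item 1 as described in the text
print(rare, attitude, buyer_share(rare))
```

Dividing the score by two is only an approximation to the proportion of buyers; it works here because the middle group is small.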
In column 3 we present for each item the mean of 6 standard errors computed for the six surveys. Thus we note for item 1 a score of 110, subject to a standard error of 2.32; both of these figures have increased stability because they are means of 6 separate computations. The standard errors are fairly uniform, mostly in the range of 1.5 to 2.5 points. Nevertheless some differences between them are appreciably large compared to their own reliability. To measure the reliability of the standard errors we also computed standard deviations within the sets of 6 values, and we present these in parentheses. The 6 values of standard errors are not independent entirely but largely so, because most sampling units are changed; the value of $1/\sqrt{6} \approx 0.4$ can serve as a rough approximation for the coefficient of variation for the standard errors; thus the standard error of 2.32 is subject roughly to a standard error of $0.4 \times 0.14 = 0.056$.

Column 4 contains the coefficients of variation: the mean standard errors of column 3 divided by the mean scores of column 2. The uniformity we noted before in column 3 disappears here in column 4: the seven (9-15) buying intentions have much larger coefficients of variation than the attitudes (1-8).

For the main source of this difference we may look to the coefficient of variation under unrestricted random sampling. For binomials this would be $\sqrt{(1-P)/(Pn)}$, and this becomes large for small $P$. For trinomials it can be written, with $P_2$, $P_1$, and $P_0$ denoting the proportions having scores of 2, 1, and 0 respectively, as:

\[ \frac{\sqrt{\mathrm{Var}(2p_2 + p_1)}}{2P_2 + P_1} = \sqrt{\frac{4P_2 + P_1 - (2P_2 + P_1)^2}{(2P_2 + P_1)^2\, n}} = \sqrt{\frac{P_2 + P_0 - (P_2 - P_0)^2}{(1 + P_2 - P_0)^2\, n}}. \tag{2} \]

When $P_2 = P_0$ the coefficient of variation becomes $\sqrt{(P_2 + P_0)/n}$. The attitudes (1-8) approximate this condition; furthermore, they tend to have values of $P_1 = P_2 + P_0 = 0.5$ very roughly. Under these conditions the coefficient of variation is $0.7/\sqrt{n}$; and for $n = 1370$ this comes to .019.
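Formula (2) can be checked numerically with a short sketch. The function names are assumptions made for illustration; the inputs are the balanced case described above and the proportions given earlier for a score of 14.

```python
import math


def cv_trinomial(p2, p1, p0, n):
    """Coefficient of variation of a trichotomy score (values 2, 1, 0)
    under unrestricted random sampling, following formula (2)."""
    mean = 2 * p2 + p1                      # equals 1 + p2 - p0
    var_element = 4 * p2 + p1 - mean ** 2   # equals p2 + p0 - (p2 - p0)**2
    return math.sqrt(var_element / n) / mean


def cv_binomial(p, n):
    """Coefficient of variation of a proportion p: sqrt((1 - p) / (p * n))."""
    return math.sqrt((1 - p) / (p * n))


n = 1370
balanced = cv_trinomial(0.25, 0.50, 0.25, n)  # P2 = P0, P1 = 0.5
rare = cv_trinomial(0.06, 0.02, 0.92, n)      # proportions behind a score of 14
print(round(balanced, 3), round(rare, 3))
```

The ratio `rare / balanced` is the quantity that separates the rare buying items from the attitudes in column 4 of Table 1.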
However, for rare items the situation is different: $P_2$ is small, $P_1$ is even smaller, the coefficient of variation approaches the binomial model $\sqrt{P_0/((1-P_0)n)}$. For example, for the score of 14 with $P_2 = 0.06$, $P_1 = 0.02$, $P_0 = 0.92$, we get $\sqrt{(.98 - .86^2)/((1.06 - .92)^2 n)} = 3.50/\sqrt{n}$. (Note that for $P_0 = 0.92$ the binomial standard error comes to $3.39/\sqrt{n}$; and for $P_0 = 0.93$ (for a score of 14/2) it comes to $3.64/\sqrt{n}$.) For $n = 1370$ the value of $3.50/\sqrt{n}$ is .095; this is 5 times as high as the typical value of .019 for attitudes. These facts explain the high coefficients of variation of the five intention items 9, 12-15; items 10 and 11 fare slightly better.

Thus the 7 intentions tend to resemble asymmetrical dichotomies, whereas the 8 attitudes are fairly symmetrical trichotomies. The coefficients of variation permit ready comparisons of the errors of the diverse items; they also point to the wide difference of behavior between the 8 unimodal attitudes and the 7 bimodal intentions that we shall note later in the errors of relatives and indexes.2

2 I am not surprised at the poor statistical performance of these dichotomies for rare items. Personally I have long been opposed (often in speech, occasionally in writing) to the preponderance of dichotomies in social research, and in favor of using more scales with 3, 5 and more points. I welcome the investigation of 10 point intention scales by the Census Bureau [Juster, 1966].

Column 5 contains the standard errors of changes of scores $(r_2 - r_1)$ between two periods. Each entry is the mean of three computations obtained from the pairs of surveys for each of three years. These standard errors are somewhat higher than for the corresponding means (column 3), but not $\sqrt{2}$ times as high, which they would be for independent surveys. We computed the ratios of standard errors of $(r_2 - r_1)$ to $\sqrt{\mathrm{var}(r_2) + \mathrm{var}(r_1)}$; the 15 computed ratios lay between 0.77 and 0.99, and average 0.916 rather than 1; that is, $se(r_2 - r_1)/se(r)$ averages not 1.414 but $0.916 \times 1.414 = 1.295$. We may consider roughly that the average ratio $0.916^2 = 0.84$ expresses the effect of an average correlation of $\rho = 0.16$. This correlation comes from using the same counties for the pairs of surveys, not often the same blocks, and never the same segments. Hence the correlations are only little higher than for pairs of surveys from different years. Larger correlations could result from panel studies using the same segments and dwellings; this would be true for most attitudes, though not all buying intentions [Kish 1965, 12.5].

Economists may utilize the magnitudes of changes over a variety of time spans: last survey, last year, etc. The values of $se(r_2 - r_1)$ in column 5 can be accepted as reasonably good measures of the standard error between any 2 surveys not far apart. (They would naturally tend to increase slowly and slightly to 1.414 times $se(r)$ in column 3.) This measure of the sampling variability of each item should be compared to some measure of the magnitude of the item's fluctuation between periodic surveys. A good treatment, involving multiple purposes and models, is beyond the scope of this (already complex) paper. But interesting contrasts emerge from the most basic approach: from the scores of the 17 surveys available in the years 1955-62 we computed standard deviations between the 17 survey scores (column 6).3 The attitudes (1-8) all had fluctuations much greater than the standard error of differences; though variate 2 fluctuates less than the others, and variate 7 much more. In contrast, the seven buying expectations (9-15) had fluctuations only about as great as their standard errors of differences; hence mild fluctuations could not individually contribute much new information. However, to conclude that these 7 items are useless would be too hasty; even these 7 scores can detect larger fluctuations, which perhaps are more important though rarer than smaller fluctuations.

3 The range of fluctuation of all 15 items was, with some variation, about 3.5 as great as the standard deviations presented here. These results for the range/s.d. are about what one could expect for samples of 17 from normal distributions.

Because the sample was clustered (and multistage), the standard errors actually computed in conformity with the sample design were greater than would appear from formulas proper only for unrestricted random sampling. The ratios of the actual standard errors to standard errors for the unrestricted random model we denote as $\sqrt{\mathrm{Deff}}$, referring to the "design effect"; its values were computed and their means assessed for each of the 15 scores as 1.21, 1.16, 1.42, 1.24, 1.33, 1.07, 1.46, 1.28, 1.08, 0.99, 1.32, 1.14, 0.97, 0.97, 0.96. The average among the 15 values is 1.17; this represents an increase of $1.17^2 = 1.37$ in the variance. Its chief probable sources are the clusters of about three interviews in somewhat homogeneous segments; clusters of about 20 in less homogeneous counties; and perhaps the clusters of about 10 interviews per interviewer [Kish, 1962].

We had also computed average values of $\sqrt{\mathrm{Deff}}$ for the score changes, by taking the ratio of the actual standard errors of $(r_2 - r_1)$ to the standard errors of two independent unrestricted random samples of the same numbers of interviews; for the 15 items they were 1.07, 1.11, 1.28, 0.96, 1.08, 0.99, 1.36, 1.15, 1.04, 0.90, 1.25, 1.10, 0.88, 0.95, 0.95. Effects of clustering remain, but are not as great as for the single ratio means: averaging 1.07 for the standard error, and $1.07^2 = 1.15$ for the variance.
The reduction of the effect (1.15/1.37 = 0.84) is due to the positive correlation between pairs of sample means, which reduces the standard errors of differences; but an effect, though reduced, does not disappear. Our confidence in the magnitudes and relationships of these design effects is enhanced by similarities to hundreds of computations on similar surveys [Kish, 1965, 14.1 and 8.2].

2. STANDARD ERRORS FOR RELATIVES AND THEIR CHANGES

The relative for any current year is the ratio of the score for that year to the score for the base year, $R_1 = r_1/r_0$:

\[ R_1 = r_1/r_0 = (y_1/x_1)/(y_0/x_0) = (y_1 x_0)/(y_0 x_1). \tag{3} \]

The results concerning relatives in this section were computed for four surveys, 2 in 1959 and 2 in 1960; 2 surveys in 1956 serve combined as the base period. The situation is similar for the sums of relatives, the indexes, presented in the next section.

Standard errors for relatives are computed with formula (9"), for each of the four surveys; the means of the four standard errors appear in column 1 of Table 2. The chief lesson again concerns the striking contrast between the 8 attitudinal items (1-8) on the one hand, and the 7 rare items (9-15) on the other. The standard errors for the former range from 1.6 to 3.1; for the latter they range from 6.0 to over 13.
The standard errors of the relatives resemble closely the coefficients of variation for the means (column 4 of Table 1), for reasons discussed later.

TABLE 2. STANDARD ERRORS OF RELATIVES (R1 = r1/r0) AND CHANGES OF RELATIVES (R2 - R1)

Item   se(R)   se(R2-R1)   Fluctuations     se(R2-R1) /               C_R /
                           in 17 Surveys    sqrt[Var(R2)+Var(R1)]     sqrt(C_r^2 + C_r0^2)
       (1)     (2)         (3)              (4)                       (5)
  1     2.30     2.82         5.40          .87                       .88
  2     1.65     1.78         2.97          .76                       .96
  3     1.59     1.86         7.89          .82                       .94
  4     1.79     2.01         7.94          .79                       .88
  5     2.42     2.23         6.29          .65                       .95
  6     2.89     3.62         7.45          .89                       .96
  7     3.11     3.66        24.64          .84                       .97
  8     1.90     2.21        17.26          .81                       .90

  9    11.93    14.24        12.69          .83                       .95
 10     6.02     6.83         8.52          .80                       .90
 11     7.30     8.21         9.93          .79                       .95
 12    13.37    14.71        11.18          .78                       .98
 13    10.00    11.29         8.57          .79                       .98
 14    13.14    14.95        14.17          .80                      1.00
 15    11.44    12.40        11.10          .77                       .98

Mean                                        .80                       .944

In column 2 of Table 2 we present standard errors for the changes $(R_2 - R_1)$ of two relatives. Each entry is the mean of six standard errors computed (with formula 11) for the differences between the six possible pairs of the four periods investigated. Note that these entries are only little greater than the corresponding entries in column 1 for the standard errors of separate relatives; the entries of column 2 divided by column 1 average 1.13, rather than $\sqrt{2}$. The variances of the differences are reduced by strong correlations between the pairs of relatives $R_2$ and $R_1$; these effects may be seen in the entries of column 4. Their average is 0.80, which denotes a variance ratio of 0.64, the effect of a correlation of 0.36 (stronger than 0.16 in Table 1); $\sqrt{1 + 1 - 2(0.36)} = \sqrt{1.28} = 1.13$ on the standard error of the change.
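The link between the variance ratio and the implied correlation can be sketched as follows. This is a minimal illustration that assumes the two estimates have equal variances, which the text treats as roughly true; the function names are hypothetical.

```python
import math


def se_of_change(se1, se2, rho):
    """Standard error of a difference of two estimates with correlation rho."""
    return math.sqrt(se1 ** 2 + se2 ** 2 - 2 * rho * se1 * se2)


def implied_correlation(variance_ratio):
    """Correlation implied by Var(diff) / (Var1 + Var2), assuming equal variances."""
    return 1 - variance_ratio


# Relatives: average ratio 0.80 on the standard-error scale (column 4 of Table 2)
rho_relatives = implied_correlation(0.80 ** 2)
# Scores: average ratio 0.916 reported in section 1
rho_scores = implied_correlation(0.916 ** 2)
print(round(rho_relatives, 2), round(rho_scores, 2))
# Standard error of a change when each estimate has standard error 1
print(round(se_of_change(1.0, 1.0, 0.36), 2))
```

With unequal variances the simple relation `1 - ratio` no longer holds exactly, so the implied correlations should be read as rough summaries.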
We also obtained most reassuring confirmations for our conjecture that the variance of the change in the relatives $(R_2 - R_1)$ may usually be computed more simply in terms of the variance of change in the scores $(r_2 - r_1)$, taking advantage of the approximation that $r_0^{-2}\,\mathrm{var}(r_2 - r_1) \approx \mathrm{var}[(r_2 - r_1)/r_0] = \mathrm{var}(R_2 - R_1)$; see formula (11'). We computed the ratios $se(R_2 - R_1)/[r_0^{-1}\, se(r_2 - r_1)]$, and averaged these over the six possible pairs for each of the 15 items; all 15 averages were found within .014 of 1.00; there was little variation, and the mean of the 15 was 1.00. Thus we can safely use this approximation when $(r_2 - r_1)/r_0$ is moderately small. This holds usually, especially when the standard error of the change is most needed. Then we may compute simply the standard error of the change in means, instead of the more difficult standard error of the change in relatives.

Searching also for a simpler approximation for the variance of $R$, we wondered how important the correlation $\rho_{r r_0}$ was in $C_R^2 \approx C_r^2 + C_{r_0}^2 - 2\rho_{r r_0} C_r C_{r_0}$. We found that the ratios $[C_R^2/(C_r^2 + C_{r_0}^2)]^{1/2}$ varied only between 0.88 and 1.00, averaging 0.944 (column 5). It seems that using the more convenient values of $(C_r^2 + C_{r_0}^2)^{1/2}$ results in overestimating $C_R$, the coefficient of variation of the relative, on the average by about 5.5 per cent. (This effect of a correlation of 0.12 agrees with the slightly greater correlation of 0.16 found between successive surveys.) This is an empirical relationship which depends on the stability of small correlations $\rho_{r r_0}$ between the base year and a series of periodic surveys, and for various items.
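Both approximations can be written out as a small sketch. The function names, the `scale` parameter (relatives expressed around 100), and the numerical inputs are assumptions chosen for illustration; they are not taken from the paper's computations.

```python
import math


def cv_relative(c_r, c_r0, rho):
    """Approximate coefficient of variation of R = r / r0:
    C_R^2 = C_r^2 + C_r0^2 - 2 * rho * C_r * C_r0."""
    return math.sqrt(c_r ** 2 + c_r0 ** 2 - 2 * rho * c_r * c_r0)


def se_change_of_relatives(se_change_scores, r0, scale=100.0):
    """Approximate se(R2 - R1) by se(r2 - r1) / r0, treating r0 as a constant.
    scale=100 expresses relatives around 100 (an assumption of this sketch)."""
    return scale * se_change_scores / r0


# Hypothetical inputs: C_r = 0.02, and a base built from two samples,
# so that C_r0^2 is about half of C_r^2
c_r = 0.02
c_r0 = c_r * math.sqrt(0.5)
exact = cv_relative(c_r, c_r0, rho=0.12)
simple = cv_relative(c_r, c_r0, rho=0.0)  # ignores the correlation with the base
print(round(exact / simple, 3))
```

Under these assumed inputs the ratio `exact / simple` lands close to the empirical average reported in column 5, which is the sense in which a correlation near 0.12 accounts for that figure.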
In addition to providing convenient approximations for the standard errors of $R = r/r_0$ and $(R_2 - R_1)$, the above relationships also have relevance for the design of the base $r_0$. It is useful to know that ordinarily neither the variance of $r_0$ nor the correlation $\rho_{r r_0}$ has important effects on the variance of $(R_2 - R_1)$, although the correlation $\rho_{r_2 r_1}$ does. On the other hand, decreasing the variance of $r_0$ and increasing $\rho_{r r_0}$ would reduce the variance of $R$. Here I may add two practical and personal considerations regarding relatives in periodic surveys. First, the change $(R_2 - R_1)$ is probably of greater interest than the relative itself. Secondly, although $\rho_{r_2 r_1}$ may be increased with judicious overlaps between successive surveys, it may be difficult to do this for $\rho_{r r_0}$ when the base period is far removed from the current year.*

* The large average correlation of 0.36 found in column 4 is narrowly empirical; we must be interested in its source. The base $r_0$ is the mean of two samples; and $C_{r_0}^2$ is about $0.5\, C_r^2$. (An increase to 0.58 by correlation of 0.16 between two samples happens to be counteracted, because the $r_0$ in $\mathrm{var}(r_0)/r_0^2$ are a little higher than the average $r$ in the other 4 surveys.) Thus approximately $C_r^2 + C_{r_0}^2 = 1.5\, C_r^2 = 3\, C_{r_0}^2$; and we found above $C_R^2 \approx 0.89\,(C_r^2 + C_{r_0}^2)$, where $0.89 = 0.944^2$ from column 5. But we also found that $\mathrm{Var}(R_2 - R_1) \approx r_0^{-2}\,\mathrm{Var}(r_2 - r_1) \approx r_0^{-2}\, 2\,\mathrm{Var}(r)\,(0.84) \approx (0.84)\, 2\, C_r^2$ on the average, because $\rho_{r_2 r_1} = 0.16$ on the average, and inasmuch as $R = r/r_0 \approx 1$ on the average. Hence, $\mathrm{Var}(R) \approx C_R^2 \approx 0.89\,(1.5\, C_r^2) = 1.33\, C_r^2$ on the average; then $\mathrm{Var}(R_2 - R_1)/[\mathrm{Var}(R_2) + \mathrm{Var}(R_1)] \approx 0.84/1.33 = 0.63 \approx 0.80^2$. This explains the sources of the average value of 0.80 found in column 4. The variance of the base $r_0$ contributes to the denominator but not to the numerator.

3. STANDARD ERRORS FOR INDEXES AND THEIR CHANGES

Table 3 presents results regarding five indexes in separate columns, each index being the mean of several relatives; they take the form

\[ I = \frac{1}{J} \sum_{j}^{J} R_j \]

and fluctuate around 100 as the relatives do. The first index is the mean of the relatives for the 6 attitudes (1-6); the second index contains only items 9 plus 10 (home and car buying); the third is the mean of these 8 items. The fourth is the mean of the 4 intentions to buy appliances (12-15); the fifth adds items 9 and 10 to them; these two indexes have not been actually used.

TABLE 3. FACTORS RELATING TO THE INDEXES (I = sum_j R_j / J) AND INDEX CHANGES: I2 - I1 = sum_j R_2j / J - sum_j R_1j / J

Items in Index                                    1-6      9, 10    1-6, 9, 10   12-15    9, 10, 12-15
1. Fluctuations (s.d.) of I in 17 Surveys         4.16     8.33     4.63         7.62     5.010
2. Standard Errors of the I                       1.165    7.171    2.110        7.714    5.505
3. Effects of Correlations on Var (I)             1.754    1.129    1.372        1.626    1.796
4. Mean Correlations among the R_j                0.161    0.174    0.119        0.213    0.152
5. Standard Errors of Change (I2 - I1)            1.295    8.199    2.381        8.922    6.217
6. Var (I2 - I1) / [Var (I2) + Var (I1)]          0.626    0.643    0.628        0.661    0.635
7. Ratio of Var to Simple Approximation           0.992    1.035    1.029        1.007    0.996
8. Effects of Correlations on Var (I2 - I1)       1.657    1.080    1.269        1.747    1.816

The standard errors are on line 2, each the mean for the four periods, computed with formula (13). The results here conform to the results presented in Table 2 for the separate relatives: the index for items 1-6 has low standard error; that for items 9 plus 10 is much higher, so much higher that adding them to the first 6 items increases substantially their standard error. Indexes based on the highly variable items 12-15 would also have high standard errors.

On line 3 we investigate the increase of variance due to the correlations between the relatives composing an index, computed as

\[ \mathrm{Var}(I) \Big/ \Big[ J^{-2} \sum_{j} \mathrm{Var}(R_j) \Big]. \]
In the first column we have $\mathrm{Var}\big(\tfrac{1}{6}\sum_j R_j\big)$ for items 1-6 divided by $\tfrac{1}{36}\sum_j \mathrm{Var}(R_j)$, and find this ratio to be 1.754 (actually the mean of such ratios computed for the four periods). In the absence of correlation between the $R$'s, this ratio would be unity; the index would have a variance 1/6 as large as a single relative on the average. Actually since the ratio is 1.754, the variance of the index is smaller in the proportion $1.754/6 \approx 1/3.42$; the correlations reduce the information in the 6 items from 6 to the equivalent of 3.42 items.

These considerations led us to compute an approximate and synthetic correlation coefficient $\bar{\rho}$ as:

\[ \frac{\mathrm{Var}\big(\sum_j R_j\big)}{\sum_j \mathrm{Var}(R_j)} = 1 + \frac{2\sum_{i<j} \rho_{ij} \sqrt{\mathrm{Var}(R_i)\,\mathrm{Var}(R_j)}}{\sum_j \mathrm{Var}(R_j)} \approx 1 + (J - 1)\,\bar{\rho}. \]

The results, the means of four computations, appear in row 4. The averaging in the last term is justified only to the degree that the variances are about equal. This condition obtains fairly well for the first index; also for the fourth and fifth indexes; it clearly fails for the third index, which mixes the low variance items 1-6 with the high variance items 9 and 10. The increase of the variance due to positive correlations can be put in the form $1 + \rho(J - 1)$; and the mean of $J$ correlated items will have a variance $1/J + \rho(J - 1)/J$. It is encouraging to note that the average correlation between the 6 attitudes (1-6), and also between their changes, is only about 0.16. This should overcome any suspicion that the several questions merely yield the individual's same general feeling of optimism or pessimism. It is also interesting that the buying behaviors (12-15) are positively correlated with an average of 0.213; this may be due to general financial ability, or to buying situations, influenced by positions in the life cycles of individual consumers.
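The form $1 + \rho(J - 1)$ and the notion of equivalent independent items can be illustrated with a brief sketch. The function names are hypothetical, and the model assumes equal variances and a common correlation, which the text notes holds only approximately.

```python
def variance_inflation(rho, J):
    """Increase in the variance of a sum of J equally correlated items: 1 + rho * (J - 1)."""
    return 1 + rho * (J - 1)


def effective_items(inflation, J):
    """Number of independent items carrying the same information as J correlated ones."""
    return J / inflation


def synthetic_correlation(inflation, J):
    """Average correlation implied by an observed inflation ratio."""
    return (inflation - 1) / (J - 1)


print(round(effective_items(1.754, 6), 2))     # items 1-6, ratio from line 3 of Table 3
print(round(variance_inflation(0.213, 4), 3))  # items 12-15, correlation from line 4
```

Because the tabled ratios and correlations are each means over four periods, a correlation backed out of a mean ratio with `synthetic_correlation` need not reproduce the tabled mean correlation exactly.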
These positive tendencies appear to overcome negative tendencies we may suppose result from items competing within budgets. These points could be better studied by noting the correlations for individual consumers.

Rows 5-8 of Table 3 contain results about changes in the index $(I_2 - I_1)$ between two periods. Each of the entries again represents the mean of six computations for the six possible differences between four periods. The outstanding fact to note in this table is how well preserved are the several useful relationships we had noted before, either for the single index or for the changes $(R_2 - R_1)$ of single relatives. Note on line 5 that the standard errors of differences are only slightly greater than the standard errors of individual indexes on line 2. This relationship is similar to that found between the standard errors for the relative $R$ and for the changes $(R_2 - R_1)$. Again this phenomenon is due to the high correlation between $I_2$ and $I_1$. Its effect is measured by the ratio of $\mathrm{Var}(I_2 - I_1)$ to the sum of the variances of $I_2$ and $I_1$; note on line 6 that these come close to $0.64 = 0.80^2$, the value we found as the average in column 4 of Table 2.

We find again, as we found for the difference $(R_2 - R_1)$ of relatives, that the variance of $(I_2 - I_1)$ can be approximated very well with the variance of the mean of score changes over the base $r_{0j}$; the entries on line 7 give the ratios of the variance of $(I_2 - I_1)$ to the approximate variance of $\sum_j (r_{2j} - r_{1j})/(r_{0j} J)$, where the values of $r_{0j}$ are treated as constants. These values of $r_{0j}$ are known in advance; used as weights in formula (14") this approximation leads to easier computations for the complicated variances of $(I_2 - I_1)$, which are probably more needed than those for the indexes $I$. Furthermore, the easy approximations for the variance of $I_2 - I_1$ can also lead to easy approximations for the variance of the indexes $I$, to the degree that the factor 0.64 of line 6 can be considered stable.
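The weighted form of the index change can be sketched as below. The scores are hypothetical, the function names are illustrative, and the factor `scale=100` (relatives expressed around 100) is an assumption. Algebraically the two routes to $I_2 - I_1$ agree exactly; the approximation discussed in the text concerns only the variance, where the base scores are treated as fixed weights.

```python
def index(scores, base_scores, scale=100.0):
    """Index I = (1/J) * sum_j R_j, with relatives R_j = scale * r_j / r0_j."""
    J = len(scores)
    return sum(scale * r / r0 for r, r0 in zip(scores, base_scores)) / J


def index_change(scores_2, scores_1, base_scores, scale=100.0):
    """I2 - I1 written as a mean of score changes weighted by 1 / r0_j."""
    J = len(base_scores)
    return sum(scale * (r2 - r1) / r0
               for r2, r1, r0 in zip(scores_2, scores_1, base_scores)) / J


# Hypothetical scores for three items: base period and two later periods
r0 = [110.0, 128.0, 156.0]
r1 = [105.0, 126.0, 150.0]
r2 = [112.0, 130.0, 149.0]
direct = index(r2, r0) - index(r1, r0)
weighted = index_change(r2, r1, r0)
print(round(direct, 3), round(weighted, 3))
```

Writing the change this way shows why only the variances and covariances of the score changes are needed once the weights $1/r_{0j}$ are taken as known.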
Finally, on line 8 we investigate the increase of the variance due to correlations among the $J$ different changes of $(R_{2j} - R_{1j})$, the sum of which is the index change $(I_2 - I_1)$, computed as $\mathrm{Var}(I_2 - I_1)/J^{-2}\sum_j \mathrm{Var}(R_{2j} - R_{1j})$. The increases seen here for the variances of index changes are similar to the increases noted on line 3 for the separate indexes.

4. SOME CONCLUSIONS, LIMITATIONS, AND SPECULATIONS

This was not a theoretical investigation with an empirical illustration; rather it focused on a connected set of practical problems of fair size and generality with whatever tools we could find, invent, and improvise. It explored some analytical uses of complex samples, and devised methods for dealing with them. It expands methods of survey sampling beyond the estimation of means and totals, with which its literature has been preoccupied. By concentrating on the sampling errors of economic indicators and indexes derived from periodic sample surveys, we could explore at reasonable depth yet with brevity their nature and sources. The results have direct implications and provide a base for further investigations.

Most important are the results showing the good performances of the 8 attitudinal variables. They were obtained with precisions adequate for such small samples (1370 interviews), and probably for their present purposes. Their performances are particularly good for measuring changes. Furthermore, the cumulative information they provide when summed into an index is particularly gratifying; the correlation among the 6 items in the index is low enough to equal 3.4 independent variables. This raises questions about the possible incorporation of more variables.

Much of the value of the 8 attitudinal variables is due to apparent success of their simple scales to yield reasonably low coefficients of variation. They are essentially trichotomies with the middle groups somewhat larger than the sides; they appear to be considerably better than dichotomies, so common in research. But a scale with 5 or more points may be found with investigation to be better still.

The 7 buying intentions are shown to possess large coefficients of variation, because these approximate the form $\sqrt{(1-p)/pn}$ for dichotomies with low $p$ values. They are useful only for detecting large changes. How could their sensitivity be improved? A drastic increase in sample size may be feasible for us only with different field methods, perhaps with auxiliary mail or telephone responses to reduce expenses. It is intriguing to speculate about possibilities of increased sensitivity by obtaining scaled responses, which have improved correlations with underlying distributions of actual buying probabilities. I believe that response research could uncover practical means for getting more information, especially to uncover distinct buying classes within the large group of declared nonintenders.4

4 The U. S. Census Bureau has conducted from 1960 to 1967 much larger quarterly surveys using the expectations questions 9 to 15 [1967]. Investigations of 10 point scales for intentions are now being put into effect by Juster [1966]. I believe that empirical panel investigations can also disclose different variables that would contribute significantly to the predictive value of intention variables and of indexes. Some of these may be items of past consumer behavior, others of socio-economic and life-cycle status. The variables used for predicting voting behavior provide illustrations of similar methods [Campbell et al., 1960, Stokes 1966].

Concentrating on sampling errors, we cannot deal with the related basic issues of the validity and the predictive value of the economic indicators. First come the choice of variables and methods for obtaining answers from respondents, discussed briefly above.

Second, the construction of the relatives and indexes may be scrutinized. Could the indexes be improved by optimal weights, utilizing cumulated information about the predictive value as well as the sampling precision of the several variables? Furthermore, the model for constructing the index can be the subject of further inquiry. I believe that composite scores for individuals, derived from their several answers, could have analytic use.

Third, the nonsampling errors of response and nonresponse can be investigated. It has been shown that interviewer variance for attitudinal variables can be held to reasonable limits [Kish, 1962]; the low "design effects" in this study tend to support those results. Moreover, measurement biases that may seriously affect the absolute validity of the variables may still have negligible effect on relatives and changes, to the extent that those biases are held constant over surveys with good process control. This is probably the primary reason for the use of relatives and indexes; they have meaning and comparability beyond the scores on which they are based.

Fourth, should greater use be made of reinterviews (panels of households), and of alternatives to household interviews? The answers depend not only on sampling variability, but also on field problems and procedures. Reinterviews involve problems of response and nonresponse, due to the multiple uses of the same survey respondents. It seems likely that the standard errors of changes could be reduced drastically with judicious overlaps of the compared successive samples. On the other hand, overlaps with the base year appear neither feasible nor needed.

5.
VARIANCES FOR COMBINATIONS OF RATIO MEANS

The ratio mean used in these surveys and in many others has the form (1) explained in section 1; its variance is estimated by

$$\mathrm{var}(r) = \frac{1}{x^2}\left[\mathrm{var}(y) + r^2\,\mathrm{var}(x) - 2r\,\mathrm{cov}(y, x)\right] = \frac{1}{x^2}\sum_h dz_h^2 = \sum_h \left[\frac{dz_h}{x}\right]^2, \tag{4}$$

where

$$dz_h^2 = \frac{a_h}{a_h - 1}\sum_{\alpha}^{a_h}\left[(y_{h\alpha} - y_h/a_h) - r(x_{h\alpha} - x_h/a_h)\right]^2. \tag{5}$$

Similar to the variance, the covariance of two scores $r_j$ and $r_k$, based on samples from the same primary selections, is:

$$\mathrm{cov}(r_j, r_k) = \sum_h \left(\frac{dz_{jh}}{x_j}\right)\left(\frac{dz_{kh}}{x_k}\right) = (x_j x_k)^{-1}\sum_h dz_{jh}\,dz_{kh}. \tag{4'}$$

Derivations and justifications of these simplified single-stage computations for multi-stage samples may be found in several places [Hansen, Hurwitz, and Madow, 1953; Kish and Hess, 1959; Kish, 1965]; finite population corrections for sampling without replacement are trivial and neglected. Often, as in our sample, two primary selections (a and b) are drawn from each stratum; with $a_h = 2$ the basic computing unit takes the form

$$dz_h = \left[(y_{ha} - y_{hb}) - r(x_{ha} - x_{hb})\right]. \tag{5'}$$

Two replications per stratum has long been a basic design in the practice of survey sampling, because it permits utmost stratification for randomized models of simple variance estimates. It has been exploited by Deming [1960], Kish and Hess [1959], Kish [1965]; its simplicity was emphasized and generalized by Keyfitz [1957], including (6) below. However, contrary to some current opinion, paired selections are neither necessary nor sufficient for simple variance computations. First, the computing advantage of $a_h = 2$ over $a_h > 2$ consists of the type of advantage (5') has over (5); furthermore for $a_h > 2$, it is possible to form $a_h(a_h - 1)/2$ paired contrasts. Second, whether $a_h = 2$ or not, simple computing forms must be based on quadratic forms consisting of simple additive terms of replication aggregates. Their neglect may explain partly, I believe, the lack of variance computations for linear combinations, indexes, etc. in survey literature. The computing forms below have the desired simplicity, and are mutually adaptable for linear combinations, relatives, indexes, and their comparisons; either for the desk computers we used in 1961 or the electronic computers we would use in 1968.

For example, either (5) or (5') permits readily the computation of variances for linear combinations $\sum_j W_j r_j$ of several scores, where the $W_j$ are constants:

$$\mathrm{var}\Big(\sum_j W_j r_j\Big) = \sum_h \sum_j \Big(\frac{W_j\,dz_{jh}}{x_j}\Big)^2 + 2\sum_h \sum_{j<k} \Big(\frac{W_j\,dz_{jh}}{x_j}\Big)\Big(\frac{W_k\,dz_{kh}}{x_k}\Big) = \sum_h \Big[\sum_j \frac{W_j\,dz_{jh}}{x_j}\Big]^2. \tag{6}$$

The final form incorporates all variances and covariances of combined variables within strata. An example of (6) is the change of score $(r_2 - r_1)$ from period 1 to period 2; here $J = 2$ with $W_2 = 1$ and $W_1 = -1$:

$$\mathrm{var}(r_2 - r_1) = \sum_h \Big(\frac{dz_{2h}}{x_2} - \frac{dz_{1h}}{x_1}\Big)^2 = x_2^{-2}\sum_h dz_{2h}^2 + x_1^{-2}\sum_h dz_{1h}^2 - 2(x_2 x_1)^{-1}\sum_h dz_{2h}\,dz_{1h}. \tag{7}$$

Another example would be the simple summed score:

$$J\bar{r} = \sum_j r_j = \sum_j y_j/x, \quad \text{when all } W_j = 1, \text{ all } x_j = x; \text{ and}$$

$$\mathrm{var}\Big(\sum_j y_j/x\Big) = x^{-2}\sum_h \Big(\sum_j dz_{jh}\Big)^2 = x^{-2}\sum_h (ds_h - J\bar{r}\,dx_h)^2, \quad \text{where } s_{h\alpha} = \sum_j y_{jh\alpha}. \tag{8}$$

The relative for any current year is the ratio of the score for that year to the score for the base year, $R_1 = r_1/r_0$:

$$R_1 = r_1/r_0 = (y_1/x_1)/(y_0/x_0) = (y_1 x_0)/(y_0 x_1). \tag{3}$$

The bases, $r_0$ and $y_0$, are generally set safely far from zero, sometimes by setting $r_0 = 100$ arbitrarily. The values of $x$ in the denominators of all scores must be such that, compared to sampling variability, we can disregard the possibilities of values near zero. This requirement is met readily in our sample, where the $x$ values denote sample sizes under good control [Kish, 1965, 7.1]. We treat $R_1 = r_1/r_0$ simply as the ratio of two random variables, the variance of which may be estimated with the same formula (4) that we use for $r = y/x$; then we have

$$\mathrm{var}(R_1) = \frac{1}{r_0^2}\left[\mathrm{var}(r_1) + R_1^2\,\mathrm{var}(r_0) - 2R_1\,\mathrm{cov}(r_1, r_0)\right]. \tag{9}$$

This brings our only serious new theoretical problems. One approach is through methods called "propagation of errors" or "delta method." Cramer [1946] and C. R. Rao [1952] give classical presentations; the $\sigma_{ij}/\sqrt{n}$ terms of the covariance matrix need to be generalized to accommodate the covariance terms of complex sampling. The treatment of Kendall and Stuart [1958] is more general. Note the contributions of Hajek [1958], and Brillinger and Tukey [1964], or descriptions in Deming [1960, 390-396], and Kish [1965, 12.11 and 14.3].

We must and can be satisfied with a large sample approximation; most surveys are based on large enough samples. The derivations used to obtain (9) need few additional restrictions. Mean square errors can be used instead of variances for $r_1$ and $r_0$, and small biases will not destroy these relationships. We investigated and found these biases to be trivial or absent. The bias of $R_1$ was found to be real but negligible; most important, the coefficient of variation of the denominator, $C_{r_0}$, is small enough (Section 6).

When $y_1 \neq 0$ can be safely assumed, the variance of the "double ratio" can be expressed in the symmetrical form:

$$\mathrm{var}(R_1) = R_1^2 \sum_h \left[\Big(\frac{dy_{1h}}{y_1} - \frac{dx_{1h}}{x_1}\Big) - \Big(\frac{dy_{0h}}{y_0} - \frac{dx_{0h}}{x_0}\Big)\right]^2. \tag{9'}$$

This form, without stratification, is attributed by Yates [1953 and 1960] to Keyfitz, and derived in detail by Rao [1957], who has been reinvestigating it. It was mentioned briefly by Hansen, Hurwitz, and Madow [1953, I, 514], with a simple extension to $\prod u_j/\prod v_j$, but the validity of the approximation for products of several variables would require appropriate checks.
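The paired-selection computations of (4), (4'), (5') and the two forms (9) and (9') of the variance of a relative can be sketched in a few lines. This is a minimal illustration, not the program used for the paper: it assumes exactly two primary selections per stratum, stored as arrays of shape (H, 2), and all function names and the small data set are made up for the example.

```python
import numpy as np

def paired_units(y, x):
    """Computing units dz_h of (5') for paired selections.

    y, x: arrays of shape (H, 2); column 0 is selection a, column 1 is b.
    """
    r = y.sum() / x.sum()
    return (y[:, 0] - y[:, 1]) - r * (x[:, 0] - x[:, 1])

def var_ratio(y, x):
    """var(r) by (4): sum over strata of (dz_h / x)^2."""
    return float(np.sum((paired_units(y, x) / x.sum()) ** 2))

def cov_ratio(y1, x1, y0, x0):
    """cov(r_1, r_0) by (4'); both samples must share the same primary selections."""
    cross = np.sum(paired_units(y1, x1) * paired_units(y0, x0))
    return float(cross / (x1.sum() * x0.sum()))

def var_relative(y1, x1, y0, x0):
    """var(R_1) by (9), with R_1 = r_1 / r_0."""
    r1, r0 = y1.sum() / x1.sum(), y0.sum() / x0.sum()
    R = r1 / r0
    bracket = (var_ratio(y1, x1) + R ** 2 * var_ratio(y0, x0)
               - 2 * R * cov_ratio(y1, x1, y0, x0))
    return bracket / r0 ** 2

def var_relative_symmetric(y1, x1, y0, x0):
    """var(R_1) by the symmetrical form (9'); divides by y_1, so y_1 must not be 0."""
    R = (y1.sum() / x1.sum()) / (y0.sum() / x0.sum())
    diff = lambda a: a[:, 0] - a[:, 1]  # dy_h or dx_h for paired selections
    cur = diff(y1) / y1.sum() - diff(x1) / x1.sum()
    base = diff(y0) / y0.sum() - diff(x0) / x0.sum()
    return float(R ** 2 * np.sum((cur - base) ** 2))

# Made-up data: 2 strata, 2 primary selections each.
y = np.array([[3.0, 1.0], [2.0, 2.0]])
x = np.array([[2.0, 2.0], [2.0, 2.0]])
print(var_ratio(y, x))  # 0.0625, since r = 1, dz = (2, 0) and x = 8
```

Forms (9) and (9') agree algebraically, so `var_relative` and `var_relative_symmetric` should return the same value up to rounding on any data with nonzero totals.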
The following form of the variance is preferable for computing convenience, and because it avoids division by $y_1$, which may be small and unstable. Its first expression links (9') to (9):

$$\mathrm{var}(R_1) = r_0^{-2}\sum_h \Big(\frac{dz_{1h}}{x_1} - R_1\frac{dz_{0h}}{x_0}\Big)^2 = \sum_h (e_{1h} - R_1 e_{0h})^2, \tag{9''}$$

where

$$e_{1h} = \frac{dy_{1h} - r_1\,dx_{1h}}{r_0 x_1} = \frac{dz_{1h}}{r_0 x_1}, \quad \text{and} \quad e_{0h} = \frac{dy_{0h} - r_0\,dx_{0h}}{y_0} = \frac{dz_{0h}}{y_0}. \tag{10}$$

The computing unit $(dz_h/x)$ is common to preceding formulas (5-8), and the simple derived unit $e_h$ is basic to (9-14) for indexes. The difference of two relatives measures its change between two periods and its variance appears term by term as:

$$\mathrm{var}(R_2 - R_1) = \sum_h \left[(e_{2h} - R_2 e_{0h}) - (e_{1h} - R_1 e_{0h})\right]^2 = \sum_h \left[(e_{2h} - e_{1h}) - (R_2 - R_1)e_{0h}\right]^2 \tag{11}$$

$$= \sum_h (e_{2h} - e_{1h})^2 + (R_2 - R_1)^2\sum_h e_{0h}^2 - 2(R_2 - R_1)\sum_h e_{0h}(e_{2h} - e_{1h})$$

$$= r_0^{-2}\left\{\mathrm{var}(r_2 - r_1) + (R_2 - R_1)^2\,\mathrm{var}(r_0) - 2(R_2 - R_1)\left[\mathrm{cov}(r_0, r_2) - \mathrm{cov}(r_0, r_1)\right]\right\}. \tag{11'}$$

The last expression can also be obtained immediately by considering $(R_2 - R_1) = (r_2 - r_1)/r_0$ as the ratio of two random variables, and applying (4) directly to that ratio. The expression (11') leads to a useful approximation: the first term, which can be computed with simpler methods (7) and with data only from the current surveys, proves to be an excellent approximation for the entire expression. Thus the easily computed variance for a change in scores can be used to estimate the more difficult variance of a change in relatives.

The variance for $(R_2 - R_1)$ is but a specific example ($J = 2$, $W_2 = 1$, $W_1 = -1$) of variances for linear combinations $\sum_j W_j R_j$ of relatives:

$$\mathrm{var}\Big(\sum_j W_j R_j\Big) = \sum_h \Big[\sum_j W_j (e_{jh} - R_j e_{0jh})\Big]^2. \tag{12}$$

This result follows readily, as (6) did, from the simple additive quadratic forms of the computing units $(e_{jh} - R_j e_{0jh})$. Squaring the brackets yields all variance and covariance terms within the strata. Our present interest is in the index $I = \sum_j R_j/J$, whose variance is (12) with all $W_j = 1$.
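The units $e_h$ of (10) and the general form (12) lend themselves to a short sketch. The code below is an illustration under stated assumptions, not the paper's program: it assumes paired selections stored as arrays of shape (H, 2), every function name is hypothetical, and the randomly generated "surveys" are independent of one another, so the printed comparison of (11) with its first term says nothing about how good that approximation is on real, correlated survey data.

```python
import numpy as np

def paired_units(y, x):
    """dz_h of (5') for two primary selections per stratum, arrays of shape (H, 2)."""
    r = y.sum() / x.sum()
    return (y[:, 0] - y[:, 1]) - r * (x[:, 0] - x[:, 1])

def relative_units(y_cur, x_cur, y_base, x_base):
    """Return R and the stratum units e_h, e_0h defined in (10)."""
    r0 = y_base.sum() / x_base.sum()
    R = (y_cur.sum() / x_cur.sum()) / r0
    e_cur = paired_units(y_cur, x_cur) / (r0 * x_cur.sum())
    e_base = paired_units(y_base, x_base) / y_base.sum()
    return R, e_cur, e_base

def var_combination(items, weights):
    """var(sum_j W_j R_j) by (12).

    items: list of (y_cur, x_cur, y_base, x_base) tuples, one per relative.
    weights: the constants W_j (for example 1 and -1 for a change R_2 - R_1).
    """
    per_stratum = 0.0
    for (y_cur, x_cur, y_base, x_base), w in zip(items, weights):
        R, e_cur, e_base = relative_units(y_cur, x_cur, y_base, x_base)
        per_stratum = per_stratum + w * (e_cur - R * e_base)
    return float(np.sum(per_stratum ** 2))

# Made-up data: 45 strata with paired selections, base period 0 and periods 1, 2.
rng = np.random.default_rng(0)

def fake_survey():
    return rng.uniform(5, 10, (45, 2)), rng.uniform(10, 12, (45, 2))

(y0, x0), (y1, x1), (y2, x2) = fake_survey(), fake_survey(), fake_survey()

# Full variance of the change R_2 - R_1, formula (11), as a case of (12).
full = var_combination([(y2, x2, y0, x0), (y1, x1, y0, x0)], [1, -1])

# First term of (11') only: it needs no sampling information from the base period.
_, e2, _ = relative_units(y2, x2, y0, x0)
_, e1, _ = relative_units(y1, x1, y0, x0)
first_term = float(np.sum((e2 - e1) ** 2))
print(full, first_term)
```

Passing weights of $1/J$ for $J$ relatives gives the variance of their average, while weights of 1 give the variance of their sum.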
This is the easier computing form, but we wanted also to separate the sum of variances from the covariances to estimate the average correlation between relatives (3'); we used:

$$\mathrm{var}\Big(\sum_j R_j\Big) = \sum_j \mathrm{var}(R_j) + 2\sum_{j<k}\mathrm{cov}(R_j, R_k) = \sum_h \sum_j (e_{jh} - R_j e_{0jh})^2 + 2\sum_h \sum_{j<k} (e_{jh} - R_j e_{0jh})(e_{kh} - R_k e_{0kh}). \tag{13}$$

Finally, we want the variance of the difference of two indexes $[I_2 - I_1]$; this is (12) with all $W_{2j} = 1$ and all $W_{1j} = -1$:

$$\mathrm{var}\Big(\sum_j R_{2j} - \sum_j R_{1j}\Big) = \sum_h \Big[\sum_j \big\{(e_{2jh} - R_{2j}e_{0jh}) - (e_{1jh} - R_{1j}e_{0jh})\big\}\Big]^2 = \sum_h \Big[\sum_j \big\{(e_{2jh} - e_{1jh}) - (R_{2j} - R_{1j})e_{0jh}\big\}\Big]^2 \tag{14}$$

$$= \sum_h \Big[\sum_j (e_{2jh} - e_{1jh})\Big]^2 + \sum_h \Big[\sum_j (R_{2j} - R_{1j})e_{0jh}\Big]^2 - 2\sum_h \Big[\sum_j (e_{2jh} - e_{1jh})\Big]\Big[\sum_j (R_{2j} - R_{1j})e_{0jh}\Big]. \tag{14'}$$

The three terms correspond to the analogous terms in equations (11) for the variance of the difference of relatives, but comprise the summations for all ($j = 1, 2, \ldots, J$) relatives. The first term consists of the variances and covariances of the score differences $(r_{2j} - r_{1j})$ for the different relatives ($j = 1, 2, \ldots, J$), and including the factors $r_{0j}^{-2}$ without sampling variation, i.e., as a constant for each $j$:

$$\sum_h \Big[\sum_j (e_{2jh} - e_{1jh})\Big]^2 = \sum_j r_{0j}^{-2}\,\mathrm{var}(r_{2j} - r_{1j}) + 2\sum_{j<k} r_{0j}^{-1} r_{0k}^{-1}\,\mathrm{cov}\left[(r_{2j} - r_{1j}), (r_{2k} - r_{1k})\right]. \tag{14''}$$

We find (line 8, Table 3) that the covariances among the change of relatives are important and that they are in proportion to the covariances of the relatives, the second term of (13). But we also find (line 7, Table 3) that this entire first term (14'') is an excellent approximation for all the three terms of (14') for the variance of index changes, just as the first term of (11') was for the variance of changes of individual relatives. Thus the second and third terms of (14') can often be neglected when the factors $(r_{2j} - r_{1j})/r_{0j}$ are uniformly small enough.

6. TECHNICAL NOTES

When using standard errors for the ratio mean $r = y/x$, one may be interested in its bias.
These were computed and all found to be negligible, much less than 1 per cent of the values of the standard errors, and appeared, when tested, to vary haphazardly around zero. This was expected in light of previous research, and in light of the computed coefficient of variation of about .04 for the sample size $x$ [Kish, Namboodiri, and Pillai, 1962, and Kish, 1965, 7.1]. We expect the relative $R = r/r_0$ to have a ratio bias, because the bias is known to have this ratio to the standard error: $\mathrm{Bias}(R)/\mathrm{S.E.}(R) = -\rho_{R r_0} C_{r_0}$. Since we expect the correlation $\rho_{R r_0}$ to be negative, we should get a positive bias. We can also expect its ratio to the standard error to be less than $C_{r_0}$, the coefficient of variation of the base score $r_0$.

Bias effects were computed, and presented [Kish, 1962b]; we shall merely summarize them.
1) The correlation $\rho_{R r_0}$ was large; it averaged -0.52, ranging from -0.37 to -0.61 for the 15 variables. Because of this the existence of a positive bias could be measured.
2) Nevertheless the bias was negligible; the bias ratios ranged from 0.49 to 1.06 of 1 per cent for the 8 attitude items, and 1.92 to 5.00 of 1 per cent for the 7 expectation items.
3) The bias ratios were kept low by low values of $C_{r_0}$. These ranged from 1.04 to 1.88 of 1 per cent for the 8 attitude items, and from 4.12 to 8.32 of 1 per cent for the 7 expectation items. The bias ratio of $r/r_0$ can be kept low by keeping $C_{r_0}$ low.
4) The two biases tend to cancel from the difference $(R_2 - R_1)$ of two relatives.
5) When summed in indexes the bias ratios tend to grow, because the biases sum up, whereas the standard errors do not increase as fast. But the worst case we could find is for the six-item index in column 5 of Table 3; the ratio of relative bias to the coefficient of variation is 0.438/5.505 = .080; this would increase the mean square error by only the factor $1 + .080^2 = 1.0064$.
6) The predictability of relations permits adjustment when it becomes desirable.

In each situation to estimate an average standard error we computed the mean of the standard errors rather than the square root of the mean of the variances. Thus we accept a small bias in order to avoid a larger random error in the estimated average.

In column 3 of Table 1 we not only gave the means of the six replicate computations of standard errors, but also in parentheses the standard deviations between the six values for each. We took advantage of the information from the repeated computations performed on what are essentially six replications to obtain estimates of the coefficient of variation for the standard error. (That the replications are not independent had no important effect on these computations.) The computation of the standard errors should be subject theoretically and approximately to a coefficient of variation of about $1/\sqrt{2 \times 45} = .106$ because the 45 pairs of comparisons result in about 45 degrees of freedom. We should actually expect somewhat more, because of probable skewness, of $\beta_2 > 3$, and because the 90 primary selections vary somewhat in size. The 15 computed values vary from .06 to .20 and they average .128. We should expect this to be an overestimate because of variations between surveys in measurements and in sample design. A similar coefficient of variation applied to the values of $\sqrt{\mathrm{Deff}}$ averages .120, also varying from .06 to .20; the difference is fortuitous, though in the expected direction; we expect $\sqrt{\mathrm{Deff}}$ to vary somewhat less than the sampling error because here the variations in sample size and in the level of the variates between surveys are largely eliminated. But we can use .12 as a working estimate of the coefficient of variation of the standard errors.

We also took advantage of the four replicated computations of the standard error of each relative to estimate their variability. The coefficients of variation of the standard errors averaged .118, resembling closely the results for the scores.

REFERENCES

[1] Brillinger, D. R. and Tukey, J. W. [1964], Asymptotic Variances, Moments, Cumulants, and Other Average Values (Princeton: Memorandum).
[2] Campbell, A., Converse, P. E., Miller, W. E., and Stokes, D. E. [1960], The American Voter, New York: John Wiley and Sons, pp. 72-4.
[3] Cramer, H. [1946], Mathematical Methods of Statistics, Princeton: Princeton University Press, 28.4.
[4] Deming, W. E. [1960], Sample Design in Business Research, New York: John Wiley and Sons.
[5] Hajek, J. [1958], "On the theory of ratio estimates," Bulletin of the International Statistical Institute, 219-26.
[6] Hansen, M. H., Hurwitz, W. N. and Madow, W. G. [1953], Sample Survey Methods and Theory, New York: Wiley and Sons, Volumes I and II.
[7] Juster, F. T. [1966], "Consumer Buying Intentions and Purchase Probability," Journal of the American Statistical Association, 61, 658-98.
[8] Katona, G. and Mueller, E. [1957], Consumer Expectations, Ann Arbor: Institute for Social Research.
[9] Katona, G. [1964], The Mass Consumption Society, New York: McGraw-Hill.
[10] Katona, G., Mueller, E., Schmiedeskamp, J., Sonquist, J. A. [1966], 1966 Survey of Consumer Finances, Ann Arbor: Institute for Social Research.
[11] Katona, G. [1967], "Anticipation statistics and consumer behavior," American Statistician, 21, 12-13.
[12] Kendall, M. G. and Stuart, A. [1958], The Advanced Theory of Statistics, London: Griffin and Company, Volume I, 10.6.
[13] Keyfitz, N. [1957], "Estimates of Sampling Variance Where Two Units Are Selected from Each Stratum," Journal of the American Statistical Association, 52, 503-10.
[14] Kish, L. and Hess, I. [1959], "On Variances of Ratios and Their Differences in Multi-Stage Samples," Journal of the American Statistical Association, 54, 416-46.
[15] Kish, L. [1962], "Studies of Interviewer Variance for Attitudinal Variables," Journal of the American Statistical Association, 57, 92-115.
[16] Kish, L. [1962b], "Variances for indexes from complex samples," Proceedings of the Social Statistics Section, American Statistical Association, 190-99.
[17] Kish, L., Namboodiri, N. K. and Pillai, R. K. [1962], "The Ratio Bias in Surveys," Journal of the American Statistical Association, 57, 863-76.
[18] Kish, L. [1965], Survey Sampling, New York: John Wiley and Sons.
[19] Maynes, E. S. [1968], Consumer Attitudes and Buying Intentions, Manuscript to be published by the Institute for Social Research, Ann Arbor, Michigan.
[20] Mueller, E. [1963], "Ten years of consumer attitude surveys: Their forecasting record," Journal of the American Statistical Association, 58, 899-917.
[21] Rao, C. R. [1952], Advanced Statistical Methods in Biometric Research, New York: Wiley and Sons, 58.1.
[22] Rao, J. N. K. [1957], "Double ratio estimates in forest surveys," Journal of Indian Society of Agricultural Statistics, 9, 191-204.
[23] Stokes, D. E. [1966], "Some dynamic elements of contests for the presidency," The American Political Science Review, 60, 19-28.
[24] U. S. Bureau of the Census [1967], Consumer Buying Indicators, Current Population Reports, Series P-65.
[25] Yates, F. [1960], Sampling Methods for Censuses and Surveys, Third Edition, London: Charles Griffin and Company, Ltd., pp. 343.