Empirical results from the 1980 Census sample estimation study

Jay Kim, John H. Thompson, Henry F. Woltman (U.S. Bureau of the Census) and Stephen M. Vajs (U.S. Coast Guard)
I. INTRODUCTION

An empirical investigation of a number of proposed estimation methods was conducted to determine an estimation procedure for weighting the 1980 census sample. The study design is briefly outlined as follows (for more detail see (1)). A study universe was created from the 1970 census sample records for three pseudo-states. The universe was then divided into weighting areas. In each weighting area, all possible samples were obtained according to the anticipated 1980 census sampling scheme. For each study population sample, the records were weighted using each of the proposed estimation methods. For each method, the actual standard error, bias, and root mean square error (SMSE) were calculated for a variety of data items based on all possible samples within each weighting area. These statistics formed the basis for the comparison of the proposed estimation methods. In this paper, the results of the comparisons of the estimation methods are reported for population data.

II. BACKGROUND

2.1 Proposed Estimation Methods

The proposed estimation procedures can be classified into three basic types: (1) raking ratio, (2) post-stratified or cell-by-cell, and (3) the inflated sample mean or "single cell" estimators. The raking ratio procedures for population estimation are based on post-stratifying the sample of persons into an array defined by variables collected on a 100-percent basis. These arrays are called "weighting arrays," and four of them were tested using the six collapsing criteria given in Appendix 2. One weighting array (array 1P) is given in Appendix 3; the other arrays, which are modified versions of this one, are described in (1). The cell-by-cell procedures are based on a stratification of the rows or columns alone of a weighting array. 1/
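The three estimator types can be illustrated with a minimal sketch. It assumes a small two-way weighting array with hypothetical sample counts and hypothetical 100-percent control totals, and it assumes every cell contains at least one sample person, so the collapsing of sparse rows or columns is not modeled. The function names (`rake`, `cell_by_cell_weights`, `single_cell_weight`) are illustrative only and are not taken from the study.

```python
def rake(sample_counts, row_totals, col_totals, max_iter=100, tol=1e-9):
    """Raking ratio estimation: alternately scale the rows and columns of the
    weighted sample counts until both sets of control totals are met."""
    n_rows, n_cols = len(sample_counts), len(sample_counts[0])
    est = [[float(c) for c in row] for row in sample_counts]
    for _ in range(max_iter):
        for i in range(n_rows):                      # row step
            factor = row_totals[i] / sum(est[i])
            est[i] = [v * factor for v in est[i]]
        for j in range(n_cols):                      # column step
            factor = col_totals[j] / sum(est[i][j] for i in range(n_rows))
            for i in range(n_rows):
                est[i][j] *= factor
        # Columns match after the column step; stop once rows also still match.
        if max(abs(sum(est[i]) - row_totals[i]) for i in range(n_rows)) < tol:
            break
    # One weight for every sample person in cell (i, j).
    return [[est[i][j] / sample_counts[i][j] for j in range(n_cols)]
            for i in range(n_rows)]


def cell_by_cell_weights(sample_counts, row_totals):
    """Post-stratified estimator on the rows alone: one weight per row."""
    return [row_totals[i] / sum(sample_counts[i])
            for i in range(len(sample_counts))]


def single_cell_weight(sample_counts, universe_total):
    """Inflated sample mean: a single weight for the whole weighting area."""
    return universe_total / sum(sum(row) for row in sample_counts)


# Hypothetical 2 x 2 weighting array (not data from the study).
counts = [[20, 30], [25, 25]]
row_controls = [300, 300]      # hypothetical 100-percent row totals
col_controls = [280, 320]      # hypothetical 100-percent column totals

weights = rake(counts, row_controls, col_controls)
row_weights = cell_by_cell_weights(counts, row_controls)
sc_weight = single_cell_weight(counts, sum(row_controls))
```

In this sketch the raked weights reproduce both margins, the cell-by-cell weights reproduce only the row totals, and the single cell weight reproduces only the overall total.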
2.2 Population Characteristics Considered

For each of the three pseudo-states, the estimation procedures were compared for 58 population characteristics: 7 poverty items, 12 income items, 8 labor force items and 31 other items, including items on education, industry, occupation, school enrollment and work status, and women aged 35-44 ever married.

2.3 Study Population

As mentioned in (1), the study population consists of the 1970 census sample for three pseudo-states. The three pseudo-states are state 75 (Texas counties alphabetically from Erath through Loving); state 97 (California counties alphabetically from Madera through San Diego); and state 98 (California counties alphabetically from San Francisco through Yuba). It should be noted that samples created from state 75 had an induced undersampling problem: some sample records were intentionally dropped from the sample to simulate an undersampling situation. The undersampling rate was subsequently doubled to test the effect of severe undersampling, and the state was reprocessed. The materials for this second processing are referred to as state 76.
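The evaluation over all possible samples, and the effect of induced undersampling, can be sketched as follows. This is a toy illustration under stated assumptions, not the study's procedure: the sampling scheme is taken to be a 1-in-k systematic sample, the universe is a hypothetical list of 0/1 indicators, the estimator is the inflated sample mean, and the rule used to drop records (`drop_first_with_characteristic`) is invented for the example.

```python
import math


def all_systematic_samples(universe, k):
    """All k possible 1-in-k systematic samples, one per starting position."""
    return [universe[start::k] for start in range(k)]


def inflated_mean_total(sample, universe_size):
    """Inflated sample mean ("single cell") estimate of a universe total."""
    return universe_size * sum(sample) / len(sample)


def evaluate(estimates, true_total):
    """Bias, standard error and root mean square error over all samples."""
    mean_est = sum(estimates) / len(estimates)
    bias = mean_est - true_total
    variance = sum((e - mean_est) ** 2 for e in estimates) / len(estimates)
    mse = sum((e - true_total) ** 2 for e in estimates) / len(estimates)
    return bias, math.sqrt(variance), math.sqrt(mse)


def drop_first_with_characteristic(sample):
    """Hypothetical undersampling rule: lose one sampled record that has
    the characteristic (value 1), if there is one."""
    kept = list(sample)
    if 1 in kept:
        kept.remove(1)
    return kept


# Hypothetical universe: 1 = person has the characteristic, 0 = does not.
universe = [1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1]
true_total = sum(universe)

samples = all_systematic_samples(universe, k=3)
estimates = [inflated_mean_total(s, len(universe)) for s in samples]
bias, se, smse = evaluate(estimates, true_total)

# Same samples after the induced undersampling.
estimates_u = [inflated_mean_total(drop_first_with_characteristic(s), len(universe))
               for s in samples]
bias_u, se_u, smse_u = evaluate(estimates_u, true_total)
```

With full samples the estimator is unbiased over the three possible samples, so the root mean square error equals the standard error. After records with the characteristic are dropped, the estimates fall short of the true total on average and the root mean square error exceeds the standard error.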
III. METHODS OF ANALYSIS

3.1 Thompson-Willke Test

The Thompson-Willke test (4) is a non-parametric multiple comparisons test. It is based on examining rank sums and determining which are larger or smaller than expected. The rank sums were produced for 174 items of interest, 58 from each pseudo-state. To calculate the rank sums, the observations (e.g., SMSE, bias or standard error), one for each estimation method, were ranked from lowest to highest in each weighting area. A rank sum was obtained by summing the ranks for a method over all weighting areas in a state. If the rank sum of a method for a given item is significantly high or low at a given significance level, compared with the rank sum expected under the null hypothesis, then the method is flagged. Flagging is done with the symbols + and - for significantly high and low values, respectively. That is, flagging implies rejecting the null hypothesis based on the two-sided test. The flagging constitutes the first step of the analysis; the examination of the number of extreme ranks observed for each method constitutes the second step.

The Thompson-Willke test was performed at α = .2 for the 58 items in each of the three states. The following observations can be made concerning the test:

(1) The value α = .2, or Pr(Type I error), is the maximum probability that one or more of the I estimation methods could have been erroneously flagged as having an extreme rank in any one Thompson-Willke test. However, the maximum probability that a specific estimation method could have been erroneously assigned a + or - as a result of this test is less than .01 (see Appendix 1).

(2) By assuming independence among the 174 items (i.e., 174 independent tests) and taking the probability P of assigning a + or - to be .01, the binomial distribution may be used to examine the number of +'s and -'s assigned to a given method. The critical region is 5 or more +'s (or -'s) using α' = .05.

(3) The 58 data items within each state were composed of four basic types of items: poverty status, income, labor force and others. The number of items in each group was 7, 12, 8 and 31, respectively. The above procedure was also applied within each item group to test the number of +'s or -'s that an estimation method was assigned via the Thompson-Willke procedure. The cutoff points with α' = .05 for the comparison over three states were 2, 3, 2 and 4 for poverty status, income, labor force and others, respectively.

It should be noted that state 76 is included only in the Thompson-Willke test applied to the bias.

3.2 Quantitative Measurement of Differences

The Thompson-Willke test helps isolate estimation methods that are significantly better or worse than the others, but it says nothing about how the remaining methods compare or about the magnitude of the differences. The quantitative measurement of differences is designed to fill this gap and give some indications about
the merits of one estimation technique compared with the others. This methodology is based on an index which is essentially the average relative efficiency. Because of the volume of data available, the six collapsing criteria were ignored for the quantitative analysis and a single value was calculated to represent each array. Hence nine estimators are compared: the four basic raking methods, four cell-by-cell procedures and the single cell method. They are compared on the basis of the median and maximum SMSEs and standard errors calculated for each estimation method over all weighting areas. Two indices, one for the median and one for the maximum, were calculated for each weighting method. The index is the average SMSE or standard error efficiency of weighting array 1P relative to the others. Array 1P was chosen as the base value of the index because there were preliminary indications that this estimation method is slightly superior to the others. Finally, a composite index was calculated by taking the average of the indices for all categories (Table 8 in Appendix 5). Note that state 76 is not included in the quantitative analysis.

IV. RESULTS

4.1 Results from Thompson-Willke Test

4.1.1 Based on SMSE

i) General Findings

As seen in Table 2 of Appendix 4, the cell-by-cell procedures and the single cell procedure, with the exception of the age-sex-race-origin control ('col') method, perform most poorly: they have significantly high rank sums for many items, while extremely low rank sums are observed for only a small number of items. Among the raking procedures, there is only a slight difference. The effect of the head/non-head control on estimation can be measured by comparing test results for arrays 1P and 2P, since their only difference is whether or not the head/non-head control is present in the array. The comparison indicates that the head/non-head control reduces the SMSE somewhat.
Note that single cell (SC) is treated as a cell-by-cell procedure for convenience of presentation.

ii) Findings by Item Groups

Only the cell-by-cell procedures other than 'col' are generally identified, and more for their high ranks than for their low ranks, in all four item groups. There is some indication that array 1P is good for estimating income items. Conversely, array 3AP appears not to do as well for these items.

4.1.2 Based on Standard Error

i) General Findings

The test results based on standard errors (Table 3 of Appendix 4) are very similar to those for SMSE. In general, however, the test results are slightly more favorable to array 1P (and less favorable to the cell-by-cell techniques) when comparisons are made on the standard error. The upper and lower critical regions for the Thompson-Willke test are of the same size; hence the distribution of the frequencies of +'s and -'s for each estimation method is expected to be symmetric. However, those for array 1P, for the cell-by-cell procedure based on age-sex-race-origin, and for the other cell-by-cell procedures are asymmetric; i.e., those for the first are heavy on the minus (-) side and those for the latter are heavy on the plus (+) side.

ii) Findings by Item Groups

Only the cell-by-cell procedures other than 'col' have, with minor exceptions, noticeably more +'s than -'s in all four item groups. Array 1P shows its superiority over the others for estimating income items and a slight edge over the others for estimating poverty items. Again, array 1P, with the head/non-head control, performs better than array 2P, without it, in the income item group. Array 3AP, despite its built-in economic control, trails the other raking procedures in estimating income items.

4.1.3 Based on Absolute Bias

i) General Findings

As shown in Table 4 of Appendix 4, the test results on absolute bias show trends opposite to those based on SMSE and standard error. Now only the cell-by-cell procedures have at least 5 -'s. However, all cell-by-cell procedures except 'col' have enough +'s to suggest some less desirable aspects. Concerning the weaker side of those procedures, readers are directed to Table 5 of Appendix 4, which covers only state 76, where respondents were undersampled. Note that the numbers of +'s for this state almost match those for the three states combined. In state 76, undersampling biases were introduced for the categories incorporated as controls in 'col' and the raking procedures. This suggests that cell-by-cell techniques can result in more highly biased estimates in the presence of undersampling errors, and that controlling for undersampling can be beneficial.

For the total +'s and -'s over the 3 states, there is an interesting pattern in the number of high ranks for arrays 1P and 2P: as the collapsing criterion number increases, the number of +'s decreases. In other words, the more the array is collapsed, the less frequently extremely high rank sums occur.
ii) Findings by Item Groups (3 States)

Arrays 1P and 2P produce the most highly biased estimates, and this trend holds over almost all four item groups. Except in the labor force item group, 'col' often has low mean ranks. The single cell method and the rows of array 5P have more extremely low ranks than would occur by mere chance in all four item groups, and their number of -'s overwhelms that of +'s.

iii) Findings by Item Group in State 76

Only the cell-by-cell procedures have any significant number of +'s and -'s. Among them, only 'col' is tagged "-", in two item groups: poverty and other. The other cell-by-cell techniques have mostly only +'s.

In short, when undersampling problems were not encountered, the cell-by-cell techniques, especially SC, C5R and 'col', produce the least biased estimates, while arrays 1P and 2P produce the most biased. Also, collapsing criteria 1 and 2 of arrays 1P, 2P and 3AP are the least preferred in the sense that they have the most +'s among all six collapsing criteria considered. However, in the presence of undersampling bias, the cell-by-cell techniques except 'col' are no longer preferred, as they are worse than the raking procedures.
Furthermore, collapsing criteria 1 and 2 are no longer worse than the other criteria.

4.2 Results for Quantitative Measurement of Differences

4.2.1 Based on the Standard Error

As shown in Tables 6 and 7 of Appendix 5, when a decision is made on the basis of the composite index, the cell-by-cell procedures except 'col' are inferior to the others. This inferiority is more apparent when the comparison is on the basis of the maximum. These high composite indices appear to result mostly from the high indices obtained in the labor force item category. The weighting arrays and the cell-by-cell procedure 'col' have similar composite indices. The index for array 1P is slightly lower than those for the others. It should be noted that only for the family income item category does array 1P outperform the other methods on both statistics considered (i.e., median and maximum).

4.2.2 Based on the SMSE

Comparison of Table 6 with Table 8 of Appendix 5 indicates that the indices for medians based on the SMSE are almost identical to those based on the standard error. Comparing Tables 7 and 9, however, brings to light the differences between the indices for the maximum based on the SMSE and those based on the standard error for the single cell and cell-by-cell procedures except 'col'. For those procedures, the former are much higher than the latter, mainly because of the high maximum biases for state 75. It is interesting to note that those cell-by-cell procedures underperform the others in all item categories and are worst for the labor force item category. The weighting arrays and 'col' have similar composite indices. However, when a decision is made on the basis of the indices for the maxima, array 1P is slightly preferable. Again, only for the family income item category does array 1P do better irrespective of which statistic is used.

V. CONCLUDING REMARKS

The following general conclusions emerge from this study:

(a) On the basis of the standard error and SMSE, the four weighting arrays (i.e., raking procedures) plus the cell-by-cell procedure based on age-sex-race-origin are better than the other cell-by-cell procedures. They are roughly equivalent to one another, with the choice possibly being weighting array 1P because of its performance for income estimates.

(b) On the basis of absolute bias, the cell-by-cell procedures perform better when no undersampling biases are present. This was expected, since theoretically these procedures are essentially unbiased under "perfect" sampling. However, in the presence of some undersampling bias, the methods which incorporated the undersampling categories as controls (i.e., the raking and 'col' procedures) performed better than the cell-by-cell methods which did not. Thus the potential advantage of the cell-by-cell procedures over the raking procedures (unbiasedness) disappears in the sample bias situation. It may be argued that a raking procedure which controls to some extent for sample biases, and also provides adequate estimates of various demographic totals, would be desirable. Based on the analysis of sampling biases in the 1970 census, it has been previously recommended that some control should be instituted for the categories that make up the rows of array 1P (5).
(c) As was observed in sections 4.1.1 and 4.2.2, array 1P produces somewhat better results for family data items than array 2P. It is hypothesized that these differences are due to the head/non-head control that array 1P employs and array 2P does not.

(d) As was seen in section 4.1.3, all the weighting arrays seemed to show a decrease in absolute bias as the collapsing criteria become less strict. This difference was particularly apparent for the collapsing criteria that incorporated a minimum of 5. However, in arriving at any collapsing criterion it is necessary to weigh any increase in bias against a potential decrease in variance and perhaps in the total MSE.

All raking ratio estimation procedures and 'col' are almost evenly matched. However, array 1P has some superiority for family items. Array 1P also incorporates row control categories in which it is anticipated that undersampling biases will be present in the 1980 census sample. Thus it should perform better than the other estimation procedures considered if such biases are in fact present in the 1980 census sample. In short, array 1P was selected for use in 1980 census sample estimation.

FOOTNOTES

1/ The four arrays are denoted by 1P, 2P, 3AP and 5P, and the cell-by-cell methods based on the rows of arrays 2P, 3AP and 5P are abbreviated C2R, C3AR and C5R, respectively. The cell-by-cell method based on the columns of the array is referred to as "column" or "age-sex-race-origin control." The single cell method is denoted by SC.

2/ Method 1 may be chosen without loss of generality for the presentation.

REFERENCES

(1) Thompson, J.H., Miskura, S.M., Woltman, H.F., and Bounpane, P.A., "1980 Census Weighting and Variance Estimation Studies, Design and Methodology," presented at the annual American Statistical Association Meeting, 1981.
(2) Halperin, M., Greenhouse, S.W., Cornfield, J., and Zalokar, J. (1955), "Tables of Percentage Points for Studentized Maximum Absolute Deviate in Normal Samples," Journal of the American Statistical Association 50, pp. 185-195.
(3) Kim, J., "Comparisons of Weighting Methods Based on Thompson-Willke Test Approach for Population Characteristics," Census Bureau Memorandum, 1980.
(4) Thompson, W.A. Jr., and Willke, T.A. (1963), "Extreme Rank Sum Test for Outliers," Biometrika 50, pp. 375-383.
(5) Vajs, S.M., "Sampling Rate Variability and 1980 Sample Weighting Controls," Census Bureau Memorandum, 1979.
(6) Brackstone, G.J., and Rao, J.N.K., "An Investigation of Raking Ratio Estimators," to appear in Sankhya.
Appendix 1. Methodology Used for Testing the Number of Extreme Mean Ranks (+'s or -'s) for a Weighting Procedure

When the Thompson-Willke test is used for comparing I estimation methods for an item at a significance level $\alpha = P(A)$, a question arises as to the probability that the mean rank for a specific estimation method is erroneously flagged (type I error). Here $A_i$ denotes the event that method $i$ is flagged and $A$ the event that at least one method is flagged. Since the mean ranks ($Y_i$) are essentially independent, the $A_i$ are independent, which in turn implies that their complements $\bar{A}_i$ are independent, $i = 1, 2, \ldots, I$.

From the Bonferroni inequalities,

$$ I\,P(A_1) \;\ge\; P(A) \;\ge\; I\,P(A_1) - \binom{I}{2} P(A_1 A_2). $$

Since $A_1$ and $A_2$ are independent and $P(A_1) = P(A_2)$, substituting $\alpha$ for $P(A)$ gives

$$ \frac{\alpha}{I} \;\le\; P(A_1) \;\le\; \frac{I - \sqrt{I^2 - 2\alpha I(I-1)}}{I(I-1)}. $$

For $\alpha = .20$ the following bounds for $P(A_1)$ 2/ were found:
Table 1. Bounds for $P(A_1)$

I     Lower Bound   Upper Bound
30    .0067         .0075
31    .0065         .0072
36    .0056         .0062
      .0053         .0058
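The legible rows of Table 1 can be checked numerically. The sketch below assumes the bounds take the form alpha/I <= P(A_1) <= [I - sqrt(I^2 - 2*alpha*I*(I-1))] / (I*(I-1)), which is what the Bonferroni inequalities give when the flagging events are independent and equally likely. The function name `flag_probability_bounds` is illustrative. The last row of the table, whose value of I did not survive in the source, is left out.

```python
import math


def flag_probability_bounds(n_methods, alpha):
    """Lower and upper bounds on the probability that one specific method
    is flagged, given the overall level alpha for n_methods methods."""
    i = n_methods
    lower = alpha / i
    # Smaller root of (I(I-1)/2) p^2 - I p + alpha = 0.
    upper = (i - math.sqrt(i * i - 2 * alpha * i * (i - 1))) / (i * (i - 1))
    return lower, upper


# Bounds rounded to four decimals, as in Table 1.
bounds = {
    i: tuple(round(b, 4) for b in flag_probability_bounds(i, 0.20))
    for i in (30, 31, 36)
}
```

Under these assumptions every upper bound stays below .01, which is consistent with the round figure of .01 used for the tests on the number of +'s and -'s.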
For convenience, we use the approximation $P(A_1) = .01$, which entails conservative tests on the number of +'s or -'s. Note that $P(A_1)$ = P(+ or - for a given item or test). If independence among n Thompson-Willke tests is assumed, the number of +'s or -'s (denoted by X) over n tests is binomially distributed with parameters n and $P(A_1)$. Hence the rejection region for a test on the number of +'s or -'s can be found by finding the smallest r such that

$$ \sum_{x=r+1}^{n} \binom{n}{x} P(A_1)^{x} \,[1 - P(A_1)]^{\,n-x} \;\le\; \alpha'. $$

Then X > r is the rejection region, where r is a specific number of +'s or -'s for a method over n items.
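The cutoffs quoted in section 3.1 can be reproduced from this binomial argument. The sketch below assumes P(+ or -) = .01 per test and alpha' = .05, as stated in the text, and takes the item-group sizes over three states to be 21, 36, 24 and 93 (three times 7, 12, 8 and 31). The function names are illustrative.

```python
from math import comb


def upper_tail(n, p, r):
    """P(X > r) for X distributed Binomial(n, p)."""
    return 1.0 - sum(comb(n, x) * p ** x * (1 - p) ** (n - x)
                     for x in range(r + 1))


def critical_count(n, p=0.01, alpha=0.05):
    """Smallest count c such that P(X >= c) <= alpha, so that c or more
    +'s (or -'s) over n tests falls in the rejection region."""
    r = 0
    while upper_tail(n, p, r) > alpha:
        r += 1
    return r + 1


overall_cutoff = critical_count(174)                          # all 174 items
group_cutoffs = [critical_count(n) for n in (21, 36, 24, 93)]  # item groups
```

Here `overall_cutoff` corresponds to the "5 or more" critical region, and `group_cutoffs` to the cutoffs of 2, 3, 2 and 4 for poverty status, income, labor force and other items.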
Appendix 2. Collapsing Criteria* for Weighting Arrays Formed From Study Universe

Columns: Collapsing Criterion ID; Minimum Sample Count Per Row or Column; Maximum Ratio of Universe Count to Inflated Sample Count.

[Table rows are only partly legible in the source. The surviving entries are minimum sample counts of 5 and 10, a maximum ratio of 2, and several "not needed" entries; their assignment to criterion IDs cannot be recovered.]

* Initially 8 collapsing criteria were compared. However, state 97 study results for collapsing criteria 3 and 6 were identical to collapsing criteria 1 and 4, respectively. Hence, collapsing criteria 3 and 6 were dropped from further processing and hence from this study.
Appendix 3. Weighting Array 1P

Columns (Race, Origin, Sex and Age): Non Black and Black, each divided into Spanish and Non Spanish, each divided into Male and Female, each divided into the age groups 0-4, 5-13, 14-18, 19-24, 25-34, 35-44, 45-64 and 65+.

Rows (Household Head Type and Size), in the order the labels survive in the source:
  Head of Family with Own Children under 18
    1 Person HH
    2 Person HH
    3 Person HH
    4 Person HH
    5 Person HH
    6+ Person HH
  Head of Family Without Own Children under 18
    1 Person HH
    2 Person HH
    ...
  Other Head
    1 Person HH
    ...
  GQ Persons

H = Head, N = Nonhead
Table 2. Summary Table of Thompson-Willke Tests (Based on SMSE) Over 3 States

[Table body not legible in the source. Rows: arrays 1P, 2P, 3AP and 5P, each by collapsing criterion (1, 2, 4, 5, 7, 8), and the cell-by-cell procedures C2R, C3AR, C5R, SC and Col. Columns: number of +'s and -'s for Poverty (21 items), Income (36 items), Labor Force (24 items), Edu-Occ-Ind etc. (93 items) and Total (174 items).]

-: Significant low rank sum at α = .20
+: Significant high rank sum at α = .20
*: Number of +'s or -'s exceeding that expected by chance at α' = .05 under the null hypothesis of no difference among weighting methods (see Appendix 1).
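The + and - flags summarized in these tables start from rank sums, as described in section 3.1. The first step can be sketched as follows for one item in one state. The input values are hypothetical, tied values are given their average rank (an assumption, since the paper does not say how ties were handled), and the function names are illustrative. The critical values that decide whether a rank sum is flagged come from the Thompson-Willke test (4) and are not reproduced here.

```python
def rank_sums(values_by_area):
    """Sum, over weighting areas, of each method's rank for one item.

    values_by_area holds one dict per weighting area mapping a method name
    to its statistic (e.g. SMSE). Rank 1 is the lowest value; tied values
    share the average of the ranks they occupy.
    """
    methods = sorted(values_by_area[0])
    totals = {m: 0.0 for m in methods}
    for area in values_by_area:
        ordered = sorted(methods, key=lambda m: area[m])
        pos = 0
        while pos < len(ordered):
            end = pos
            # Extend the block while the next method has the same value.
            while (end + 1 < len(ordered)
                   and area[ordered[end + 1]] == area[ordered[pos]]):
                end += 1
            average_rank = ((pos + 1) + (end + 1)) / 2
            for m in ordered[pos:end + 1]:
                totals[m] += average_rank
            pos = end + 1
    return totals


def expected_rank_sum(n_methods, n_areas):
    """Rank sum expected for any method if all methods perform alike."""
    return n_areas * (n_methods + 1) / 2


# Hypothetical SMSE values for three methods in three weighting areas.
areas = [
    {"1P": 2.0, "C2R": 3.5, "SC": 3.0},
    {"1P": 1.8, "C2R": 2.9, "SC": 3.1},
    {"1P": 2.2, "C2R": 2.2, "SC": 4.0},
]
sums = rank_sums(areas)
expected = expected_rank_sum(n_methods=3, n_areas=3)
```

A method whose rank sum lies far above the expected value would be a candidate for a + flag, and one far below it a candidate for a - flag.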
Table 3. Summary Table of Thompson-Willke Tests (Based on Standard Error) Over 3 States

[Table body not legible in the source. Rows: arrays 1P, 2P, 3AP and 5P, each by collapsing criterion (1, 2, 4, 5, 7, 8), and the cell-by-cell procedures C2R, C3AR, C5R, SC and Col. Columns: number of +'s and -'s for Poverty (21 items), Income (36 items), Labor Force (24 items), Edu-Occ-Ind etc. (93 items) and Total (174 items).]

-: Significant low rank sum at α = .20
+: Significant high rank sum at α = .20
*: Number of +'s or -'s exceeding that expected by chance at α' = .05 under the null hypothesis of no difference among weighting methods.
Table 4. Summary Table of Thompson-Willke Tests (Based on Absolute Bias) Over 3 States

[Table body not legible in the source. Rows: arrays 1P, 2P, 3AP and 5P, each by collapsing criterion (1, 2, 4, 5, 7, 8), and the cell-by-cell procedures C2R, C3AR, C5R, SC and Col. Columns: number of +'s and -'s for Poverty (21 items), Income (36 items), Labor Force (24 items), Edu-Occ-Ind etc. (93 items) and Total (174 items).]

-: Significant low rank sum at α = .20
+: Significant high rank sum at α = .20
*: Number of +'s or -'s exceeding that expected by chance at α' = .05 under the null hypothesis of no difference among weighting methods.
Table 5. Summary Table of Thompson-Willke Tests (Based on Absolute Bias) in State 76

[Table body not legible in the source. Rows: arrays 1P, 2P, 3AP and 5P, each by collapsing criterion (1, 2, 4, 5, 7, 8), and the cell-by-cell procedures C2R, C3AR, C5R, SC and Col. Columns: number of +'s and -'s for Poverty (7 items), Income (12 items), Labor Force (8 items), Edu-Occ-Ind etc. (31 items) and Total (58 items).]

-: Significant low rank sum at α = .20
+: Significant high rank sum at α = .20
*: Number of +'s or -'s exceeding that expected by chance at α' = .05 under the null hypothesis of no difference among weighting methods.
Table 6. Indices for Medians Based on Standard Error

          Poverty    Poverty    Family     Labor      Occupation Education  Composite
          (Family)   (Persons)  Income     Force                            Index
Array 1P  100        100        100        100        100        100        100
Array 2P  105.1      100.9      104        100.2      100.6      100.1      102.1
Array 3AP 99.9       104.3      103.8      101.7      102.1      98.4       101.7
Array 5P  100.5      100.3      102.2      100.9      101.1      98.5       100.6
Cell-by-Cell:
C2R       102        102.0      102.4      122.1      100.3      106.5      105.9
C3AR      101.2      114.4      104.3      121.8      103.1      105.3      113.4
C5R       101.4      105.9      100.0      125.6      100.3      106.1      106.6
SC        99.2       112.5      104.9      122.7      100.9      106.9      107.9
Col       100.9      105.3      106        98.3       100.4      99.1       101.7
Table 7. Indices for Maxima Based on Standard Error

          Poverty    Poverty    Family     Labor      Occupation Education  Composite
          (Family)   (Persons)  Income     Force                            Index
Array 1P  100        100        100        100        100        100        100
Array 2P  101.8      99.5       104.7      100.9      99.9       100        101.1
Array 3AP 99.8       99.7       107.1      100.8      101.3      102.9      101.9
Array 5P  99.0       105.2      102.6      100.5      100.6      100        101.3
Cell-by-Cell:
C2R       103.5      103.3      101.7      211.6      99.5       108.4      121.3
C3AR      100.2      106.6      116.6      154.7      101.9      115.9      116.0
C5R       100.9      106.5      103.5      156.6      100.3      107.8      112.6
SC        101.6      108.2      113.1      153.6      102.6      111.5      115.1
Col       99.9       99.9       105.2      101        101.7      101        101.5
Table 8. Indices for Medians Based on SMSE

          Poverty    Poverty    Family     Labor      Occupation Education  Composite
          (Family)   (Persons)  Income     Force                            Index
Array 1P  100        100        100        100        100        100        100
Array 2P  102.5      100.7      105.4      100.7      100.3      99.9       102.5
Array 3AP 99.9       102.9      104.0      101.1      101        98.5       101.7
Array 5P  100.2      98.6       102.5      100.5      100.5      98.2       100.3
Cell-by-Cell:
C2R       101.3      104.6      102.7      121.5      99.0       104.7      105.6
C3AR      100.8      118.1      107.6      119.9      103.0      104.1      108.9
C5R       100.6      106.2      100.8      119.3      99.5       103.9      105.1
SC        100.5      114.8      107.0      122.9      101.4      105.3      108.7
Col       102.7      103.2      105.6      98.2       99.5       98.6       101.5
Table 9. Indices for Maxima Based on SMSE

          Poverty    Poverty    Family     Labor      Occupation Education  Composite
          (Family)   (Persons)  Income     Force                            Index
Array 1P  100        100        100        100        100        100        100
Array 2P  102.3      99.6       106.7      100.8      100        100.2      101.6
Array 3AP 102.7      98.4       107.4      100.2      101.4      102.8      102.2
Array 5P  101.4      105.6      108.6      99.9       100.6      100        102.7
Cell-by-Cell:
C2R       118.8      115.1      182.6      237.5      117.9      123.5      149.2
C3AR      117.6      132.1      186.7      230.4      116.4      133.0      152.7
C5R       104.4      121.8      113.9      221.9      110.9      118.6      131.9
SC        121.9      128.9      180.8      238.7      117.5      127.6      152.6
Col       101.6      98.0       109.3      101.8      101.7      101.3      102.3
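As a reading aid for Tables 6 through 9, the index construction of section 3.2 can be sketched as follows. The paper does not spell out the formula, so `category_index` is an assumption: it averages, over the items of a category, 100 times the ratio of a method's statistic to that of array 1P, so that values above 100 mean the method is less efficient than array 1P. The composite index is taken to be the plain average of the six category indices, which agrees with the rows of Table 7 (maxima, standard error) to within rounding.

```python
def category_index(method_stats, base_stats):
    """Assumed form of the index for one item category: the average of
    100 * (method statistic / array 1P statistic) over the items."""
    ratios = [100.0 * m / b for m, b in zip(method_stats, base_stats)]
    return sum(ratios) / len(ratios)


def composite_index(category_indices):
    """Plain average of the category indices."""
    return sum(category_indices) / len(category_indices)


# Category indices and printed composite index as read from Table 7, in the
# order Poverty (Family), Poverty (Persons), Family Income, Labor Force,
# Occupation, Education.
table7 = {
    "2P":   ([101.8, 99.5, 104.7, 100.9, 99.9, 100.0], 101.1),
    "3AP":  ([99.8, 99.7, 107.1, 100.8, 101.3, 102.9], 101.9),
    "5P":   ([99.0, 105.2, 102.6, 100.5, 100.6, 100.0], 101.3),
    "C2R":  ([103.5, 103.3, 101.7, 211.6, 99.5, 108.4], 121.3),
    "C3AR": ([100.2, 106.6, 116.6, 154.7, 101.9, 115.9], 116.0),
    "C5R":  ([100.9, 106.5, 103.5, 156.6, 100.3, 107.8], 112.6),
    "SC":   ([101.6, 108.2, 113.1, 153.6, 102.6, 111.5], 115.1),
    "Col":  ([99.9, 99.9, 105.2, 101.0, 101.7, 101.0], 101.5),
}
composites = {method: composite_index(cats)
              for method, (cats, _printed) in table7.items()}
```

The recomputed composites make it easy to see that the large composite indices of C2R, C3AR, C5R and SC are driven mainly by the labor force category.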

File: empirical-results-from-the-1980-census-sample-estimation-study.pdf
Title: 1981
Published: Fri Jul 5 14:34:20 2002
Pages: 6
File size: 0.52 Mb

