How accurate is it?
The primary purpose of the estimated model is to reproduce the statistical properties of the original census data. Therefore the statistical model should reproduce the empirical frequencies of different properties as precisely as possible. In order to verify the model accuracy, we have compared the empirical frequencies of different statistically relevant combinations of responses with the estimates derived from the statistical model. We have found that the accuracy of the 10% microdata subset is only marginally better than the statistical model at reproducing the empirical frequencies.Mean relative error according to subpopulation size
Interval | Lower bound | Upper bound | Number of combinations | Model (relative error in %) | Microdata (relative error in %) |
---|---|---|---|---|---|
1 | 1612 | 3000 | 7688027 | 6.10 | 5.16 |
2 | 3000 | 5000 | 5011625 | 4.88 | 3.86 |
3 | 5000 | 7500 | 3220931 | 4.04 | 3.07 |
4 | 7500 | 10000 | 1906156 | 3.50 | 2.58 |
5 | 10000 | 15000 | 2213787 | 3.04 | 2.17 |
6 | 15000 | 30000 | 2695817 | 2.38 | 1.67 |
7 | 30000 | 50000 | 1296118 | 1.80 | 1.23 |
8 | 50000 | 100000 | 1075615 | 1.37 | 0.94 |
9 | 100000 | 150000 | 372570 | 1.03 | 0.70 |
10 | 150000 | 300000 | 358112 | 0.78 | 0.55 |
11 | 300000 | 500000 | 125103 | 0.55 | 0 39 |
12 | 500000 | 1000000 | 71104 | 0.39 | 0.28 |
13 | 1000000 | 1500000 | 15324 | 0.29 | 0.20 |
14 | 1500000 | 3000000 | 8511 | 0.22 | 0.14 |
15 | 3000000 | 5000000 | 1349 | 0.12 | 0.08 |
16 | 5000000 | 10300000 | 200 | 0.02 | 0.04 |
Distribution of relative errors of estimates according to the empirical frequency N(xC) (sub-population size). Comparison of the statistical model and 10%-subset of microdata. In the First two columns we specify the lower and upper bounds of the frequency intervals, respectively. The third column contains the number of properties falling into the given interval of empirical frequencies. The last two columns contain the corresponding mean relative errors for the statistical model and subset of microdata, respectively.
Details are described in the Paper