How accurate is it?

The primary purpose of the estimated model is to reproduce the statistical properties of the original census data. Therefore the statistical model should reproduce the empirical frequencies of different properties as precisely as possible. In order to verify the model accuracy, we have compared the empirical frequencies of different statistically relevant combinations of responses with the estimates derived from the statistical model. We have found that the accuracy of the 10% microdata subset is only marginally better than the statistical model at reproducing the empirical frequencies.

Mean relative error according to subpopulation size

Interval	Lower bound	Upper bound	Number of combinations	Model (relative error in %)	Microdata (relative error in %)
1	1612	3000	7688027	6.10	5.16
2	3000	5000	5011625	4.88	3.86
3	5000	7500	3220931	4.04	3.07
4	7500	10000	1906156	3.50	2.58
5	10000	15000	2213787	3.04	2.17
6	15000	30000	2695817	2.38	1.67
7	30000	50000	1296118	1.80	1.23
8	50000	100000	1075615	1.37	0.94
9	100000	150000	372570	1.03	0.70
10	150000	300000	358112	0.78	0.55
11	300000	500000	125103	0.55	0 39
12	500000	1000000	71104	0.39	0.28
13	1000000	1500000	15324	0.29	0.20
14	1500000	3000000	8511	0.22	0.14
15	3000000	5000000	1349	0.12	0.08
16	5000000	10300000	200	0.02	0.04

Distribution of relative errors of estimates according to the empirical frequency N(xC) (sub-population size). Comparison of the statistical model and 10%-subset of microdata. In the First two columns we specify the lower and upper bounds of the frequency intervals, respectively. The third column contains the number of properties falling into the given interval of empirical frequencies. The last two columns contain the corresponding mean relative errors for the statistical model and subset of microdata, respectively.

Details are described in the Paper

Presentation of Census Results by Interactive Statistical Models

How accurate is it?

Mean relative error according to subpopulation size