Data Quality and Fiscal Allocation in South Africa
Kerr, A., Thornton, A., and Barnard, C. (2026) The Local Government Equitable Share Formula and poverty measurement in South Africa. Policy brief and related research commissioned by the Equality Collective (NGO).
This research is being prepared for the Equality Collective NGO to submit to a review of the white paper on the local government equitable share formulae by National Treasury. We can show that income poverty as measured in Census 2011 is substantially higher and, more importantly, distributed differently across provinces than poverty measured using the same threshold in the 2010/11 IES or the 2014/15 LCS. This is partly because the census uses a less sensitive instrument to capture household income (questions, brackets) than the IES and LCS, but it also reflects measurement error and the choice of poverty threshold.
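To illustrate why a bracketed income question can shift measured poverty, the sketch below compares a headcount ratio computed from exact incomes with one computed after each income is replaced by its bracket midpoint. It is not the authors' method: the bracket edges, the top-bracket value, the poverty line and the incomes are all hypothetical, chosen only so that the line falls inside a bracket.

```python
from bisect import bisect_right

# Hypothetical bracket lower bounds (monthly household income, rand).
# These are NOT the Census 2011 brackets.
BRACKET_EDGES = [0, 400, 800, 1600, 3200]
TOP_BRACKET_VALUE = 4800  # assumed value for the open-ended top bracket


def bracket_midpoint(income):
    """Replace an exact income with the midpoint of the bracket it falls in."""
    if income < 0:
        raise ValueError("income must be non-negative")
    i = bisect_right(BRACKET_EDGES, income) - 1
    if i == len(BRACKET_EDGES) - 1:
        return TOP_BRACKET_VALUE
    return (BRACKET_EDGES[i] + BRACKET_EDGES[i + 1]) / 2


def headcount_ratio(incomes, weights, line):
    """Weighted share of households with income strictly below the line."""
    poor = sum(w for y, w in zip(incomes, weights) if y < line)
    return poor / sum(weights)


# Hypothetical data: seven households with equal weights.
incomes = [150, 450, 700, 900, 1500, 2500, 5000]
weights = [1] * len(incomes)
poverty_line = 1300  # hypothetical threshold inside the 800-1600 bracket

exact = headcount_ratio(incomes, weights, poverty_line)
bracketed = headcount_ratio(
    [bracket_midpoint(y) for y in incomes], weights, poverty_line
)
print(f"exact: {exact:.3f}, bracketed: {bracketed:.3f}")
```

In this toy case the household with an income of 1500 is reclassified as poor because its bracket midpoint (1200) lies below the line. With a different line or bracket layout the bias could run the other way, so the sketch shows only that bracketing can move the estimate, not the direction or size of the effect in the census.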
Employment Measurement in South Africa
Kerr, A., Thornton, A., and Wachira, P. (2025) How many people are employed in South Africa? Forthcoming SA-TIED working paper.
That so many people lack employment is one of South Africa’s biggest problems. The main labour market survey used by economists and policy makers in South Africa is the Quarterly Labour Force Survey (QLFS), conducted by Statistics South Africa. But other surveys also measure employment, including the General Household Survey (GHS), also conducted by Statistics South Africa, and the National Income Dynamics Study (NIDS), conducted by SALDRU with funding from the South African Presidency. In this paper we compare employment estimates across these three surveys. We show that, despite very similar or even identical sample designs in some years and identical questions about employment in many years, the GHS has produced higher estimates of total employment in South Africa than the QLFS since around 2012, and that this difference is even larger post-Covid. We investigate some possible explanations for this result and find that raw sample composition and weights do not contribute substantially to these differences.
Data quality of Census 2022
Budlender, J. and Thornton, A. (2026) Irregular imputation and implausible households in the 2022 South African census. Forthcoming SALDRU working paper.
We uncover and describe major data quality problems in South Africa’s 2022 Census microdata.
Data quality of Census 2022
Thornton, A. (2026) Household information in South African census and survey data: household change, Covid-19 and Census 2022. Forthcoming DataFirst Technical Paper.
Recently, other authors have raised quality concerns regarding Census 2022 with respect to the population count, the age distribution, and unusual imputation. In this presentation, I focus on the quality of the household information. This is a harder task than assessing person demographics because there is no equivalent of the vital statistics or demographic identities that demographers have at their disposal. Instead, I investigate the integrity of relatively stable microdata relationships in the household headship model. This model is of particular relevance because of the direct connection between the headship rate and household counts, which are used to allocate government resources and calibrate household survey data. Headship rates, including those from Census 2022, are also the basis of StatsSA’s forthcoming Mid-Year Household Estimates Series. I show strong agreement between census and survey data pre-2022 regarding the headship model, but divergence between the survey data and Census 2022. Specifically, the Covid-19 lockdowns are clearly detectable in the survey data, but by 2022 headship in the surveys returns to its pre-Covid patterns. By contrast, Census 2022 breaks both with the survey data and with the pre-Covid census data. To what extent is Census 2022 different because of measurement error, and to what extent because of Covid-19? What, if anything, does it mean if the survey and census data differ? What does this mean for using Census 2022 as a benchmark for household counts beyond 2022? I consider these questions and show some suggestive evidence of measurement error in the marital status distribution.
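The connection between headship rates and household counts can be made concrete with a small sketch. Because each household has exactly one head, the number of households equals the number of heads, so summing group-specific headship rates times group populations gives an implied household count. The age groups, populations and head counts below are hypothetical, and the function names are illustrative rather than taken from the paper or from StatsSA's methodology.

```python
def headship_rates(heads_by_group, pop_by_group):
    """Headship rate per group: household heads divided by population."""
    return {g: heads_by_group[g] / pop_by_group[g] for g in pop_by_group}


def implied_households(rates, pop_by_group):
    """Household count implied by applying headship rates to a population."""
    return sum(rates[g] * pop_by_group[g] for g in pop_by_group)


# Hypothetical benchmark year: population and household heads by age group.
base_pop = {"15-34": 1000, "35-59": 800, "60+": 200}
base_heads = {"15-34": 200, "35-59": 400, "60+": 120}

rates = headship_rates(base_heads, base_pop)

# Applying the rates back to the benchmark population recovers the
# number of heads, and therefore the number of households.
base_households = implied_households(rates, base_pop)

# Hypothetical later-year population, holding headship rates fixed.
later_pop = {"15-34": 1100, "35-59": 900, "60+": 250}
later_households = implied_households(rates, later_pop)

print(round(base_households), round(later_households))
```

Holding rates fixed is the simplifying assumption here. If the benchmark rates are themselves mismeasured, the error carries through to every household count derived from them, which is why the stability of the headship model matters for resource allocation and survey calibration.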