Empirical studies in the economics of education, in the measurement of skill gaps across demographic groups, and of the impacts of interventions on skill formation rely on psychometrically validated test scores that record the proportion of items answered correctly. Test scores are sometimes taken as measures of an invariant scale of human capital that can be compared across time and people. We show that for a prototypical test, invariance is violated. We use an unusually rich data set from an early childhood intervention program that measures knowledge of narrowly defined skills on essentially equivalent subsets of tasks. We examine whether conventional, broadly defined measures of skill are the same across people who are comparable on detailed knowledge measures. We reject the hypothesis of aggregate scale invariance and call into question the uncritical use of test scores in research on education and on skill formation. We compare different measures of skill and ability and reject the hypothesis of valid aggregate measures of skill.
