"Improving Historical Census Transcriptions: A Machine Learning Approach"
Episode Synopsis
This paper describes an effort to improve the accuracy of historical U.S. Census transcriptions using a machine learning model. The authors focus on correcting errors in name transcriptions from the 1940 census for Rhode Island, specifically targeting records where the independent human transcriptions from Ancestry.com and FamilySearch.org disagree. The improved transcriptions significantly increased the rate at which individuals could be linked across census records, particularly for records with low original legibility, where human transcribers typically struggle. The approach promises to make historical census data more useful for economic and social research by producing a higher-quality linked dataset spanning multiple periods.