
Department Seminar: Measuring Textual Sophistication in any Domain in any Language using Machine Learning

POLS Seminar
Wednesday, February 23, 2022, 1:30 pm – 3:10 pm

Department Seminar, online event

Title: Measuring Textual Sophistication in any Domain in any Language using Machine Learning

Abstract: Measuring the readability or sophistication of language has a long tradition, but this tradition is rooted in the fields of education and psychology rather than in the domains to which it is frequently applied. Rather than use out-of-context measures that may have little applicability to other domains, we show how to develop and test indexes of textual sophistication fitted to any domain, using a variety of lexical and linguistic markers combined with crowd-sourced judgments about textual difficulty. Because the approach is general, it can be applied to any domain. We demonstrate this approach by measuring the sophistication of political communication, reanalysing the State of the Union corpus to show how conclusions about the decline of linguistic sophistication in politics differ when using our improved approach.