ASHA journals

Validating automated CoreLex calculations (Dalton et al., 2022)

Posted on 2022-08-02, 19:10. Authored by Sarah Grace Dalton, Brielle C. Stark, Davida Fromm, Kristen Apple, Brian MacWhinney, Amanda Rensch, Madyson Rowedder.

Purpose: The aim of this study was to advance the use of structured, monologic discourse analysis by validating an automated scoring procedure for core lexicon (CoreLex) using transcripts.

Method: Forty-nine transcripts from persons with aphasia and 48 transcripts from persons with no brain injury were retrieved from the AphasiaBank database. Five structured monologic discourse tasks were scored manually by trained scorers and via automation using a newly developed CLAN command based upon previously published lists for CoreLex. Point-to-point (or word-by-word) accuracy and reliability of the two methods were calculated. Scoring discrepancies were examined to identify errors. Time estimates for each method were calculated to determine if automated scoring improved efficiency.
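To make the point-to-point comparison concrete, the following is a minimal Python sketch, not the CLAN "corelex" command and not the authors' scoring procedure. The core list, the transcript, and the function names are hypothetical illustrations; the published CoreLex lists are not reproduced here. The toy scorer matches exact word forms only, so it is deliberately naive about inflected forms.

```python
def corelex_presence(transcript_words, core_list):
    """Mark each core item 1 if it appears in the transcript, else 0.

    Assumption: exact (case-insensitive) token match, no lemmatization.
    """
    produced = {word.lower() for word in transcript_words}
    return {item: int(item in produced) for item in core_list}


def point_to_point_agreement(manual, automated):
    """Proportion of core items on which two score sheets agree."""
    if manual.keys() != automated.keys():
        raise ValueError("both score sheets must cover the same core items")
    if not manual:
        raise ValueError("score sheets are empty")
    matches = sum(manual[item] == automated[item] for item in manual)
    return matches / len(manual)


# Hypothetical core list and transcript, for illustration only.
core_list = ["boy", "ball", "kick", "window", "break"]
transcript = "the boy kicked the ball and the window broke".split()

automated = corelex_presence(transcript, core_list)

# Hypothetical manual score sheet in which the scorer credits inflected forms.
manual = {"boy": 1, "ball": 1, "kick": 1, "window": 1, "break": 1}

corelex_score = sum(automated.values())
agreement = point_to_point_agreement(manual, automated)
print(corelex_score, agreement)
```

In this toy case the two score sheets disagree on "kick" and "break" because the naive matcher does not credit "kicked" or "broke". Item-level disagreements of this kind are what an examination of scoring discrepancies would surface.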

Results: Intraclass correlation coefficients for the tasks ranged from .978 to .998, indicating excellent intermethod reliability. Automated scoring using CLAN yielded significant time savings both for an experienced CLAN user and for inexperienced CLAN users following step-by-step instructions.

Conclusions: Automated scoring of CoreLex is a valid and reliable alternative to the current gold standard of manually scoring CoreLex from transcribed monologic discourse samples. The downstream time saving of this automated analysis may allow for more efficient and broader utilization of this discourse measure in aphasia research. To further encourage the use of this method, go to for materials and the step-by-step instructions utilized in this project.

Supplemental Material S1. Step-by-step instructions for using the “corelex” CLAN command provided to inexperienced CLAN users for testing. 

Dalton, S. G., Stark, B. C., Fromm, D., Apple, K., MacWhinney, B., Rensch, A., & Rowedder, M. (2022). Validation of an automated procedure for calculating core lexicon from transcripts. Journal of Speech, Language, and Hearing Research. Advance online publication.


This work was supported in part by National Institute on Deafness and Other Communication Disorders Grant 3R01-DC008524 (2007–2022, awarded to B. MacWhinney).