ParAlg: A paraphasia algorithm (Casilio et al., 2023)
Purpose: A preliminary version of a paraphasia classification algorithm (henceforth called ParAlg) has previously been shown to be a viable method for coding picture naming errors. The purpose of this study is to present an updated version of ParAlg, which uses multinomial classification, and comprehensively evaluate its performance when using two different forms of transcribed input.
Method: A subset of 11,999 archival responses produced on the Philadelphia Naming Test were classified into six cardinal paraphasia types using ParAlg under two transcription configurations: (a) using phonemic transcriptions for responses exclusively (phonemic-only) and (b) using phonemic transcriptions for nonlexical responses and orthographic transcriptions for lexical responses (orthographic-lexical). Agreement was quantified by comparing ParAlg-generated paraphasia codes between configurations and relative to human-annotated codes using four metrics (positive predictive value, sensitivity, specificity, and F1 score). An item-level qualitative analysis of misclassifications under the best performing configuration was also completed to identify the source and nature of coding discrepancies.
Results: Agreement between ParAlg-generated and human-annotated codes was high, although the orthographic-lexical configuration outperformed phonemic-only (weighted-average F1 scores of .87 and .78, respectively). A qualitative analysis of the orthographic-lexical configuration revealed a mix of human- and ParAlg-related misclassifications, the former of which were related primarily to phonological similarity judgments whereas the latter were due to semantic similarity assignment.
Conclusions: ParAlg is an accurate and efficient alternative to manual scoring of paraphasias, particularly when lexical responses are orthographically transcribed. With further development, it has the potential to be a useful software application for anomia assessment.
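The four agreement metrics named in the Method (positive predictive value, sensitivity, specificity, and F1 score) can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names, the toy labels, and the choice to weight each class's F1 by its frequency in the human-annotated codes are assumptions made for illustration only.

```python
def per_class_metrics(human, predicted):
    """One-vs-rest agreement metrics for each code (hypothetical helper).

    `human` and `predicted` are equal-length lists of paraphasia codes.
    """
    labels = sorted(set(human) | set(predicted))
    n = len(human)
    metrics = {}
    for label in labels:
        pairs = list(zip(human, predicted))
        tp = sum(h == label and p == label for h, p in pairs)
        fp = sum(h != label and p == label for h, p in pairs)
        fn = sum(h == label and p != label for h, p in pairs)
        tn = n - tp - fp - fn
        ppv = tp / (tp + fp) if tp + fp else 0.0
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        f1 = (2 * ppv * sensitivity / (ppv + sensitivity)
              if ppv + sensitivity else 0.0)
        metrics[label] = {
            "ppv": ppv,
            "sensitivity": sensitivity,
            "specificity": specificity,
            "f1": f1,
            "support": tp + fn,  # number of human-annotated instances
        }
    return metrics


def weighted_average_f1(metrics):
    """Average per-class F1, weighted by human-annotated class frequency
    (an assumed weighting scheme, not confirmed by the source)."""
    total = sum(m["support"] for m in metrics.values())
    return sum(m["f1"] * m["support"] for m in metrics.values()) / total


# Toy data with illustrative labels; not drawn from MAPPD-12K.
human_codes = ["semantic", "semantic", "formal", "mixed", "formal", "semantic"]
paralg_codes = ["semantic", "formal", "formal", "mixed", "formal", "semantic"]

metrics = per_class_metrics(human_codes, paralg_codes)
overall_f1 = weighted_average_f1(metrics)
```

Running the same two functions once per transcription configuration would give directly comparable per-class and weighted-average scores.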
Supplemental Material S1. MAPPD-12K dataset extraction, preparation, and formatting. This document outlines the process of curating the MAPPD-12K dataset used in the current study from the larger archival database (Mirman et al., 2010).
Supplemental Material S2. MAPPD-12K dataset. This CSV file contains the MAPPD-12K dataset, which was used for the analyses of the current study. Extraction, preparation, and formatting procedures are outlined in Supplemental Material S1.
Supplemental Material S3. ParAlg output and item-level analyses. This CSV file contains abbreviated ParAlg output of MAPPD-12K under both transcription configurations from the current study, including (a) production identifiers for each response in the MAPPD-12K dataset; (b) unique target-response pair identifiers, as described in the Appendix; (c) ParAlg-generated paraphasia codes under the two transcription configurations; and (d) human-annotated paraphasia codes. Dichotomous coding of discrepancies for each transcription configuration is also included, as well as the categorization and subcategorization of phonological and semantic similarity discrepancies from the three raters. Data that consisted of dichotomous “yes” and “no” responses (e.g., semantically related or unrelated to the target) was dummy-coded as 1 and 0, respectively. Nominal response data (e.g., type of semantic relationship) was coded using a single word or phrase that generally aligned with usage in the analyses of the manuscript (e.g., “coordinate” was used to denote the Category coordinate subcategory of the semantic similarity discrepancy analysis).
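The dummy-coding convention described for Supplemental Material S3 can be sketched as follows. The column names and the `dummy_code` helper are hypothetical and do not reflect the actual headers of the CSV file; only the mapping of "yes" to 1 and "no" to 0, and the single-word nominal code "coordinate", come from the description above.

```python
def dummy_code(value):
    """Map a dichotomous 'yes'/'no' rating to 1/0.

    Any other value (e.g., a nominal code such as 'coordinate')
    is returned unchanged.
    """
    mapping = {"yes": 1, "no": 0}
    return mapping.get(str(value).strip().lower(), value)


# Hypothetical rows; column names are assumptions for illustration.
rows = [
    {"semantically_related": "yes", "semantic_relation": "coordinate"},
    {"semantically_related": "No", "semantic_relation": ""},
]

coded = [{key: dummy_code(val) for key, val in row.items()} for row in rows]
```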
Casilio, M., Fergadiotis, G., Salem, A. C., Gale, R. C., McKinney-Bock, K., & Bedrick, S. (2023). ParAlg: A paraphasia algorithm for multinomial classification of picture naming errors. Journal of Speech, Language, and Hearing Research, 66(3), 966–986. https://doi.org/10.1044/2022_JSLHR-22-00255