An approach to explaining formants (Story, 2024)
Purpose: This tutorial describes one possible approach to teaching the concept of formants to students in an undergraduate or graduate speech science course. The approach explains formants as prominent regions of energy in the envelope of the output spectrum radiated at the lips and shows how they arise from the superposition of vocal tract resonances on a source signal. Standing waves associated with vocal tract resonances are briefly explained, and standing wave animations are provided. Animations of the temporal variation of the vocal tract, vocal tract resonances, spectra, and spectrograms, along with audio samples, are included to provide dynamic demonstrations of the concept of formants.
Conclusions: The explanations, accompanying demonstrations, and suggested activities are intended to provide a launching point for understanding formants and how they can be measured, analyzed, and interpreted. As a result, participants should be able to describe the meaning of the term “formant” as it relates to a spectrum and a spectrogram, explain the difference between formants and vocal tract resonances, explain how vocal tract resonances combined with the voice source generate formants, and identify formants in both narrow-band and wide-band spectrograms and track their time-varying patterns with a formant tracking algorithm.
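The idea that vocal tract resonances combined with the voice source generate formants can be illustrated with a minimal numerical sketch. It is not the TubeTalker model used in the tutorial. The resonance frequencies (500, 1500, and 2500 Hz), the 80 Hz bandwidth, the 100 Hz fundamental, the source roll-off of roughly 12 dB per octave, and all function names are assumptions chosen only for illustration.

```python
import math

def resonance_gain(f, fc, bw):
    """Magnitude of one second-order resonance at frequency f (Hz); gain is 1 at 0 Hz."""
    return fc ** 2 / math.sqrt((fc ** 2 - f ** 2) ** 2 + (bw * f) ** 2)

def output_harmonics(f0, resonances, bw=80.0, fmax=3500.0):
    """Return (frequency, amplitude) pairs for each harmonic of the output spectrum."""
    pairs = []
    k = 1
    while k * f0 <= fmax:
        f = k * f0
        source = 1.0 / k ** 2  # assumed source tilt: about -12 dB per octave
        filt = 1.0
        for fc in resonances:  # resonances superimposed on the source harmonic
            filt *= resonance_gain(f, fc, bw)
        pairs.append((f, source * filt))
        k += 1
    return pairs

def envelope_peaks(pairs):
    """Frequencies of harmonics that are stronger than both of their neighbors."""
    peaks = []
    for i in range(1, len(pairs) - 1):
        if pairs[i][1] > pairs[i - 1][1] and pairs[i][1] > pairs[i + 1][1]:
            peaks.append(pairs[i][0])
    return peaks

# Hypothetical neutral-vowel resonances driven by a 100 Hz source.
harmonics = output_harmonics(100.0, [500.0, 1500.0, 2500.0])
print(envelope_peaks(harmonics))
```

In this sketch the resonances belong to the filter alone, whereas the peaks are read from the harmonics of the output spectrum. Changing the fundamental frequency moves the harmonics but leaves the resonances where they are, which is one way to see that the two concepts are distinct.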
Supplemental Material S1. Standing wave in neutral vocal tract configuration for the first resonance.
Supplemental Material S2. Standing wave in neutral vocal tract configuration for the second resonance.
Supplemental Material S3. Standing wave in neutral vocal tract configuration for the third resonance.
Supplemental Material S4. Pressure distribution in neutral vocal tract configuration at 1000 Hz, off resonance.
Supplemental Material S5. Animation of the temporal variation of the components of the source-filter representation during production of “Hello, how are you.” The animation also includes an audio track that is a slowed version of the phrase generated by the TubeTalker model.
Supplemental Material S6. Audio file containing the real-time voice source signal (glottal flow wave) generated during the TubeTalker simulation of “Hello, how are you.”
Supplemental Material S7. Audio file containing the real-time output pressure signal generated during the TubeTalker simulation of “Hello, how are you.”
Supplemental Material S8. Animation of the temporal variation of the vocal tract in two representations during production of “Hello, how are you.” In the upper inset plot, the vocal tract is shown in tubular form, and in the main plot in the middle the vocal tract is shown in a pseudo-midsagittal form. The lower inset plot shows the simultaneous temporal variation of the frequency response function (resonances). The animation also includes an audio track that is a slowed version of the phrase generated by the TubeTalker model.
Supplemental Material S9. Animation of the temporal variation of the frequency response function in three dimensions (time, frequency, amplitude) during production of “Hello, how are you.” There is a delay in the middle of the animation to allow the viewer to see the full history, and then the view rotates into a traditional spectrographic perspective. The animation also includes an audio track that is a slowed version of the phrase generated by the TubeTalker model.
Supplemental Material S10. Animation of the temporal variation of narrow-band spectra in three dimensions (time, frequency, amplitude) during production of “Hello, how are you.” There is a delay in the middle of the animation to allow the viewer to see the full history, and then the view rotates into a traditional spectrographic perspective. The animation also includes an audio track that is a slowed version of the phrase generated by the TubeTalker model.
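The standing-wave animations for the neutral configuration (S1 to S4) can be related to a textbook idealization: a uniform tube closed at the glottis and open at the lips, whose resonances fall at odd multiples of a quarter wavelength. The sketch below assumes a tube length of 17.5 cm and a speed of sound of 35,000 cm/s. Neither value is taken from the article, and the function names are hypothetical.

```python
def tube_resonances(length_cm, c_cm_per_s=35000.0, count=3):
    """Resonance frequencies (Hz) of a uniform tube closed at one end, open at the other."""
    # Quarter-wavelength rule: F_n = (2n - 1) * c / (4 * L)
    return [(2 * n - 1) * c_cm_per_s / (4.0 * length_cm) for n in range(1, count + 1)]

def pressure_nodes(length_cm, n):
    """Distances from the closed (glottal) end, in cm, where standing-wave pressure is zero."""
    # Pressure varies as cos((2n - 1) * pi * x / (2 * L)), so it vanishes at
    # x = L * (2m - 1) / (2n - 1) for m = 1..n; the last node is at the open end.
    return [length_cm * (2 * m - 1) / (2 * n - 1) for m in range(1, n + 1)]

resonances = tube_resonances(17.5)  # assumed 17.5 cm neutral tract
print(resonances)
print(1000.0 in resonances)         # is 1000 Hz one of the resonances?
print(pressure_nodes(17.5, 2))      # node positions for the second resonance
```

Under these assumptions, 1000 Hz falls between the first and second resonances, so no standing wave builds up at that frequency in the idealized tube. This matches the off-resonance case shown in S4.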
Story, B. H. (2024). An approach to explaining formants. Perspectives of the ASHA Special Interest Groups, 9(2), 461–471. https://doi.org/10.1044/2023_PERSP-23-00200