Performance of Montreal Forced Aligner on Cantonese Spontaneous Speech

Poster presented at INTERSPEECH, Rotterdam, The Netherlands. 17-21 August 2025.

This study presents a comprehensive evaluation of the Montreal Forced Aligner (MFA) in aligning phone boundaries in Hong Kong Cantonese (HKC) spontaneous speech. We developed two tailored Cantonese MFA models designed to address distinctive Cantonese phonetic features, such as checked syllables. Both models were applied to the same set of recordings from spontaneous interviews, and their performance was compared against human annotations. Our results show that the updated Cantonese MFA models achieve reasonable alignment accuracy on spontaneous speech, with a satisfactory level of agreement with manually adjusted boundaries for vowels. However, Cantonese-specific features and connected speech processes remain major challenges for the current models. Based on these observations, we propose specific amendments to the models to improve alignment performance, as well as recommendations for manual boundary adjustment.

Dr Chenzi Xu
Leverhulme Early Career Fellow

My research interests include speech prosody, speech perception, and speech technology.
