Phonemic variability and confusability in pronunciation modeling for automatic speech recognition

This thesis addresses the problems of phonemic variability and confusability from the pronunciation modeling perspective of an automatic speech recognition (ASR) system. Several research directions are investigated. First, automatic grapheme-to-phoneme (g2p) and phoneme-to-phoneme (p2p) converters are developed that generate alternative pronunciations for in-vocabulary as well as out-of-vocabulary (OOV) terms. Because adding alternative pronunciations may introduce homophones (or close homophones), the confusability of the system increases. A novel measure of this confusability is proposed in order to analyze it and to study its relation to ASR performance. Pronunciation confusability is higher when pronunciation probabilities are not provided, and it can severely degrade ASR performance; it should therefore be taken into account during pronunciation generation. Discriminative training approaches are then investigated to train the weights of a phoneme confusion model that allows alternative ways of pronouncing a term while counterbalancing the phonemic confusability problem. The objective function to optimize is chosen to correspond to the performance measure of the particular task. Two tasks are investigated in this thesis: ASR and keyword spotting (KWS). For ASR, an objective that minimizes the phoneme error rate is adopted. For the KWS experiments, the Figure of Merit (FOM), a KWS performance measure, is directly maximized.
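As an illustration of the phoneme error rate mentioned above, the following is a minimal sketch of how that metric is conventionally computed: the Levenshtein (edit) distance between reference and hypothesized phoneme sequences, divided by the total number of reference phonemes. It is not taken from the thesis. The function names and the example phoneme symbols are assumptions made for illustration, and the sketch only scores recognizer output; it does not implement the discriminative training itself.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences.

    Counts the minimum number of substitutions, insertions and
    deletions needed to turn `hyp` into `ref`.
    """
    # prev[j] holds the distance between the first i-1 reference
    # phonemes and the first j hypothesis phonemes.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1       # reference phoneme missing in hyp
            insertion = curr[j - 1] + 1  # extra phoneme present in hyp
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def phoneme_error_rate(
    refs: Sequence[Sequence[str]], hyps: Sequence[Sequence[str]]
) -> float:
    """Total edit errors divided by total reference phonemes."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same length")
    total_ref = sum(len(r) for r in refs)
    if total_ref == 0:
        raise ValueError("references must contain at least one phoneme")
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_errors / total_ref


# Hypothetical example: one substitution and one insertion
# against six reference phonemes.
example_refs = [["k", "ae", "t"], ["d", "ao", "g"]]
example_hyps = [["k", "ah", "t"], ["d", "ao", "g", "z"]]
example_per = phoneme_error_rate(example_refs, example_hyps)
```

Because the errors are pooled over all utterances before dividing, long utterances weigh more than short ones, which is the usual convention for corpus-level error rates.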

Additional Info

Field: Value
Source: https://theses.hal.science/tel-00843589
Author: Karanasou, Panagiota
Maintainer: CCSD
Last Updated: May 10, 2026, 09:33 (UTC)
Created: May 10, 2026, 09:33 (UTC)
Identifier: NNT: 2013PA112087
Language: en
Rights: https://about.hal.science/hal-authorisation-v1/
contributor: Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur (LIMSI) ; Université Paris-Sud - Paris 11 (UP11)-Sorbonne Université - UFR d'Ingénierie (UFR 919) ; Sorbonne Université (SU)-Sorbonne Université (SU)-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)-Université Paris Saclay (COmUE)
creator: Karanasou, Panagiota
date: 2013-06-11T00:00:00
harvest_object_id: 174e8af7-bf29-46e4-823a-59b5f598d785
harvest_source_id: 3374d638-d20b-4672-ba96-a23232d55657
harvest_source_title: test moissonnage SELUNE
metadata_modified: 2026-03-31T00:00:00
set_spec: type:THESE