Pre-processing input text: improving pronunciation for the fluent Dutch text-to-speech synthesizer

Authors: R. Jansen
A.J. van Hessen
L.C.W. Pols
Document type: conference contribution
Publication date: 1998
Publisher: Institute of Phonetic Sciences
University of Amsterdam
Amsterdam
Language: unknown
Permalink: https://search.fid-benelux.de/Record/base-27448850
Data source: BASE; original catalogue
Link: http://hdl.handle.net/11245/1.423852

To improve the pronunciation of the Fluent Dutch Text-To-Speech Synthesiser, two pre-processors were built that try to detect problematic cases in input texts and resolve them automatically where possible. The first pre-processor examines the pronounceability of surnames and company names by checking whether their initial and final two-letter combinations can be handled by the grapheme-to-phoneme rules of the Fluency TTS system, and corrects them automatically where possible. It also expands common unambiguous abbreviations. The second pre-processor tries to produce pronounceable forms for numbers that do not have a straightforward pronunciation. Structural and contextual information is used to determine which category a number belongs to, and each number is expanded according to the pronunciation conventions of its category. The pre-processors are a useful aid for offline pronounceability checking (for names) and for improving performance at run time (for numbers), although ambiguity and redundancy in the input text illustrate the need for semantic and syntactic parsing to approach human text interpretation skills.
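
The two checks described in the abstract could look roughly like the following minimal Python sketch. It is an illustration only, not the authors' implementation: the letter-pair inventories, the number categories, the regular expressions, and the function names check_name and classify_number are all assumptions, since the abstract lists none of them. The sketch also uses structural cues only and ignores the contextual information the abstract mentions.

```python
import re

# Hypothetical inventories. In a real system these would be derived from the
# grapheme-to-phoneme rules of the TTS engine; the abstract does not list them.
KNOWN_INITIAL = {"ja", "va", "po", "he", "sc", "de", "br"}
KNOWN_FINAL = {"en", "ls", "er", "an", "ts", "ma"}


def check_name(name, initial=KNOWN_INITIAL, final=KNOWN_FINAL):
    """Return the edge letter pairs of `name` that are not in the inventories."""
    # Keep letters only (including accented ones) and ignore case.
    letters = "".join(ch for ch in name.lower() if ch.isalpha())
    if len(letters) < 2:
        return []
    problems = []
    if letters[:2] not in initial:
        problems.append(("initial", letters[:2]))
    if letters[-2:] not in final:
        problems.append(("final", letters[-2:]))
    return problems


# Hypothetical number categories, tried in order from most to least specific.
NUMBER_PATTERNS = [
    ("time", re.compile(r"\d{1,2}[:.]\d{2}")),
    ("date", re.compile(r"\d{1,2}-\d{1,2}-\d{2,4}")),
    ("phone", re.compile(r"0\d{2,3}-\d{6,7}")),
    ("cardinal", re.compile(r"\d+")),
]


def classify_number(token):
    """Return the first category whose pattern matches the whole token."""
    for category, pattern in NUMBER_PATTERNS:
        if pattern.fullmatch(token):
            return category
    return "unknown"
```

With these made-up inventories, check_name("Jansen") returns an empty list, while check_name("Szczepanski") flags both the initial pair "sz" and the final pair "ki" for offline review. A token such as "12-05-1998" is classified as a date and "1998" as a cardinal; choosing between a year and a cardinal reading is exactly the kind of case where surrounding context would be needed.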