Abstract
This thesis describes and tests a computationally light vowel synthesis model, which can be used to generate glottal flow pulses for more sophisticated acoustic simulators of the vocal tract. The core of the model consists of a low-order mass-spring system that represents the vocal folds, Bernoulli flow with viscous pressure loss in the glottis, and a Webster resonator that represents the vocal tract. The Webster resonator makes use of centreline and area function data extracted from magnetic resonance images. With the aim of producing a minimal model, new elements are added to the model one by one, and the impact of the added complexity is investigated. These additions include dissipation along the vocal tract, a horn-shaped Webster resonator to represent the subglottal tract, and losses caused by turbulence in the glottis. Technical changes are also introduced that allow the model to be used with any vocal tract geometry and in a large number of simulations. For such a model to be of practical use, it must be able to produce glottal flow with a variety of fundamental frequencies and phonation types. This tunability is achieved by optimising four selected parameters. Solving the multi-objective optimisation problem directly is not practical because of the complicated dynamic behaviour of the model and the long computing time of each simulation. Instead, a three-step procedure combining constrained single-objective optimisation, parameter space exploration, and manual pulse shape selection is introduced. Three well-known direct search optimisation algorithms (pattern search, simulated annealing, and a genetic algorithm) are tested for the optimisation step. A pattern search-based algorithm is developed for pathwise parameter space exploration. Finally, the use of the closing quotient, a pulse shape parameter, as an aid for the final selection is tested.
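The abstract names the closing quotient without defining it. As a rough illustration only, the sketch below assumes the common definition (duration of the closing phase, from peak flow to glottal closure, divided by the length of the period) and applies it to one period of sampled flow. The function name `closing_quotient`, the `closed_threshold` parameter, and the sample pulse are hypothetical and not taken from the thesis; the thesis may compute the quantity differently.

```python
def closing_quotient(flow, closed_threshold=0.0):
    """Estimate the closing quotient of ONE period of sampled glottal flow.

    Assumption (not from the thesis): the closing phase runs from the
    flow maximum to the first later sample at or below `closed_threshold`,
    and the quotient is that duration divided by the period length.
    """
    n = len(flow)
    if n < 2:
        raise ValueError("need at least two samples of one period")

    # Index of the (first) flow maximum: start of the closing phase.
    peak = max(range(n), key=lambda i: flow[i])

    # First sample after the peak where the flow counts as closed.
    # If the flow never reaches the threshold, the closing phase is
    # taken to last until the end of the period.
    closure = next(
        (i for i in range(peak, n) if flow[i] <= closed_threshold), n
    )

    return (closure - peak) / n


# Hypothetical triangular pulse: 4 samples opening, 2 closing, 4 closed.
pulse = [0, 1, 2, 3, 4, 2, 0, 0, 0, 0]
print(closing_quotient(pulse))
```

A smaller value corresponds to a more abrupt closure. For simulated pulses with noise or incomplete closure, a threshold relative to the pulse amplitude would probably be more robust than a fixed zero level.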
Original language | English |
---|---|
Qualification | Licentiate's degree |
Awarding Institution | |
Supervisors/Advisors | |
Publisher | |
Publication status | Published - 2014 |
MoE publication type | G3 Licentiate thesis |
Keywords
- Speech production
- Glottal pulse generator
- Glottal flow
- Mechano-acoustic model
- Parameter tuning