Evaluation of singing synthesis: Methodology and case study with concatenative and performative systems
Feugère, Lionel ORCID: https://orcid.org/0000-0003-0883-5224, d'Alessandro, Christophe, Delalez, Samuel, Ardaillon, Luc and Roebel, Axel (2016) Evaluation of singing synthesis: Methodology and case study with concatenative and performative systems. In: Proceedings Interspeech 2016. International Speech Communication Association, San Francisco, pp. 1245-1249. (doi:10.21437/Interspeech.2016-1248)
Abstract
The special session Singing Synthesis Challenge: Fill-In the Gap aims at comparative evaluation of singing synthesis systems. The task is to synthesize a new couplet for two popular songs. This paper addresses the methodology needed for quality assessment of singing synthesis systems and reports on a case study using two systems with a total of six different configurations. The two synthesis systems are: a concatenative Text-to-Chant (TTC) system, including a parametric representation of the melodic curve; and a Singing Instrument (SI), allowing for real-time interpretation of utterances made of flat-pitch natural voice or diphone-concatenated voice. Absolute Category Rating (ACR) and Paired Comparison (PC) tests are used. Natural and natural-degraded reference conditions are used for calibration of the ACR test. The Mean Opinion Score (MOS) obtained using ACR shows that the TTC ranks below natural voice but above the degraded conditions, whereas the SI ranks below natural voice and in between the degraded conditions. Thus, singing synthesis quality is judged better than auto-tuned or distorted natural voice in some cases. PC results show that: (1) signal processing is an important quality issue, making the difference between systems; (2) diphone concatenation degrades the quality compared to flat-pitch natural voice; (3) automatic melodic modelling is preferred to gestural control for off-line synthesis.
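As a rough illustration of the two test types named in the abstract, the sketch below computes a MOS from ACR ratings and a preference rate from PC votes. It is a generic example, not the authors' analysis: the function names `mos` and `preference_rate`, the 1-5 rating scale, the normal-approximation confidence interval, and all rating values are assumptions made up for illustration, not data from the paper.

```python
from math import sqrt
from statistics import mean, stdev


def mos(ratings):
    """Return (mean opinion score, approximate 95% CI half-width).

    Uses a normal approximation (1.96 * standard error); this is an
    assumption, not the statistical procedure used in the paper.
    """
    m = mean(ratings)
    if len(ratings) < 2:
        return m, 0.0
    return m, 1.96 * stdev(ratings) / sqrt(len(ratings))


def preference_rate(choices, system):
    """Fraction of paired-comparison trials in which `system` was preferred."""
    return sum(1 for c in choices if c == system) / len(choices)


# Hypothetical ACR ratings on an assumed 1-5 scale (illustrative only).
acr_ratings = {
    "natural": [5, 4, 5, 4],
    "synthesis": [3, 4, 3, 4],
    "degraded": [2, 1, 2, 3],
}

# Rank conditions by MOS, highest first.
ranking = sorted(acr_ratings, key=lambda k: mos(acr_ratings[k])[0], reverse=True)

# Hypothetical PC votes between two configurations labelled "A" and "B".
pc_votes = ["A", "B", "A", "A"]
rate_a = preference_rate(pc_votes, "A")
```

Including natural and degraded references in the same ACR run, as the abstract describes, gives the synthesized conditions anchors at both ends of the scale, so a MOS can be read relative to them rather than in isolation.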
Item Type: | Conference Proceedings |
---|---|
Title of Proceedings: | Proceedings Interspeech 2016 |
Additional Information: | INTERSPEECH 2016 was held from September 8–12, 2016, San Francisco, USA. |
Uncontrolled Keywords: | singing synthesis, singing quality assessment, computer music |
Subjects: | M Music and Books on Music > MT Musical instruction and study |
Faculty / School / Research Centre / Research Group: | Faculty of Engineering & Science; Faculty of Engineering & Science > Natural Resources Institute; Faculty of Engineering & Science > Natural Resources Institute > Agriculture, Health & Environment Department |
Last Modified: | 21 Jul 2021 13:07 |
URI: | http://gala.gre.ac.uk/id/eprint/23556 |