Staff employed at University of Copenhagen – University of Copenhagen


Perception of Paralinguistic Traits in Synthesized Voices

Research output: Research - peer-review › Paper

Standard

Perception of Paralinguistic Traits in Synthesized Voices. / Baird, Alice Emily; Hasse Jørgensen, Stina; Parada-Cabaleiro, Emilia; Hantke, Simone; Cummins, Nicholas; Schuller, Bjorn.

2017. Paper presented at Audio Mostly, London, United Kingdom.

Harvard

Baird, AE, Hasse Jørgensen, S, Parada-Cabaleiro, E, Hantke, S, Cummins, N & Schuller, B 2017, 'Perception of Paralinguistic Traits in Synthesized Voices', Paper presented at Audio Mostly, London, United Kingdom, 23/08/2017 - 26/08/2017.

APA

Baird, A. E., Hasse Jørgensen, S., Parada-Cabaleiro, E., Hantke, S., Cummins, N., & Schuller, B. (2017). Perception of Paralinguistic Traits in Synthesized Voices. Paper presented at Audio Mostly, London, United Kingdom.

Vancouver

Baird AE, Hasse Jørgensen S, Parada-Cabaleiro E, Hantke S, Cummins N, Schuller B. Perception of Paralinguistic Traits in Synthesized Voices. 2017. Paper presented at Audio Mostly, London, United Kingdom.

Author

Baird, Alice Emily; Hasse Jørgensen, Stina; Parada-Cabaleiro, Emilia; Hantke, Simone; Cummins, Nicholas; Schuller, Bjorn. / Perception of Paralinguistic Traits in Synthesized Voices. Paper presented at Audio Mostly, London, United Kingdom. 5 p.

Bibtex

@conference{1ea160e3864b4430a58ec0c70a425273,
title = "Perception of Paralinguistic Traits in Synthesized Voices",
abstract = "Along with the rise of artificial intelligence and the internet-of-things, synthesized voices are now common in daily life, providing us with guidance, assistance, and even companionship. From formant to concatenative synthesis, the synthesized voice continues to be defined by the same traits we prescribe to ourselves. When the recorded voice is synthesized, does our perception of its new machine embodiment change, and can we consider an alternative, more inclusive form? To begin evaluating the impact of aesthetic design, this study presents a first-step perception test to explore the paralinguistic traits of the synthesized voice. Using a corpus of 13 synthesized voices, constructed from acoustic concatenative speech synthesis, we assessed the response of 23 listeners from differing cultural backgrounds. Evaluating if the perception shifts from the known ground truths, we asked listeners to assign traits of age, gender, accent origin, and human-likeness. Results present a difference in perception for age and human-likeness across voices, and a general agreement across listeners for both gender and accent origin. Connections found between age, gender and human-likeness call for further exploration into a more participatory and inclusive synthesized vocal identity.",
keywords = "Faculty of Humanities, user studies, human-centered computing",
author = "Baird, {Alice Emily} and {Hasse Jørgensen}, Stina and Emilia Parada-Cabaleiro and Simone Hantke and Nicholas Cummins and Bjorn Schuller",
year = "2017",
month = "8",

}

RIS

TY - CONF

T1 - Perception of Paralinguistic Traits in Synthesized Voices

AU - Baird,Alice Emily

AU - Hasse Jørgensen,Stina

AU - Parada-Cabaleiro,Emilia

AU - Hantke,Simone

AU - Cummins,Nicholas

AU - Schuller,Bjorn

PY - 2017/8/25

Y1 - 2017/8/25

N2 - Along with the rise of artificial intelligence and the internet-of-things, synthesized voices are now common in daily life, providing us with guidance, assistance, and even companionship. From formant to concatenative synthesis, the synthesized voice continues to be defined by the same traits we prescribe to ourselves. When the recorded voice is synthesized, does our perception of its new machine embodiment change, and can we consider an alternative, more inclusive form? To begin evaluating the impact of aesthetic design, this study presents a first-step perception test to explore the paralinguistic traits of the synthesized voice. Using a corpus of 13 synthesized voices, constructed from acoustic concatenative speech synthesis, we assessed the response of 23 listeners from differing cultural backgrounds. Evaluating if the perception shifts from the known ground truths, we asked listeners to assign traits of age, gender, accent origin, and human-likeness. Results present a difference in perception for age and human-likeness across voices, and a general agreement across listeners for both gender and accent origin. Connections found between age, gender and human-likeness call for further exploration into a more participatory and inclusive synthesized vocal identity.

AB - Along with the rise of artificial intelligence and the internet-of-things, synthesized voices are now common in daily life, providing us with guidance, assistance, and even companionship. From formant to concatenative synthesis, the synthesized voice continues to be defined by the same traits we prescribe to ourselves. When the recorded voice is synthesized, does our perception of its new machine embodiment change, and can we consider an alternative, more inclusive form? To begin evaluating the impact of aesthetic design, this study presents a first-step perception test to explore the paralinguistic traits of the synthesized voice. Using a corpus of 13 synthesized voices, constructed from acoustic concatenative speech synthesis, we assessed the response of 23 listeners from differing cultural backgrounds. Evaluating if the perception shifts from the known ground truths, we asked listeners to assign traits of age, gender, accent origin, and human-likeness. Results present a difference in perception for age and human-likeness across voices, and a general agreement across listeners for both gender and accent origin. Connections found between age, gender and human-likeness call for further exploration into a more participatory and inclusive synthesized vocal identity.

KW - Faculty of Humanities

KW - user studies

KW - human-centered computing

M3 - Paper

ER -

ID: 178735315