Speaker Identification in Whisper





Keywords: Sociophonetics; Forensic phonetics; Whisper


Sociophonetic methods and findings have value when applied to real-life issues, including the provision of expert forensic evidence in legal cases. Forensic cases often involve voices which differ markedly from those typically encountered in laboratory or field studies. We assess the ability of people to identify familiar voices produced in whisper, a commonly used form of disguise. Members of a pre-existing social network were recorded speaking normally and in whisper. Speakers found it difficult to maintain whisper beyond 30 seconds. They and other members of the group listened to extracts that were (i) short and whispered, (ii) long and whispered, and (iii) short and normal (non-whispered). Foils were also included. Performance was well above chance, and improved significantly in conditions (ii) and (iii). Differences were found across listeners and voices. The study emphasises how important it is not to overgeneralise from experimental data to a witness’s ability under forensic conditions.









How to Cite

Foulkes, P., Smith, I., & Sóskuthy, M. (2017). Speaker Identification in Whisper. Letras de Hoje, 52(1), 5–14. https://doi.org/10.15448/1984-7726.2017.1.26659



