Acoustic Features Associated with the Expression of Age and Gender in Female Voice-Actor Performances

Daisuke Hayashi (林 大輔); Masanori Morise (森勢 将雅)

doi:10.51094/jxiv.1272

Authors

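Daisuke Hayashi (林 大輔), D-LAB, Japan Tobacco Inc., https://researchmap.jp/d_s_hayashi
Masanori Morise (森勢 将雅), School of Interdisciplinary Mathematical Sciences, Meiji University, https://researchmap.jp/mmorise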

Keywords:

speech, voice actors, acting, speaker information, speaker individuality, acoustic features

Abstract

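In this study, we conducted listening experiments and acoustic feature analyses on female voice-actor performances intended to portray characters of different ages and genders. The results showed that listeners formed a distinct impression of each of the five intended characters (a young girl, a girl of about junior high or high school age, an adult woman, an elderly woman, and a boy). Regarding the acoustic features associated with character expression, those of the four female characters showed a relationship with actual age-related changes in the voice, whereas the results suggested that the boy voice may rely on its own characteristic means of expression. In the discussion, we consider how the research could be extended by covering a broader range of character expressions, and we outline prospects for deepening the understanding of character expression in the media arts and for engineering applications.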

Conflict of Interest Disclosure

The authors have no conflicts of interest to disclose regarding this paper.
