Supervised by: Ministry of Industry and Information Technology of the People's Republic of China
Sponsored by: Harbin Institute of Technology
Editor-in-chief: Yu Zhou
ISSN: 1005-9113
CN: 23-1378/T

Related citation: SHAO Yan-qiu, SUI Zhi-fang, HAN Ji-qing. Pitch models of Mandarin text-to-speech[J]. Journal of Harbin Institute of Technology (New Series), 2009, 16(2): 179-184. DOI: 10.11916/j.issn.1005-9113.2009.02.006.
Pitch models of Mandarin text-to-speech
Authors and affiliations:
SHAO Yan-qiu: Institute of Computational Linguistics, Peking University, Peking 100871, China, {yqshao,szf}@pku.edu.cn; School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
SUI Zhi-fang: Institute of Computational Linguistics, Peking University, Peking 100871, China, {yqshao,szf}@pku.edu.cn
HAN Ji-qing: School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
Abstract:
The quality of the prosody model directly affects the naturalness of synthesized speech. To address the difficulty of generating the pitch contour within the prosody model, this paper studies two pitch models in depth: a corpus-based pitch model and a pitch pattern model. The key problems in the corpus-based model are the calculation of the distance and the search for the optimal path with a dynamic programming algorithm. In the pitch pattern model, parameters such as pitch pattern, pitch average, and pitch range are used to describe the pitch contour, and six pitch patterns are presented. For generating the pitch contour, the pitch pattern model is more flexible than the corpus-based model. Both models were integrated into a real TTS system, and the mean opinion score (MOS) results for synthesized Mandarin speech show that the pitch pattern model performs better than the corpus-based pitch model.
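The optimal-path search mentioned for the corpus-based model can be illustrated with a minimal sketch. It assumes a Viterbi-style dynamic program over candidate pitch contours for each syllable. The abstract does not define the distance, so the costs used here are hypothetical: a precomputed per-candidate target cost, and a join cost equal to the F0 jump between the end of one contour and the start of the next. The names `select_pitch_units`, `target_costs`, `contours`, and `join_weight` are illustrative only and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def select_pitch_units(
    target_costs: Sequence[Sequence[float]],
    contours: Sequence[Sequence[Sequence[float]]],
    join_weight: float = 1.0,
) -> Tuple[List[int], float]:
    """Pick one candidate contour per syllable with minimal total cost.

    target_costs[t][j]: assumed distance between candidate j and the
        desired context of syllable t (hypothetical, precomputed).
    contours[t][j]: F0 values (Hz) of candidate j for syllable t.
    Returns (chosen candidate index per syllable, total cost).
    """
    n = len(target_costs)
    if n == 0:
        return [], 0.0

    # best[t][j]: minimal cumulative cost of a path ending at candidate j.
    best = [list(target_costs[0])]
    # back[t][j]: index of the predecessor candidate on that path.
    back = [[-1] * len(target_costs[0])]

    for t in range(1, n):
        row_cost, row_back = [], []
        for j, tc in enumerate(target_costs[t]):
            start_f0 = contours[t][j][0]
            # Assumed join cost: absolute F0 jump at the syllable boundary.
            options = [
                best[t - 1][i]
                + join_weight * abs(contours[t - 1][i][-1] - start_f0)
                for i in range(len(target_costs[t - 1]))
            ]
            i_best = min(range(len(options)), key=options.__getitem__)
            row_cost.append(options[i_best] + tc)
            row_back.append(i_best)
        best.append(row_cost)
        back.append(row_back)

    # Trace the cheapest path back from the last syllable.
    j = min(range(len(best[-1])), key=best[-1].__getitem__)
    total = best[-1][j]
    path = [j]
    for t in range(n - 1, 0, -1):
        j = back[t][j]
        path.append(j)
    path.reverse()
    return path, total


if __name__ == "__main__":
    costs = [[0.0, 1.0], [0.5, 0.0]]
    f0 = [[[200, 210], [180, 190]], [[212, 200], [150, 140]]]
    print(select_pitch_units(costs, f0))
```

With a nonzero `join_weight`, the search favors candidates whose boundaries are smooth even when their target cost is slightly higher; setting `join_weight` to 0 reduces it to picking the lowest target cost per syllable independently.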
Key words: speech synthesis; prosody model; pitch model; pitch pattern
DOI: 10.11916/j.issn.1005-9113.2009.02.006
CLC Number: TP391.4