Support of SSML 1.0

The CU VOCAL text-to-speech engine has supported SSML 1.0 since March 2004. It supports the following tags in the “Prosody and Style” category:

  • <prosody> : Controls the pitch, speaking rate and volume of the speech output.
  • <emphasis> : Requests that the contained text be spoken with emphasis.
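
The sample sentences on this page are bare fragments. A complete SSML 1.0 document wraps such a fragment in a <speak> root element. The Python sketch below, which uses only the standard library, does that wrapping and checks that the result is well-formed XML. The helper name wrap_in_speak and the zh-HK language code are assumptions made for illustration; this page does not say how CU VOCAL expects its input to be submitted.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"  # namespace defined by SSML 1.0


def wrap_in_speak(fragment: str, lang: str = "zh-HK") -> str:
    """Wrap an SSML fragment in a <speak> root (hypothetical helper).

    The default language code is an assumption, not taken from CU VOCAL.
    """
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang="{lang}">'
        f"{fragment}</speak>"
    )


fragment = "我是<prosody pitch='x-high'>合唱團的女高音</prosody>"
document = wrap_in_speak(fragment)

# fromstring raises ParseError if the markup is not well-formed.
root = ET.fromstring(document)

# Element tags come back as "{namespace}name"; keep only the local name.
tags = [el.tag.split("}")[1] for el in root.iter()]
print(tags)  # prints ['speak', 'prosody']
```

A well-formedness check of this kind only catches unbalanced or mistyped tags. It does not confirm that the attribute values are ones the engine accepts.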

Examples generated with CU VOCAL

<prosody>

Attribute Sample Sentences
pitch

我是<prosody pitch='x-high'>合唱團的女高音</prosody>
(English: "I am a soprano in the choir.")

我是<prosody pitch='x-low'>合唱團的女低音</prosody>
(English: "I am an alto in the choir.")

rate

我有緊要事要走先,你地慢慢傾
(English: "I have something urgent and need to leave first; you all take your time chatting.")

<prosody rate='fast'>我有緊要事要走先,你地<prosody rate='x-slow'>慢慢傾</prosody></prosody>

volume

現在已經係夜深,請將音量收細
(English: "It is already late at night; please turn the volume down.")

現在已經係夜深,<prosody volume='x-soft'>請將音量收細</prosody>
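
The rate sample above nests one <prosody> element inside another. The sketch below lists each run of text together with the rate of its innermost enclosing <prosody>, assuming the usual reading in which the innermost element governs the text it contains. The helper effective_rates is hypothetical and only inspects the markup; it says nothing about how CU VOCAL resolves nesting internally.

```python
import xml.etree.ElementTree as ET


def effective_rates(fragment: str):
    """Return (text, rate) pairs, using the innermost <prosody rate=...>.

    Hypothetical helper; rate is None where no <prosody> rate applies.
    """
    root = ET.fromstring(f"<root>{fragment}</root>")  # dummy root for parsing
    runs = []

    def walk(element, rate):
        if element.tag == "prosody":
            rate = element.get("rate", rate)  # inherit when rate is absent
        if element.text:
            runs.append((element.text, rate))
        for child in element:
            walk(child, rate)
            if child.tail:  # text after the child belongs to this element
                runs.append((child.tail, rate))

    walk(root, None)
    return runs


sample = (
    "<prosody rate='fast'>我有緊要事要走先,你地"
    "<prosody rate='x-slow'>慢慢傾</prosody></prosody>"
)
for text, rate in effective_rates(sample):
    print(rate, text)
```

For this sample the first clause is paired with fast and the final phrase 慢慢傾 with x-slow, which matches the intent of the sentence ("take your time chatting").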

<emphasis>

Attribute Sample Sentences
level

我宜家好肚餓
(English: "I am very hungry right now.")

我宜家<emphasis level='strong'>好肚餓</emphasis>

Integrated example

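事關4月差不多所有數碼相機都以500萬像素推出新型號,所以500萬像素以下機種便要大出血! 其實三百、四百萬像素在日常生活已足夠應用買平機是時候
(English: "Because almost all digital cameras are launching new 5-megapixel models in April, models below 5 megapixels are due for heavy price cuts! In fact, 3 or 4 megapixels is enough for everyday use, so now is the time to buy a cheap camera.")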

The same passage with SSML tags:

事關4月差不多<emphasis level='strong'>所有</emphasis>數碼相機都以<prosody pitch='x-high'>500萬像素</prosody>推出新型號,所以<prosody rate='1.4'>500萬像素以下機種</prosody>便要<prosody pitch='x-low' rate='x-slow'>大出血</prosody>! 其實<prosody volume='loud'>三百、四百萬像素</prosody>在日常生活已<prosody rate='slow'>足夠應用</prosody><emphasis level='strong'>買平機是時候</emphasis>
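
One way to check that a tagged passage still carries exactly the original wording is to strip the tags and compare the result with the plain sentence. The sketch below does this for the integrated example using Python's standard library; strip_ssml is a hypothetical helper name, and the dummy <root> wrapper exists only so the fragment parses as XML.

```python
import xml.etree.ElementTree as ET


def strip_ssml(fragment: str) -> str:
    """Return the text of an SSML fragment with all tags removed."""
    root = ET.fromstring(f"<root>{fragment}</root>")  # dummy root for parsing
    return "".join(root.itertext())


plain = (
    "事關4月差不多所有數碼相機都以500萬像素推出新型號,所以500萬像素以下機種便要大出血! "
    "其實三百、四百萬像素在日常生活已足夠應用買平機是時候"
)
marked_up = (
    "事關4月差不多<emphasis level='strong'>所有</emphasis>數碼相機都以"
    "<prosody pitch='x-high'>500萬像素</prosody>推出新型號,所以"
    "<prosody rate='1.4'>500萬像素以下機種</prosody>便要"
    "<prosody pitch='x-low' rate='x-slow'>大出血</prosody>! 其實"
    "<prosody volume='loud'>三百、四百萬像素</prosody>在日常生活已"
    "<prosody rate='slow'>足夠應用</prosody>"
    "<emphasis level='strong'>買平機是時候</emphasis>"
)

# True when the markup adds tags only and leaves the wording untouched.
print(strip_ssml(marked_up) == plain)
```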