Subject: HMM-Based Speech Synthesis Toolkit version 1.1.1
From: Keiichi Tokuda
To: hts: ;
Cc: tokuda@ics.nitech.ac.jp
Date: Tue, 30 Dec 2003 18:06:44 +0900 (JST)
X-Mailer: Mew version 2.3 on Emacs 20.7 / Mule 4.1 (葵)

Sorry for the spam, but this BCC email is to notify you of the
release of our software

    The HMM-based Speech Synthesis System (HTS)
    version 1.1.1 release December 26, 2003

for HMM-based speech synthesis. We apologize if you receive multiple
copies. Please check

    http://hts.ics.nitech.ac.jp/

A brief explanation of this software is attached below.

We would appreciate it if you would distribute this email to anyone
who would be interested in this software.

Best regards,

Keiichi Tokuda
tokuda@ics.nitech.ac.jp
http://kt-lab.ics.nitech.ac.jp/~tokuda/

****************************************************************
The HMM-based Speech Synthesis System (HTS)
version 1.1.1 release December 26, 2003

The HMM-Based Speech Synthesis System (HTS)
(http://hts.ics.nitech.ac.jp/) has been developed by the HTS working
group (see "Who we are" below) and others (see "Acknowledgments"
below). The training part of HTS is implemented as a modified version
of HTK (http://htk.eng.cam.ac.uk/) together with SPTK
(http://kt-lab.ics.nitech.ac.jp/~tokuda/SPTK/). The modifications we
made to HTK are listed below:

 - Context clustering based on the MDL criterion (instead of the ML
   criterion)
 - Stream-dependent context clustering
 - Multi-space probability distributions as state output
   probabilities (for pitch pattern modeling)
 - State duration modeling and clustering

Related publications about the techniques and algorithms used in HTS
can be found at

    http://hts.ics.nitech.ac.jp/publications.html

The current version does not include a text analyzer, but the
Festival Speech Synthesis System (http://www.festvox.org/festival/)
can be used as one.

HTS version 1.1.1 includes a small run-time synthesis engine (less
than 1 MB including HMMs).
Since the synthesis engine can run without the HTK library, it is
suitable for use with Festival. This distribution comes with a demo
script using "CMU ARCTIC US English awb"
(http://www.festvox.org/cmu_arctic/dbs_awb.html), which generates
"voices" for Festival.

Two HTS voices for Festival, trained on 450 utterances of
"CSTR US KED Timit" (http://www.festvox.org/dbs/dbs_kdt.html) and
523 utterances of "CMU Communicator KAL limited domain"
(http://www.festvox.org/dbs/dbs_com.html), respectively, and four HTS
voices for Festival trained on the "CMU ARCTIC database"
(http://www.festvox.org/cmu_arctic/) are also released with HTS
version 1.1.1. Each HTS voice consists of HMMs trained by the demo
script and the small run-time synthesis engine, and can be used as a
"voice" for the Festival Speech Synthesis System without any other
HTS tools.

*** Notes for Japanese speech synthesis ***

A demo script using the NIT database for speech synthesis
"NIT JP ATR503 m001" is also provided for training Japanese voices.
Voices trained by the demo script can be used, without any other HTS
tools, with GalateaTalk, the speech synthesis module of an
open-source toolkit for anthropomorphic spoken dialogue agents
developed in the Galatea project
(http://hil.t.u-tokyo.ac.jp/~galatea/). An HTS voice for GalateaTalk
trained by the demo script is also released with HTS version 1.1.1.

****************************************************************
What's new in version 1.1.1
****************************************************************

 - Based on HTK-3.2.1
 - Demo script for ARCTIC database
 - Demo script for an original database (Japanese)
 - Variance flooring in demo script
 - Postfiltering in hts-engine
 - Many bug fixes

****************************************************************