Journal of Shanghai University (Natural Science Edition)

• Communication and Information Engineering •

Language Identification Based on Window-Shifted Cepstral-Distance Minimum-Distortion Segmentation

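MIAO Wei, HOU Li-min

  1. School of Communication and Information Engineering, Shanghai University, Shanghai 200072

  • Received: 2006-06-22 Revised: 1900-01-01 Published: 2007-04-30 Released: 2007-04-30
  • Corresponding author: HOU Li-min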

Language Identification Based on Minimum Distortion of Cepstrum Distance Segmentation

MIAO Wei, HOU Li-min

  

  1. School of Communication and Information Engineering, Shanghai University, Shanghai 200072, China
  • Received: 2006-06-22 Revised: 1900-01-01 Online: 2007-04-30 Published: 2007-04-30
  • Contact: HOU Li-min

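Abstract: A new method for language identification is proposed. The method requires no annotation of the speech files. Sub-words are obtained by minimum-distortion segmentation based on window-shifted cepstral distance: at the front end of the language identifier, this automatic sub-word segmentation splits the speech signal into many sub-words. All resulting sub-words are then clustered, and a hidden Markov model (HMM) is built for each cluster. Finally, the complete set of sub-word models is used to identify the language of the input speech. Experiments show that the method is a simple and effective approach to language identification.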
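
The abstract does not give the distance formula, the window length, or the exact minimum-distortion criterion, so the following is only a minimal sketch of the general idea of placing sub-word boundaries where the cepstrum changes sharply between adjacent windows. It assumes the cepstral vectors have already been computed per frame. The Euclidean distance, the `window` and `threshold` parameters, and the peak-picking rule are illustrative assumptions, not the authors' method.

```python
import math

def cepstral_distance(c1, c2):
    """Euclidean distance between two cepstral vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))

def mean_vector(frames):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]

def distance_curve(frames, window=3):
    """Shift a candidate boundary t through the utterance and compare the
    mean cepstrum of the `window` frames before t with the mean cepstrum
    of the `window` frames starting at t."""
    curve = {}
    for t in range(window, len(frames) - window + 1):
        left = mean_vector(frames[t - window:t])
        right = mean_vector(frames[t:t + window])
        curve[t] = cepstral_distance(left, right)
    return curve

def pick_boundaries(curve, threshold):
    """Keep positions where the distance is a local peak at or above the
    threshold (hypothetical rule; on a plateau only the first frame is kept)."""
    boundaries = []
    for t, d in curve.items():
        prev_d = curve.get(t - 1, float("-inf"))
        next_d = curve.get(t + 1, float("-inf"))
        if d >= threshold and d > prev_d and d >= next_d:
            boundaries.append(t)
    return boundaries
```

For example, with five frames of `[0.0, 0.0]` followed by five frames of `[1.0, 1.0]`, `pick_boundaries(distance_curve(frames, window=2), threshold=1.0)` returns `[5]`, the index where the spectrum changes. The segments between consecutive boundaries would then be the sub-words passed on to clustering.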

Keywords: hidden Markov model, language identification, sub-word segmentation

Abstract:

We propose a novel approach to language identification. An ideal language identification system typically needs a large number of phoneme-level speech transcriptions to train its phone models, which involves a great deal of work and cost. In this work, we use a rough segmentation instead of transcription to produce sub-words, and build a front-end sub-word recognizer for each language to be identified. The sub-words are then clustered, and a hidden Markov model (HMM) is created for each cluster. Preliminary language identification results demonstrate the simplicity and effectiveness of this approach.

Key words: language identification, sub-word segmentation, hidden Markov model (HMM)

CLC number: