ES 202 211-2003

Speech Processing, Transmission and Quality Aspects (STQ); Distributed speech recognition; Extended front-end feature extraction algorithm; Compression algorithms; Back-end speech reconstruction algorithm (V1.1.1)


 

 


 

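Standard number
ES 202 211-2003
Publication date
2003-08-01
Implementation date
2014-04-04
Withdrawal date
China Standard Classification number
/
International Classification for Standards number
/
Issuing body
ETSI - European Telecommunications Standards Institute
Referenced standards
73
Scope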
The present document specifies algorithms for extended front-end feature extraction, their transmission, back-end pitch tracking and smoothing, and back-end speech reconstruction which form part of a system for distributed speech recognition. The specification covers the following components:

a) the algorithm for front-end feature extraction to create Mel-Cepstrum parameters;
b) the algorithm for extraction of additional parameters, viz., fundamental frequency F0 and voicing class;
c) the algorithm to compress these features to provide a lower data transmission rate;
d) the formatting of these features with error protection into a bitstream for transmission;
e) the decoding of the bitstream to generate the front-end features at a receiver together with the associated algorithms for channel error mitigation;
f) the algorithm for pitch tracking and smoothing at the back-end to minimize pitch errors;
g) the algorithm for speech reconstruction at the back-end to synthesize intelligible speech.

NOTE: The components (a), (c), (d), and (e) are already covered by ES 201 108 [1]. Besides these (four) components, the present document covers the components (b), (f), and (g) to provide back-end speech reconstruction and enhanced tonal language recognition capabilities. If these capabilities are not of interest, the reader is better served by (un-extended) ES 201 108 [1].

The present document does not cover the "back-end" speech recognition algorithms that make use of the received DSR front-end features.

The algorithms are defined in a mathematical form, pseudo-code, or as flow diagrams. Software implementing these algorithms written in the 'C' programming language will be provided with the final published version of the present document. Conformance tests are not specified as part of the standard. The recognition performance of proprietary implementations of the standard can be compared with those obtained using the reference 'C' code on appropriate speech databases.

It is anticipated that the DSR bitstream will be used as a payload in other higher level protocols when deployed in specific systems supporting DSR applications.

The Extended Front-End (XFE) standard incorporates tonal information, viz., fundamental frequency F0 and voicing class, as additional parameters. This information can be used for enhancing the recognition accuracy of tonal languages, e.g., Mandarin, Cantonese, and Thai. The Extended Front-End (XFE) standard incorporates Voice Activity information as part of the voicing class information. This can be used for segmentation (or end-point detection) of the speech data for improved recognition performance.



