SCIENTIFIC AND TECHNICAL AEROSPACE REPORTS
A Biweekly Publication of the National Aeronautics and Space Administration
VOLUME 44, ISSUE 2 - January 27, 2006
63 CYBERNETICS, ARTIFICIAL INTELLIGENCE AND ROBOTICS
Includes feedback and control theory, information theory, machine learning, and expert systems.
For related information see also 54 Man/System Technology and Life Support.
20060003061 International Business Machines Japan Ltd., Japan
HMM-Based Speech Recognition Using Multi-Dimensional Multi-Labeling
Nishimura, Masafumi; Toshioka, Koichi; IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '87); Volume 2; 1987, pp. 27.11.1 - 27.11.4; In English; See also 20060003045; Copyright; Avail.: Other Sources
This paper describes a new vector quantization (VQ; so-called labeling) method for a speech recognition system based on the hidden Markov model (HMM). To improve VQ accuracy in a simple manner, 'multi-labeling', which generates multiple labels at each frame, was introduced while keeping a conventional HMM formulation. Furthermore, in order to represent the characteristics of speech accurately and effectively, 'multi-dimensional labeling' was also introduced, which quantizes multiple features such as spectral dynamics and spectrum independently. This labeling method was tested in an isolated word recognition task using 150 confusable Japanese words. The recognition error rate was reduced to roughly 1/2 or less of that of the conventional method. Author
Japan; Speech Recognition; Words (Language)
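The multi-labeling idea in the abstract above can be illustrated with a minimal sketch. It assumes Euclidean distance and a toy two-dimensional codebook; the function name `multi_label` and the parameter `k` are hypothetical and do not come from the paper, which may select and weight its labels differently.

```python
from math import dist

def multi_label(frame, codebook, k=2):
    """Return the indices of the k codewords nearest to `frame`, closest first.

    Conventional VQ labeling is the special case k=1.
    """
    ranked = sorted(range(len(codebook)), key=lambda i: dist(frame, codebook[i]))
    return ranked[:k]

# Toy codebook of four 2-D codewords (illustrative values only).
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

labels = multi_label((0.9, 0.2), codebook, k=2)
print(labels)  # [1, 3]
```

With `k=1` the function reduces to ordinary single-label VQ, which is why the conventional HMM formulation can be kept.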
20060003062 Polytechnic Univ., Brooklyn, NY, USA
Multiple Input Adaptive Iterative Image Restoration Algorithms
Katsaggelos, Aggelos K.; IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '87); Volume 2; 1987, pp. 28.1.1 - 28.1.4; In English; See also 20060003045; Copyright; Avail.: Other Sources
In this paper, image restoration applications in which multiple distorted versions of the same original image are available are considered. A general adaptive iterative restoration algorithm is derived based on regularization techniques. The adaptivity of the algorithm is introduced in two ways: a) by a constraint operator which incorporates properties of the response of the human visual system into the restoration process, and b) by a weight matrix which assigns greater importance for the deconvolution process to areas of high spatial activity than to areas of low spatial activity. Different degrees of trust are assigned to the various distorted images depending on the amount of noise in each image. The proposed algorithms are general and can be used with any type of linear distortion and constraint operators. They can also be used to restore signals other than images. Author
Algorithms; Restoration; Image Analysis
20060003096 Siemens A.G., Munich, Germany
Hierarchical Encoding of Image Sequences Using Multistage Vector Quantization
Hammer, B.; Brandt, A. v.; Schielein, M.; IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '87); Volume 2; 1987, pp. 25.2.1 - 25.2.4; In English; See also 20060003045; Copyright; Avail.: Other Sources
Vector quantization is a promising encoding technique, especially for low data rate image transmission. Due to the exponential growth of its computational complexity with the block dimension, however, only small block sizes have been used in practical applications. This restricts the coding efficiency and produces some blockiness in the reconstructed images. Our proposal solves these problems by combining a block-overlapping pyramidal transform with multistage VQ. This concept enables VQ of large blocks in a hierarchical manner at small computational cost, while the block-overlapping principle gives rise to smooth image reconstruction. The simulation results showed that picture phone sequences are reconstructed without any annoying artifacts. Author
Coding; Sequencing; Vector Quantization
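The multistage VQ component mentioned in the abstract above can be sketched as residual quantization: each stage quantizes what the previous stages left unexplained. This is a generic illustration under stated assumptions, not the authors' coder; it omits the block-overlapping pyramidal transform entirely, and the function names and codebook values are hypothetical.

```python
from math import dist

def nearest(vector, codebook):
    """Index of the codeword closest to `vector` (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda i: dist(vector, codebook[i]))

def multistage_encode(vector, stages):
    """Quantize `vector` stage by stage, passing the residual to the next stage."""
    residual = list(vector)
    indices = []
    for codebook in stages:
        i = nearest(residual, codebook)
        indices.append(i)
        residual = [r - c for r, c in zip(residual, codebook[i])]
    return indices

def multistage_decode(indices, stages):
    """Reconstruct by summing the selected codeword from every stage."""
    out = [0.0] * len(stages[0][0])
    for i, codebook in zip(indices, stages):
        out = [o + c for o, c in zip(out, codebook[i])]
    return out

# Two small stage codebooks (illustrative values only).
stages = [
    [(0.0, 0.0), (4.0, 4.0)],                            # coarse stage
    [(-1.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.0, -1.0)],  # refinement stage
]

indices = multistage_encode((4.8, 4.1), stages)
print(indices)                             # [1, 1]
print(multistage_decode(indices, stages))  # [5.0, 4.0]
```

Here two stages with 2 and 4 codewords require 6 distance evaluations per vector yet can represent 8 distinct reconstructions, which is the kind of search saving that makes staged quantization attractive for larger blocks.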
20060003102 Speech Technologies Lab., Santa Barbara, CA, USA
Weighted Cepstral Distance Measures in Vector Quantization Based Speech Recognizers
Applebaum, Ted H.; Hanson, Brian A.; Wakita, Hisashi; IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '87); Volume 2; 1987, pp. 27.9.1 - 27.9.4; In English; See also 20060003045; Copyright; Avail.: Other Sources
This paper extends the use of weighted cepstral distance measures to speaker-independent word recognizers based on vector quantization. Recognition results were obtained for two recognition methods: dynamic time warping of vector codes and hidden Markov modeling. The experiments were carried out on a vocabulary of the ten digits and the word 'oh'. Two kinds of spectral analysis were considered: LPC, and a recently proposed, low dimensional, perceptually based representation (PLP). The effects of analysis order and varying degrees of quantization in the spectral representation were also considered. Recognition experiments indicate that the performance of the weighted cepstral distance with vector quantized spectral data is considerably different from that previously reported for unquantized data. Comparison of recognition rates shows wide variations due to interaction of the distance measure with the analysis technique and with vector quantization. The best recognition scores were obtained by the combination of weighted cepstral distance and low order PLP analysis. This combination maintained good recognition rates down to very low (16 or 8 codes) codebook sizes. Author
Cepstral Analysis; Vector Quantization; Words (Language); Emission Spectra
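For readers unfamiliar with the term, a weighted cepstral distance is a squared Euclidean distance in which each cepstral coefficient carries its own weight. The sketch below is a minimal illustration; the abstract does not state which weighting the authors used, so the index-squared weights shown here are an assumption chosen only as an example, and the function names are hypothetical.

```python
def weighted_cepstral_distance(c1, c2, weights):
    """Sum of weights[k] * (c1[k] - c2[k])**2 over the cepstral coefficients."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, c1, c2))

def index_squared_weights(order):
    """Illustrative weighting (an assumption): coefficient k gets weight k**2."""
    return [k * k for k in range(1, order + 1)]

# Two toy cepstral vectors of order 3 (illustrative values only).
c1 = [1.0, 0.5, 0.25]
c2 = [0.5, 0.5, 0.0]

print(weighted_cepstral_distance(c1, c2, [1, 1, 1]))               # 0.3125
print(weighted_cepstral_distance(c1, c2, index_squared_weights(3)))  # 0.8125
```

Setting all weights to 1 recovers the unweighted cepstral distance, so the two printed values show how the weighting changes the contribution of the higher-order coefficient.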
20060003124 American Telephone and Telegraph Co., NJ, USA, Massachusetts Inst. of Tech., Cambridge, MA, USA
Beyond Quasi-Stationarity: Designing Time-Frequency Representations for Speech Signals
Riley, Michael D.; IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '87); Volume 2; 1987, pp. 657-660; In English; See also 20060003045; Copyright; Avail.: Other Sources
This work addresses two related questions. The first is what joint time-frequency energy representations are most appropriate for speech signals, in particular for the analysis of formant structure. Quasi-stationarity is not assumed, since it neglects dynamic regions. A set of desired properties is proposed, and a subclass of the quadratic transforms that best meets these criteria is derived, which consists of two-dimensionally smoothed Wigner distributions with Gaussian kernels. The second question addressed is how to obtain suitable symbolic descriptions of the phonetically relevant features in these time-frequency surfaces. We propose time-frequency ridges in these surfaces, the 2-D analog of spectral peaks, which can be found by examining the derivatives of the time-frequency surface produced above. Author
Frequencies; Speech Recognition
20060003134 Institut National de la Recherche Scientifique, Montreal, Quebec, Canada
Integration of Acoustic Information in a Large Vocabulary Word Recognizer
Gupta, V. N.; Lennig, M.; Mermelstein, P.; IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '87); Volume 2; 1987, pp. 697-700; In English; See also 20060003045; Copyright; Avail.: Other Sources
This paper proposes a new way of using vector quantization for improving recognition performance for a 60,000 word vocabulary speaker-trained isolated word recognizer using a phonemic Markov Model approach to speech recognition. We show that we can effectively increase the codebook size by dividing the feature vector into two vectors of lower dimensionality, and then quantizing and training each vector separately. For a small codebook size, integration of the results of the two parameter vectors provides significant improvement in recognition performance as compared to the quantizing and training of the entire feature set together. Even for a codebook size as small as 64, the results obtained when using the new quantization procedure are quite close to those obtained when using Gaussian distribution of the parameter vectors. Author
Speech Recognition; Words (Language); Acoustic Frequencies
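The split-vector quantization described in the abstract above can be sketched as follows. This is a minimal illustration, assuming Euclidean distance and toy codebooks; the function names, the split point, and the codebook values are hypothetical, and the paper's training and score-integration steps are not shown.

```python
from math import dist

def nearest(vector, codebook):
    """Index of the codeword closest to `vector` (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda i: dist(vector, codebook[i]))

def split_quantize(feature, split_at, codebook_a, codebook_b):
    """Quantize the two halves of `feature` separately; return a pair of labels."""
    part_a, part_b = feature[:split_at], feature[split_at:]
    return nearest(part_a, codebook_a), nearest(part_b, codebook_b)

# Toy codebooks for the two sub-vectors (illustrative values only).
codebook_a = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
codebook_b = [(0.0, 0.0), (2.0, 2.0)]

pair = split_quantize((0.1, 0.9, 2.0, 2.1), 2, codebook_a, codebook_b)
print(pair)  # (1, 1)

# Number of distinct label pairs the two small codebooks can express together.
effective_size = len(codebook_a) * len(codebook_b)
print(effective_size)  # 8
```

The effective codebook size is the product of the two codebook sizes, while storage and search cost grow only with their sum, which is one way to read the abstract's claim that splitting "effectively increases" the codebook size.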
20060003156 SRI International Corp., USA
Lexical Access with Lattice Input
Murveit, Hy; Weintraub, Mitchel; Cohen, Michael; Bernstein, Jared; IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '87); Volume 2; 1987, pp. 20.11.1 - 20.11.4; In English; See also 20060003045; Copyright; Avail.: Other Sources
This paper describes an alternative approach to lexical access in the CMU ANGEL speech recognition system. Using this approach, the asynchronous phonetic hypotheses generated by an acoustic-phonetics module are converted to a directed graph. This graph is compared to a pronunciation dictionary. Performance results for this approach and the original CMU approach are similar. An error analysis indicated promising directions for further work. Author
Graph Theory; Phonetics; Speech Recognition
20060003195 Georgia Inst. of Tech., Atlanta, GA, USA
Hidden Markov Model Speech Recognition Based on Kalman Filtering
Clements, Mark A.; Lim, Sungjae; IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '87); Volume 2; 1987, pp. 1147-1150; In English; See also 20060003045; Copyright; Avail.: Other Sources
Traditional hidden Markov model speech recognition is generally based on a set of parameters (often LPC related) which are extracted at discrete intervals. Such an analysis necessitates use of a discrete-trial hidden Markov model in which the underlying states can only change at intervals related to the frame rate of the analysis. The exact locations of the analysis windows used can influence the front-end outputs and, as a result, can cause confusion between words differing in short-duration consonants. In the current study, an alternate method which does not require segmentation is proposed, and a simple version is implemented. The discrete-trial hidden Markov model algorithms are adapted to this framework, leading to significantly improved recognition performance. Author
Kalman Filters; Speech Recognition; Words (Language)
20060003198 Speech Technologies Lab., Santa Barbara, CA, USA
An Efficient Speaker-Independent Automatic Speech Recognition by Simulation of Some Properties of Human Auditory Perception
Hermansky, Hynek; IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '87); Volume 2; 1987, pp. 1159-1162; In English; See also 20060003045; Copyright; Avail.: Other Sources
An auditory model of speech perception, the Perceptually based linear predictive analysis of Root power sum metric (PLP-RPS), is applied as the front-end of an automatic speech recognizer (ASR). The PLP-RPS front-end is compared with the standard linear predictive-cepstral metric (LP-CEP) front-end, and with LP-RPS and PLP-CEP front-ends. The two-spectral-peak models are the most efficient in modeling linguistic information in speech. Consequently, in speaker-independent ASR, high analysis order front-ends are less effective than low-order front-ends. Synthetic speech is used for front-end evaluation. Some of the perceptual inconsistencies of standard LP front-ends are alleviated in PLP front-ends. The PLP-RPS front-end is most sensitive to the harmonic structure of the speech spectrum. Perceptual experiments indicate similar tendencies in human auditory perception. Author
Auditory Perception; Cepstral Analysis; Simulation; Speech Recognition
Source: NASA