Search Results
- Voice Pathologies Identification: Speech signals, features and classifiers evaluation. Publication. Cordeiro, Hugo; Fonseca, José; Guimarães, Isabel; Meneses, Carlos. Voice pathology identification using speech processing methods can be used as a preliminary diagnosis. This study implements a set of identification systems to screen voice pathologies using voice signal features from the sustained vowel /a/ and from continuous speech. The two speech tasks are evaluated using three acoustic features applied to four classifiers. Three main classes are identified: physiological disorders, neuromuscular disorders, and healthy subjects. The main objective of this work is to evaluate which voice signal is more reliable for voice pathology diagnosis, which acoustic feature carries more pathology information, and which classifier is best suited to this task. The best overall system accuracy is 77.9%, obtained with the Mel-Line Spectrum Frequencies (MLSF) feature extracted from continuous speech and applied to a Gaussian Mixture Models (GMM) classifier.
- Parâmetros espectrais de vozes saudáveis e patológicas: Comparação de resultados entre duas base de dados [Spectral parameters of healthy and pathological voices: comparison of results between two databases]. Publication. Cordeiro, Hugo; Meneses, Carlos. This paper presents a comparative study of three spectral parameters for discriminating between healthy and pathological voices. The evaluated parameters are the first spectral peak, the Relative Power of the Periodic Component (a measure of the signal-to-noise ratio), and the Low Band Spectral Tilt (the slope between two low-frequency bands of the speech signal). The Low Band Spectral Tilt is proposed as an optimization of the first spectral peak, to resolve the classification errors caused by the degradation of vocal quality as the disease progresses. The three parameters are evaluated on two databases. The Low Band Spectral Tilt achieves the best results, with 100% accuracy on the USP database and 83.5% accuracy on the MEEI database.
- Spectral features of healthy and pathological voices: results comparison between two databases. Publication. Cordeiro, Hugo; Meneses, Carlos. This paper presents a comparative study of three spectral parameters for discriminating between healthy and pathological voices. The evaluated parameters are the first spectral peak, the Relative Power of the Periodic Component (a measure of the signal-to-noise ratio), and the Low Band Spectral Tilt. The Low Band Spectral Tilt is proposed as an optimization of the first spectral peak, to resolve the classification errors caused by the degradation of vocal quality as the disease progresses. The three parameters are evaluated on two databases. The Low Band Spectral Tilt achieves the best results, with 100% accuracy on the USP database and 83.5% accuracy on the MEEI database.
- Low band continuous speech system for voice pathologies identification. Publication. Cordeiro, Hugo; Meneses, Carlos. This paper describes the impact of signal bandwidth reduction on the identification of voice pathologies. The implemented systems identify three classes: healthy subjects, subjects diagnosed with physiological larynx pathologies, and subjects diagnosed with neuromuscular larynx pathologies. Continuous speech signals are down-sampled to 4 kHz and the extracted spectral parameters are applied to a GMM classifier. No significant change in accuracy occurs, so it can be concluded that the low frequencies contain sufficient information to classify the pathologies. A second objective is to test the effects of suppressing voice activity detection and of increasing the analysis window length. In both cases the accuracy increases. In conclusion, a pathological voice identification system based on signals sampled at 4 kHz, without voice activity detection and with an analysis window length of 40 ms, is proposed, reaching 81.8% accuracy. The proposed system also has the advantage of reducing storage memory and processing time.
- Hierarchical classification and system combination for automatically identifying physiological and neuromuscular laryngeal pathologies. Publication. Cordeiro, Hugo; Fonseca, Jose; Guimarães, Isabel; Meneses, Carlos. Objectives. Speech signal processing techniques have provided several contributions to pathologic voice identification, in which healthy and unhealthy voice samples are evaluated. A less common approach is to identify laryngeal pathologies, for which the use of a noninvasive method for pathologic voice identification is an important step forward for preliminary diagnosis. In this study, a hierarchical classifier and a combination of systems are used to improve the accuracy of a three-class identification system (healthy, physiological larynx pathologies, and neuromuscular larynx pathologies). Method. Three main subject classes were considered: subjects with physiological larynx pathologies (vocal fold nodules and edemas: 59 samples), subjects with neuromuscular larynx pathologies (unilateral vocal fold paralysis: 59 samples), and healthy subjects (36 samples). The variables used in this study were a speech task (sustained vowel /a/ or continuous reading speech), features with or without perceptual information, and features with or without direct information about formants evaluated using single classifiers. A hierarchical classification system was designed based on this information. Results. The resulting system combines an analysis of continuous speech by way of the commonly used sustained vowel /a/ to obtain spectral and perceptual speech features. It achieved an accuracy of 84.4%, which represents an improvement of approximately 9% compared with the stand-alone approach. For pathologic voice identification, the accuracy obtained was 98.7%, and the identification accuracy for the two pathology classes was 81.3%. Conclusions. Hierarchical classification and system combination create significant benefits and introduce a modular approach to the classification of larynx pathologies.
- Low band spectral tilt analysis for pathological voice discrimination. Publication. Cordeiro, Hugo; Meneses, Carlos. This paper presents a new method for discriminating between subjects with healthy voices and subjects with diseases of the vocal folds. The method applies spectral analysis to speech signals of the sustained vowel /a/. The slope between a first band, defined over the first two harmonics, and a second band, defined in the region of the first formant of /a/, contains enough information to correctly classify the pathological voice database of the University of São Paulo. The method can be applied in the direct analysis of spectra or used in high-level classifiers as a complement to other parameters.
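
The low band spectral tilt described in the last entry can be illustrated with a rough sketch. This is not the authors' implementation: the band edges (0.5 to 2.5 times the fundamental for the harmonic band, 600 to 1000 Hz for the /a/ first-formant region), the Hann window, the function names `band_level_db` and `low_band_tilt`, and the synthetic test signals are all assumptions chosen for illustration, and the abstracts do not state how the slope is normalised.

```python
import numpy as np

def band_level_db(power, freqs, lo, hi):
    """Mean power, in dB, of the spectrum bins with lo <= f < hi."""
    mask = (freqs >= lo) & (freqs < hi)
    return 10.0 * np.log10(np.mean(power[mask]) + 1e-12)

def low_band_tilt(signal, fs, f0, formant_band=(600.0, 1000.0)):
    """Slope (dB per Hz) between a harmonic band and a first-formant band.

    All band limits are illustrative assumptions, not values from the papers.
    """
    windowed = signal * np.hanning(len(signal))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)

    # Assumed band covering the first two harmonics (f0 and 2*f0).
    harmonic_band = (0.5 * f0, 2.5 * f0)

    low = band_level_db(power, freqs, *harmonic_band)
    high = band_level_db(power, freqs, *formant_band)

    # Level difference divided by the distance between the band centres.
    return (high - low) / (np.mean(formant_band) - np.mean(harmonic_band))

# Two synthetic one-second signals with the same harmonics but different roll-off.
fs = 16000
f0 = 120.0
t = np.arange(fs) / fs
steep = sum(np.sin(2 * np.pi * k * f0 * t) / k**2 for k in range(1, 11))
flat = sum(np.sin(2 * np.pi * k * f0 * t) for k in range(1, 11))

tilt_steep = low_band_tilt(steep, fs, f0)
tilt_flat = low_band_tilt(flat, fs, f0)
print(f"steep roll-off: {tilt_steep:.4f} dB/Hz, flat spectrum: {tilt_flat:.4f} dB/Hz")
```

The signal whose harmonics decay quickly yields a lower (more negative) tilt than the one with equal-amplitude harmonics. On real recordings the fundamental would first have to be estimated, and any decision threshold would have to be learned from labelled data.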