Search Results
- Voice Pathologies Identification: Speech signals, features and classifiers evaluation. Publication. Cordeiro, Hugo; Fonseca, José; Guimarães, Isabel; Meneses, Carlos. Voice pathology identification using speech processing methods can be used as a preliminary diagnosis. This study implements a set of identification systems to screen voice pathologies using voice signal features from the sustained vowel /a/ and continuous speech. The two speech tasks are evaluated using three acoustic features applied to four classifiers. Three main classes are identified: physiological disorders, neuromuscular disorders, and healthy subjects. The main objective of this work is to evaluate which voice signal is more reliable for voice pathology diagnosis, which acoustic feature carries more pathology information, and which classifier is best suited to this task. The best overall system accuracy is 77.9%, obtained with the Mel-Line Spectrum Frequencies (MLSF) feature extracted from continuous speech and applied to a Gaussian Mixture Model (GMM) classifier.
- Parâmetros espectrais de vozes saudáveis e patológicas: Comparação de resultados entre duas base de dados [Spectral parameters of healthy and pathological voices: comparison of results between two databases]. Publication. Cordeiro, Hugo; Meneses, Carlos. This paper presents a comparative study of three spectral parameters for discriminating between healthy and pathological voices. The evaluated parameters are the first spectral peak, the Relative Power of the Periodic Component (a measure of the signal-to-noise ratio), and the Low Band Spectral Tilt, the slope between two low-frequency bands of the speech signal. The Low Band Spectral Tilt is proposed as an optimization of the first spectral peak, to resolve classification errors caused by the degradation of vocal quality as the disease progresses. The three parameters are evaluated on two databases. The Low Band Spectral Tilt achieves the best results, with 100% accuracy on the USP database and 83.5% accuracy on the MEEI database.
- Voice spectrum energy band and tilt analysis for Bulbar ALS screening. Publication. Cordeiro, Hugo; Meneses, Carlos. This paper presents a comparison between the spectral analysis of the voices of patients diagnosed with bulbar amyotrophic lateral sclerosis (ALS) and those of healthy speakers. The main objective is to understand how this disease affects the voice and, consequently, the spectrum of the patients' voices. The spectrum is analysed in three energy bands, where energy peaks are estimated and the tilt between these bands is computed. The results show that subjects diagnosed with bulbar ALS present significant differences from healthy speakers in the mean values and variances of the energy peaks in these bands and, consequently, in the tilt between bands. The method achieves an 85% classification rate without resorting to highly complex classifiers.
- Deteção de ELA através de análise espectral: novo parâmetro e otimizações [ALS detection through spectral analysis: new parameter and optimizations]. Publication. Cordeiro, Hugo; Meneses, Carlos. In previous work, analysis of the mean and standard deviation of parameters based on the spectral slope of speech signals demonstrated the ability to discriminate between speakers diagnosed with bulbar Amyotrophic Lateral Sclerosis (ALS) and healthy speakers. This work continues the investigation of the speech signal spectrum in this context and presents a solution that merges the previously presented parameters and implements a method for detecting outliers by analyzing the dispersion of the parameters. Although the new parameter yields discrimination rates in line with the previous ones, it can be an asset in screening other voice-related pathologies. The outlier detection method, in turn, shows potential for detecting false positives, which improves the results obtained. All the parameters presented are easy to analyze and interpret visually, which is expected to encourage the use of spectral analysis in non-invasive screening situations.
- Spectral features of healthy and pathological voices: results comparison between two databases. Publication. Cordeiro, Hugo; Meneses, Carlos. This paper presents a comparative study of three spectral parameters for discriminating between healthy and pathological voices. The evaluated parameters are the first spectral peak, the Relative Power of the Periodic Component (a measure of the signal-to-noise ratio), and the Low Band Spectral Tilt. The Low Band Spectral Tilt is proposed as an optimization of the first spectral peak, to resolve classification errors caused by the degradation of vocal quality as the disease progresses. The three parameters are evaluated on two databases. The Low Band Spectral Tilt achieves the best results, with 100% accuracy on the USP database and 83.5% accuracy on the MEEI database.
- Low band continuous speech system for voice pathologies identification. Publication. Cordeiro, Hugo; Meneses, Carlos. This paper describes the impact of signal bandwidth reduction on the identification of voice pathologies. The implemented systems identify three classes: healthy subjects, subjects diagnosed with physiological larynx pathologies, and subjects diagnosed with neuromuscular larynx pathologies. Continuous speech signals are down-sampled to 4 kHz and the extracted spectral parameters are applied to a GMM classifier. No significant change in accuracy occurs, so it is possible to conclude that the low frequencies contain sufficient information to classify the pathologies. A second objective is to test the effects of suppressing voice activity detection and of increasing the analysis window length. In both cases the accuracy increases. In conclusion, a pathological voice identification system based on signals sampled at 4 kHz, without voice activity detection and with an analysis window length of 40 ms, is proposed, reaching 81.8% accuracy. The proposed system also has the advantage of reducing storage memory and processing time.
- LSF na verificação de orador [LSF in speaker verification]. Publication. Cordeiro, Hugo; Meneses, Carlos. This paper describes a speaker verification system. It aims to draw attention to possible alternatives to the methods traditionally used in speaker recognition. Speakers are characterized by LSF coefficients, and the results are compared with those of the traditional MFCC coefficients. The results show similar performance between the proposed LSF coefficients and the MFCC coefficients. The classification method implemented is SVM, and the corpus used is the “2002 NIST Speaker Recognition Evaluation Corpus”.
- Hierarchical classification and system combination for automatically identifying physiological and neuromuscular laryngeal pathologies. Publication. Cordeiro, Hugo; Fonseca, Jose; Guimarães, Isabel; Meneses, Carlos. Objectives. Speech signal processing techniques have provided several contributions to pathologic voice identification, in which healthy and unhealthy voice samples are evaluated. A less common approach is to identify laryngeal pathologies, for which the use of a noninvasive method for pathologic voice identification is an important step forward for preliminary diagnosis. In this study, a hierarchical classifier and a combination of systems are used to improve the accuracy of a three-class identification system (healthy, physiological larynx pathologies, and neuromuscular larynx pathologies). Method. Three main subject classes were considered: subjects with physiological larynx pathologies (vocal fold nodules and edemas: 59 samples), subjects with neuromuscular larynx pathologies (unilateral vocal fold paralysis: 59 samples), and healthy subjects (36 samples). The variables used in this study were a speech task (sustained vowel /a/ or continuous reading speech), features with or without perceptual information, and features with or without direct information about formants evaluated using single classifiers. A hierarchical classification system was designed based on this information. Results. The resulting system combines an analysis of continuous speech by way of the commonly used sustained vowel /a/ to obtain spectral and perceptual speech features. It achieved an accuracy of 84.4%, which represents an improvement of approximately 9% compared with the stand-alone approach. For pathologic voice identification, the accuracy obtained was 98.7%, and the identification accuracy for the two pathology classes was 81.3%. Conclusions. Hierarchical classification and system combination create significant benefits and introduce a modular approach to the classification of larynx pathologies.
- Low band spectral tilt analysis for pathological voice discrimination. Publication. Cordeiro, Hugo; Meneses, Carlos. This paper presents a new method for discriminating between subjects with healthy voices and subjects with diseases of the vocal folds. The method uses speech signals and spectral analysis of the sustained vowel /a/. The slope between a first band of the signal, defined over the first two harmonics, and a second band, defined in the region of the first formant of /a/, contains information that makes it possible to correctly classify the pathological voice database of the University of São Paulo. The method can be applied in the direct analysis of spectra or implemented in high-level classifiers as a complement to other parameters.
- Espectro de voz de pacientes diagnosticados com ELA bulbar: Análise do declive em bandas de energia de baixa frequência [Voice spectrum of patients diagnosed with bulbar ALS: analysis of the slope in low-frequency energy bands]. Publication. Cordeiro, Hugo; Meneses, Carlos. This study presents a comparison between the spectral analysis of the voices of patients diagnosed with bulbar amyotrophic lateral sclerosis (ALS) and those of healthy speakers. The main objective is to understand how this disease affects the voice and, consequently, the spectrum of the patients' voices. The spectra are analyzed in two energy bands, in which energy peaks are characterized, along with the slope between these energy bands. The results show that subjects diagnosed with bulbar ALS present significant differences in the mean values and variances of the energy peaks compared to healthy speakers.
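
Several of the abstracts listed here rely on a slope (tilt) between energy peaks in two low-frequency bands of the voice spectrum. The following is a minimal sketch of how such a tilt could be computed, not the authors' implementation: the function names `band_peak_db` and `low_band_tilt`, the Hann window, and the band limits (80 to 400 Hz and 500 to 1000 Hz) are all assumptions chosen for illustration, since the abstracts do not state these details.

```python
import numpy as np


def band_peak_db(freqs, spectrum_db, lo_hz, hi_hz):
    """Return (frequency, level in dB) of the strongest bin inside [lo_hz, hi_hz]."""
    mask = (freqs >= lo_hz) & (freqs <= hi_hz)
    idx = np.argmax(spectrum_db[mask])
    return freqs[mask][idx], spectrum_db[mask][idx]


def low_band_tilt(signal, fs, band1=(80.0, 400.0), band2=(500.0, 1000.0)):
    """Slope in dB/Hz between the energy peaks of two low-frequency bands.

    band1 and band2 are hypothetical limits: band1 is meant to cover the
    first harmonics, band2 the region of the first formant of /a/.
    """
    signal = np.asarray(signal, dtype=float)
    windowed = signal * np.hanning(len(signal))  # assumed window choice
    magnitude = np.abs(np.fft.rfft(windowed))
    spectrum_db = 20.0 * np.log10(magnitude + 1e-12)  # avoid log(0)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)

    f1, p1 = band_peak_db(freqs, spectrum_db, *band1)
    f2, p2 = band_peak_db(freqs, spectrum_db, *band2)
    return (p2 - p1) / (f2 - f1)


if __name__ == "__main__":
    # Synthetic stand-in for a sustained vowel: a 200 Hz component plus a
    # 700 Hz component that is 20 dB weaker.
    fs = 8000
    t = np.arange(fs) / fs
    demo = np.sin(2 * np.pi * 200 * t) + 0.1 * np.sin(2 * np.pi * 700 * t)
    print(f"tilt: {low_band_tilt(demo, fs):.4f} dB/Hz")
```

For the synthetic signal in the demo, the two peaks differ by 20 dB over 500 Hz, so the tilt should come out close to -0.04 dB/Hz. On real recordings the band limits would need to follow the speaker's fundamental frequency and formant positions.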