Search Results
- Voice Pathologies Identification: Speech signals, features and classifiers evaluation
  Publication . Cordeiro, Hugo; Fonseca, José; Guimarães, Isabel; Meneses, Carlos
  Voice pathology identification using speech processing methods can be used as a preliminary diagnosis. This study implements a set of identification systems to screen voice pathologies using voice signal features from the sustained vowel /a/ and from continuous speech. The two signal tasks are evaluated using three acoustic features applied to four classifiers. Three main classes are identified: physiological disorders, neuromuscular disorders, and healthy subjects. The main objective of this work is to evaluate which voice signal is more reliable for voice pathology diagnosis, which acoustic feature carries more pathology information, and which classifier is best suited to this task. The best overall system accuracy is 77.9%, obtained with the Mel-Line Spectrum Frequencies (MLSF) feature extracted from continuous speech and applied to a Gaussian Mixture Models (GMM) classifier.
- Caffeine has a dual influence on NMDA receptor-mediated glutamatergic transmission at the hippocampus
  Publication . Silva Martins, Robertta; D.M., Rombo; Ribeiro, Joana; Meneses, Carlos; Peralva Borges Martins, Vladimir Pedro; Ribeiro, Joaquim A.; Vaz, Sandra H.; Cussa Kubrusly, Regina Celia; Sebastião, Ana M
  Caffeine, a stimulant widely consumed around the world, is a non-selective adenosine receptor antagonist, and therefore caffeine actions at synapses usually, but not always, mirror those of adenosine. Importantly, different adenosine receptors with opposing regulatory actions co-exist at synapses. Through both inhibitory and excitatory high-affinity receptors (A(1)R and A(2)R, respectively), adenosine affects NMDA receptor (NMDAR) function at the hippocampus, but surprisingly, there is a lack of knowledge on the effects of caffeine upon this ionotropic glutamatergic receptor, which is deeply involved in both positive (plasticity) and negative (excitotoxicity) synaptic actions. We thus aimed to elucidate the effects of caffeine upon NMDAR-mediated excitatory post-synaptic currents (NMDAR-EPSCs), and its implications for neuronal Ca(2+) homeostasis. We found that caffeine (30-200 µM) facilitates NMDAR-EPSCs on pyramidal CA1 neurons from Balbc/ByJ male mice, an action mimicked, as well as occluded, by 1,3-dipropyl-cyclopentylxanthine (DPCPX, 50 nM), and thus likely mediated by blockade of inhibitory A(1)Rs. This action of caffeine cannot be attributed to a pre-synaptic facilitation of transmission because caffeine even increased paired-pulse facilitation of NMDA-EPSCs, indicative of an inhibition of neurotransmitter release. Adenosine A(2A)Rs are involved in this likely pre-synaptic action since the effect of caffeine was mimicked by the A(2A)R antagonist SCH58261 (50 nM). Furthermore, caffeine increased the frequency of Ca(2+) transients in neuronal cell culture, an action mimicked by the A(1)R antagonist DPCPX and prevented by NMDAR blockade with AP5 (50 µM). Altogether, these results show for the first time an influence of caffeine on NMDA receptor activity at the hippocampus, with an impact on neuronal Ca(2+) homeostasis.
- Adaptive predictive coding speech coding techniques applied to electrocardiogram signals
  Publication . Silva, Daniel; Martins, Guilherme; Lourenço, André; Meneses, Carlos
  This paper describes a lossy ECG signal coder with an adaptive predictive coding scheme initially proposed for speech coders. The predictors include linear predictive coding, which takes advantage of the correlation between consecutive samples, and a long-term predictor, which takes advantage of the signal's quasi-periodicity. The prediction residue, which has a smaller dynamic range and can therefore be encoded with fewer bits than the original, is transmitted sample by sample. The prediction coefficients and the amplitude of the residue are transmitted once per heartbeat, using a negligible number of bits compared to the total bit rate. The long-term predictor is shown to perform reliably when the heart rate does not change rapidly. Linear predictive coding, by contrast, is more reliable and presents a better prediction gain. The best coder developed uses double prediction and, with a 45% compression ratio, achieves a prediction gain of 24.8 dB.
- Parâmetros espectrais de vozes saudáveis e patológicas: Comparação de resultados entre duas base de dados (Spectral parameters of healthy and pathological voices: comparison of results between two databases)
  Publication . Cordeiro, Hugo; Meneses, Carlos
  This paper presents a comparative study of three spectral parameters for discriminating between healthy and pathological voices. The evaluated parameters are the first spectral peak; the Relative Power of the Periodic Component, which corresponds to a measure of the signal-to-noise ratio; and the Low Band Spectral Tilt, the slope between two low-frequency bands of the speech signal. The Low Band Spectral Tilt is proposed as an optimization of the first spectral peak, to resolve the classification errors caused by the degradation of vocal quality as the disease progresses. The three parameters are evaluated on two databases. The Low Band Spectral Tilt achieves the best results, with 100% accuracy on the USP database and 83.5% accuracy on the MEEI database.
- Voice spectrum energy band and tilt analysis for Bulbar ALS screening
  Publication . Cordeiro, Hugo; Meneses, Carlos
  This paper presents a comparison between the spectral analysis of voices from patients diagnosed with bulbar amyotrophic lateral sclerosis (ALS) and from healthy speakers. The main objective is to understand how this disease affects the voice and, consequently, the spectrum of the patients' voices. The spectrum is analysed in three energy bands, where energy peaks are estimated and the tilt between these bands is computed. The results indicate that subjects diagnosed with bulbar ALS present significant differences from healthy speakers in the mean values and variances of the energy peaks in these bands and, consequently, in the tilt between bands. The method presented achieves an 85% classification rate without resorting to highly complex classifiers.
- LAFA – Laboratório de Fala (LAFA – Speech Laboratory)
  Publication . Meneses, Carlos
  LAFA (LAboratório de FAla, "Speech Laboratory") is a graphical application for speech signal analysis, developed on the MATLAB platform for teaching purposes within a "Digital Speech Processing" course. It is meant to assist students who are new to the area and to support the presentation of concepts in the classroom. LAFA can analyse several characteristics of the speech signal: voicing detection and fundamental frequency estimation; linear prediction; estimation of LSF coefficients; formant estimation; and cepstral analysis, including mel-cepstra and linear prediction cepstra. It can also, albeit in a rudimentary way, recognize vowels and synthesize sustained sounds. The input signal can be a recording in wav format or be acquired from the microphone.
- Deteção de ELA através de análise espectral: novo parâmetro e otimizações (ALS detection through spectral analysis: new parameter and optimizations)
  Publication . Cordeiro, Hugo; Meneses, Carlos
  In previous work, analysis of the mean and standard deviation of parameters based on the spectral slope of speech signals demonstrated the ability to discriminate between speakers diagnosed with bulbar Amyotrophic Lateral Sclerosis (ALS) and healthy speakers. This work continues the investigation of the speech signal spectrum in this context and presents a solution that merges the previously presented parameters and implements a method for detecting outliers by analyzing the dispersion of the parameters. Although the new parameter presents discrimination rates in line with the previous ones, it can be an asset in the diagnosis of other pathologies associated with the voice. The outlier detection method, in turn, shows potential for detecting false positives, which would improve the results obtained. All the parameters presented are easy to analyze and interpret visually, which is expected to facilitate the analysis of spectra in non-invasive screening situations.
- Spectral features of healthy and pathological voices: results comparison between two databases
  Publication . Cordeiro, Hugo; Meneses, Carlos
  This paper presents a comparative study of three spectral parameters for discriminating between healthy and pathological voices. The evaluated parameters are the first spectral peak; the Relative Power of the Periodic Component, which corresponds to a measure of the signal-to-noise ratio; and the Low Band Spectral Tilt. The Low Band Spectral Tilt is proposed as an optimization of the first spectral peak, to resolve the classification errors caused by the degradation of vocal quality as the disease progresses. The three parameters are evaluated on two databases. The Low Band Spectral Tilt achieves the best results, with 100% accuracy on the USP database and 83.5% accuracy on the MEEI database.
- Low band continuous speech system for voice pathologies identification
  Publication . Cordeiro, Hugo; Meneses, Carlos
  This paper describes the impact of signal bandwidth reduction on the identification of voice pathologies. The implemented systems identify three classes: healthy subjects, subjects diagnosed with physiological larynx pathologies, and subjects diagnosed with neuromuscular larynx pathologies. Continuous speech signals are down-sampled to 4 kHz and the extracted spectral parameters are applied to a GMM classifier. No significant change in accuracy occurs, from which it can be concluded that the low frequencies contain sufficient information to classify the pathologies. A second objective is to test the effects of suppressing voice activity detection and of increasing the analysis window length. In both cases the accuracy increases. In conclusion, a pathological voice identification system is proposed that is based on signals sampled at 4 kHz, without voice activity detection and with an analysis window length of 40 ms, reaching 81.8% accuracy. The proposed system also has the advantage of reducing storage memory and processing time.
- LSF na verificação de orador (LSF in speaker verification)
  Publication . Cordeiro, Hugo; Meneses, Carlos
  This paper describes a speaker verification system. It aims to draw attention to possible alternatives to the methods traditionally used in speaker recognition. Speakers are characterized by LSF coefficients, and the results are compared with those of the traditional MFCC coefficients. The results show similar performance between the LSF coefficients proposed here and the MFCCs. The classification method implemented is the SVM, and the corpus used is the "2002 NIST Speaker Recognition Evaluation Corpus".
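
Several entries above rely on LSF (line spectral frequency) coefficients derived from linear prediction. The abstracts do not say how the authors compute them, so the following is only a minimal sketch of the textbook conversion, assuming an autocorrelation-method LPC analysis. The function names `lpc_autocorr` and `lpc_to_lsf`, the synthetic frame, and the prediction order are illustrative assumptions, not taken from the listed publications.

```python
import numpy as np


def lpc_autocorr(x, order):
    """Return a[0..order] (with a[0] = 1) of the prediction-error filter
    A(z) = 1 + a1 z^-1 + ... + ap z^-p, using the autocorrelation method
    and the Levinson-Durbin recursion."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    r = np.array([np.dot(x[:n - k], x[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                      # reflection coefficient
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k                  # updated prediction error energy
    return a


def lpc_to_lsf(a):
    """Convert LPC coefficients to LSFs in radians, sorted ascending.

    Builds the symmetric polynomial P(z) and antisymmetric polynomial Q(z)
    from A(z) and keeps the root angles strictly inside (0, pi)."""
    a_ext = np.concatenate([np.asarray(a, dtype=float), [0.0]])
    p_poly = a_ext + a_ext[::-1]
    q_poly = a_ext - a_ext[::-1]
    roots = np.concatenate([np.roots(p_poly), np.roots(q_poly)])
    angles = np.angle(roots)
    eps = 1e-6  # drops the trivial roots at z = 1 and z = -1
    return np.sort(angles[(angles > eps) & (angles < np.pi - eps)])


if __name__ == "__main__":
    # Synthetic frame standing in for a windowed speech segment (assumption).
    rng = np.random.default_rng(0)
    t = np.arange(400)
    frame = np.sin(0.3 * t) + 0.5 * np.sin(1.1 * t) + 0.1 * rng.standard_normal(400)
    coeffs = lpc_autocorr(frame * np.hamming(len(frame)), order=10)
    print(lpc_to_lsf(coeffs))
```

In a verification or pathology-screening pipeline, one such LSF vector per analysis frame would typically be passed to the classifier (for example an SVM or a GMM); the frame length and prediction order here are placeholders.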