Size: | Format:
---|---
2.36 MB | Adobe PDF
Authors
Advisor(s)
Abstract(s)
Background: Spirometry is the most accessible and widely used respiratory function test today. However, it cannot measure static lung volumes, which are essential for diagnosing hyperinflation and lung restriction. Machine learning (ML) techniques have gained prominence in medicine, largely because of their ability to predict or classify from a large number of already known examples.
Aim: To apply supervised regression models to predict non-mobilizable (static) lung volumes and supervised classification models to automatically classify their alterations, using biological, anthropometric and spirometric parameters, and to evaluate the models' performance.
Methodology: Retrospective study of 8140 anonymized respiratory function test records containing biological, anthropometric, spirometric and whole-body plethysmography data. Eight models were used to predict the absolute value or z-score, or to classify the ventilatory alteration present: linear regression (LR), logistic regression (LogR), naive Bayes classifier (NB), k-nearest neighbors (kNN), support vector machines (SVM), neural networks (NN), random forests (RF) and extreme gradient boosting (XGBoost). After selection of the most informative variables, the models were evaluated with 20-fold cross-validation resampling, and their performance was assessed according to the type of problem (regression or classification).
Results: In the study sample, 66% of individuals had static lung volumes without alterations, and air trapping was the most prevalent alteration (18.5%). For classifying the presence of air trapping and for classifying the plethysmographic alteration, the best-performing model was XGBoost, with an area under the receiver operating characteristic curve (AUC) of 0.881 and 0.874, respectively. For regression, the best-performing algorithm was LR, which obtained an R² of 0.539 for residual volume (RV), 0.856 for total lung capacity (TLC), 0.752 for RV/TLC and 0.442 for the RV/TLC z-score. Model stacking did not outperform the individual models.
Conclusion: The regression models performed insufficiently for RV and for the RV/TLC z-score. However, regression for TLC and for RV/TLC, and classification of air trapping and of the plethysmographic alteration, gave qualitatively good results, demonstrating the reliability of combining spirometry with ML to predict and classify non-mobilizable lung volumes.
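As a rough illustration of the evaluation protocol the abstract describes (20-fold cross-validation, R² for regression, AUC for classification), the following is a minimal Python sketch. It is not the thesis code: the data are synthetic, the model is a single-predictor least-squares line, and all function names are hypothetical. Pooling the out-of-fold predictions before computing R² is also an assumption, since the abstract does not say how fold results were aggregated.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with a single predictor."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and split into k disjoint, nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_val_r2(xs, ys, k=20, seed=0):
    """R^2 over pooled out-of-fold predictions (assumed aggregation)."""
    preds = [0.0] * len(xs)
    for fold in k_fold_indices(len(xs), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            # Each sample is predicted by a model that never saw it.
            preds[i] = a + b * xs[i]
    return r_squared(ys, preds)


def auc(labels, scores):
    """AUC as P(positive scores above negative); ties count as 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Synthetic data only; nothing here comes from the thesis data set.
rng = random.Random(42)
xs = [rng.uniform(0.0, 10.0) for _ in range(400)]
ys = [2.0 + 0.5 * x + rng.gauss(0.0, 1.0) for x in xs]

cv_r2 = cross_val_r2(xs, ys, k=20)
toy_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])
print(f"20-fold R^2 = {cv_r2:.3f}, toy AUC = {toy_auc:.2f}")
```

In practice a study of this kind would more likely rely on an established ML library than on hand-written routines; the sketch only makes explicit what the two reported metrics measure and why every prediction in a k-fold scheme comes from a model that was not trained on that sample.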
Description
Final Master's project submitted for the degree of Master in Biomedical Engineering
Keywords
Respiratory function tests; Forced spirometry; Whole body plethysmography; Machine learning; Static lung volumes; Hyperinflation; Restriction
Citation
PEREIRA, Marco António Pacheco - Previsão de volumes pulmonares não mobilizáveis com base em parâmetros espirométricos [Prediction of non-mobilizable lung volumes based on spirometric parameters]. Lisboa: Instituto Superior de Engenharia de Lisboa, 2022. Master's dissertation.