| Name | Description | Size | Format |
|---|---|---|---|
| | | 3.03 MB | Adobe PDF |
Authors
Abstract(s)
Feature descriptors are quantitative parameters extracted from medical images that have shown promise in the development of machine learning (ML) predictive models to support clinical decision-making. However, the reproducibility of descriptors, the high dimensionality of the data, the selection of robust features, and the generalizability of models remain significant methodological challenges. This study evaluated the impact of different radiomic feature selection methodologies on the construction of ML models for the classification of prostate lesions. The PROSTATEx dataset, made available through The Cancer Imaging Archive, was used; it consists of retrospective multiparametric magnetic resonance imaging (mpMRI) examinations in the T2-weighted sequence, covering 204 subjects and 299 segmented lesions. First-order, texture, and shape descriptors were extracted from both the original image and filter-derived images. Two datasets were considered: original and randomized (obtained by randomizing pixel intensities). The statistical analyses comprised Spearman correlation and the intraclass correlation coefficient (ICC). The results of these two methods were combined through intersection and union operations, and a univariate analysis was applied to identify the descriptors most significantly associated with the target variable. Subsequently, four ML models were developed, each based on one of the feature selection methodologies, and compared using performance metrics. The results showed that Spearman correlation identified 1009 non-redundant descriptors, with 386 eliminated, whereas ICC retained 789 informative descriptors, with 620 eliminated. The union of the two sets contained 1154 descriptors, while the intersection contained 650. In the univariate analysis, no descriptor reached statistical significance.
This study demonstrates that the application of informative feature selection methods can contribute to improving the performance of ML models in the context of prostate lesion classification.
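The selection steps summarized in the abstract (a Spearman-based redundancy filter, then union and intersection with an ICC-based set) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the thesis: the feature names, the toy values, the 0.9 correlation threshold, the greedy keep-first rule, and the hard-coded `icc_set`, which stands in for whatever an ICC analysis would retain.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation, computed as Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def drop_redundant(features, threshold=0.9):
    """Greedily keep a feature only if |rho| with every kept feature is below threshold.

    The threshold and the keep-first rule are assumptions of this sketch.
    """
    kept = []
    for name, values in features.items():
        if all(abs(spearman(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return set(kept)


# Hypothetical toy data: one list of values per descriptor, one value per lesion.
features = {
    "firstorder_mean": [1.0, 2.0, 3.0, 4.0, 5.0],
    "firstorder_median": [1.1, 2.3, 2.9, 4.2, 5.5],  # same ordering as the mean
    "glcm_contrast": [3.0, 1.0, 4.0, 1.5, 5.0],
    "shape_volume": [2.0, 5.0, 1.0, 4.0, 3.0],
}

spearman_set = drop_redundant(features)
icc_set = {"firstorder_mean", "shape_volume"}  # hypothetical ICC-selected set

union_set = spearman_set | icc_set
intersection_set = spearman_set & icc_set

print(sorted(spearman_set))
print(sorted(union_set))
print(sorted(intersection_set))
```

In this toy example `firstorder_median` is dropped because it is perfectly rank-correlated with `firstorder_mean`. The union is the more permissive candidate set and the intersection the stricter one, which mirrors the two combined methodologies compared in the study.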
Description
Keywords
Radiomics; Feature selection methods; Machine learning models
