Exploiting the bin-class histograms for feature selection on discrete data

File: Exploiting the Bin-Class Histograms for Feature.pdf (436.82 KB, Adobe PDF)

Abstract

In machine learning and pattern recognition tasks, the use of feature discretization techniques may have several advantages. The discretized features may hold enough information for the learning task at hand, while ignoring minor fluctuations that are irrelevant or harmful for that task. The discretized features have more compact representations that may yield both better accuracy and lower training time, compared with using the original features. However, in many cases, mainly with medium- and high-dimensional data, the large number of features usually implies that there is some redundancy among them. Thus, we may further apply feature selection (FS) techniques on the discrete data, keeping the most relevant features while discarding the irrelevant and redundant ones. In this paper, we propose relevance and redundancy criteria for supervised feature selection techniques on discrete data. These criteria are applied to the bin-class histograms of the discrete features. The experimental results, on public benchmark data, show that the proposed criteria can achieve better accuracy than widely used relevance and redundancy criteria, such as mutual information and the Fisher ratio.
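To make the central object of the abstract concrete, the following is a minimal Python sketch of a bin-class histogram for a single discrete feature, read here as a table that counts how many training samples fall in each (bin, class) pair. The abstract does not spell out the proposed relevance and redundancy criteria, so the sketch does not attempt to reproduce them. Instead it scores the histogram with mutual information, one of the widely used baseline criteria the abstract names. The function names `bin_class_histogram` and `mutual_information` and the toy data are assumptions made for illustration only.

```python
import math


def bin_class_histogram(feature, labels, n_bins, n_classes):
    """Count the samples falling in each (bin, class) pair.

    `feature` holds integer bin indices in [0, n_bins) and `labels`
    holds integer class indices in [0, n_classes).
    """
    hist = [[0] * n_classes for _ in range(n_bins)]
    for b, c in zip(feature, labels):
        hist[b][c] += 1
    return hist


def mutual_information(hist):
    """Mutual information (in bits) between bin index and class label.

    This is a baseline relevance score, not the criterion proposed
    in the paper.
    """
    total = sum(sum(row) for row in hist)
    p_bin = [sum(row) / total for row in hist]
    p_cls = [sum(col) / total for col in zip(*hist)]
    mi = 0.0
    for b, row in enumerate(hist):
        for c, count in enumerate(row):
            if count > 0:  # empty cells contribute nothing
                p = count / total
                mi += p * math.log2(p / (p_bin[b] * p_cls[c]))
    return mi


# Toy data (hypothetical): four samples, two classes, two bins per feature.
labels = [0, 0, 1, 1]
relevant = [0, 0, 1, 1]      # bins line up with the classes
irrelevant = [0, 1, 0, 1]    # bins are independent of the classes

h_rel = bin_class_histogram(relevant, labels, n_bins=2, n_classes=2)
h_irr = bin_class_histogram(irrelevant, labels, n_bins=2, n_classes=2)
print(h_rel, mutual_information(h_rel))
print(h_irr, mutual_information(h_irr))
```

In this toy case the relevant feature concentrates its counts on the diagonal of the histogram, while the irrelevant feature spreads them evenly across classes. A relevance criterion computed from the histogram is meant to capture that kind of difference.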

Keywords

Feature selection; Feature discretization; Discrete features; Bin-class histogram; Matrix norm; Supervised learning; Classification

Citation

FERREIRA, Artur J.; FIGUEIREDO, Mário A. T. - Exploiting the Bin-Class Histograms for Feature Selection on Discrete Data. In 7th Iberian Conference on Pattern Recognition and Image Analysis (IbPRIA). Santiago de Compostela: Springer-Verlag Berlin, 2015. ISBN 978-3-319-19390-8. Vol. 9117, pp. 345-353.

Publisher

Springer-Verlag Berlin
