A fuzzy system for combining filter features selection methods

CATENI, Silvia; COLLA, Valentina; VANNUCCI, Marco
2017-01-01

Abstract

Feature selection is considered one of the most important data pre-processing steps in many modelling fields, especially for prediction and classification. It belongs to the wider class of data mining procedures: by analysing the available data, it identifies the variables that most affect a given phenomenon, thereby increasing knowledge of the process or phenomenon under study. There are three main categories of feature selection approaches, namely filter, wrapper and embedded methods. This work focuses on the first category and, in particular, on a fuzzy logic-based procedure that combines several traditional filter methods. Filter methods exploit intrinsic properties of the data to select features before the learning task; compared with the other kinds of approaches, they require less computation time and are suitable for datasets with a large number of instances and features. To demonstrate the effectiveness of the proposed approach, several tests have been performed. Different classifiers have been designed and applied to binary classification on different datasets: some widely used public datasets with many instances and features, and two datasets from the metal industry. The results are presented and discussed in the paper.
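
As a rough illustration of what combining filter methods can look like, the following Python sketch scores each feature with two common filter criteria and merges the results. The abstract does not say which filters the authors use or how their fuzzy system is built, so everything here is an assumption: the two criteria (absolute Pearson correlation and a Fisher-style score), the function names, the toy data, and above all the plain rank averaging, which is only a simple stand-in for the fuzzy inference step.

```python
from statistics import mean, pstdev


def abs_correlation(xs, ys):
    """Absolute Pearson correlation between one feature and a 0/1 label."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no information
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return abs(cov / (sx * sy))


def fisher_score(xs, ys):
    """Squared gap between class means over the sum of class variances."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    gap = (mean(pos) - mean(neg)) ** 2
    spread = pstdev(pos) ** 2 + pstdev(neg) ** 2
    if spread == 0:
        return 0.0 if gap == 0 else float("inf")
    return gap / spread


def to_unit_ranks(scores):
    """Map scores to [0, 1] by rank so filters on different scales are comparable.

    Ties are broken by feature order, which is good enough for a sketch.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [1.0] * len(scores)
    if len(scores) > 1:
        for position, i in enumerate(order):
            ranks[i] = position / (len(scores) - 1)
    return ranks


def combined_scores(columns, labels, filters=(abs_correlation, fisher_score)):
    """Average the rank-normalised scores of all filters (assumed combiner)."""
    per_filter = [to_unit_ranks([f(col, labels) for col in columns]) for f in filters]
    return [mean(feature_ranks) for feature_ranks in zip(*per_filter)]


def select_features(columns, labels, k):
    """Return the indices of the k best features, best first."""
    scores = combined_scores(columns, labels)
    return sorted(range(len(columns)), key=lambda i: scores[i], reverse=True)[:k]


# Hypothetical toy data: one column per feature, binary labels.
labels = [0, 0, 0, 1, 1, 1]
columns = [
    [1.0, 1.2, 0.9, 3.1, 2.9, 3.0],  # separates the two classes well
    [5.0, 5.0, 5.0, 5.0, 5.0, 5.0],  # constant, so uninformative
    [0.1, 0.9, 0.5, 0.4, 0.8, 0.2],  # noise
]
selected = select_features(columns, labels, k=2)
```

No classifier is trained at any point: each score depends only on the data and the labels, which is what keeps filter methods cheap on large datasets. A fuzzy combiner would replace `combined_scores` with membership functions and rules over the individual filter outputs.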

Use this identifier to cite or link to this document: https://hdl.handle.net/11382/509393