Huge amounts of data are collected every millisecond all around the world, ranging from images and videos to a growing volume of sensor data. As a result, it is becoming increasingly difficult for humans to decide which features are the most important. Yet reducing the feature vector is a necessary step toward higher precision in classification tasks, and detecting anomalies and classifying data points is crucial for a variety of objectives in many domains. This work therefore focuses on feature selection for binary decision problems (e.g. anomaly detection, binary classification). We propose a novel graph-based feature selection filter that takes both the importance of features and the correlation between them into account at the same time. The filter recommends a feature subset by applying a rating function to the maximal cliques of the graph. The evaluation compares the accuracy achieved by multiple machine learning algorithms on multiple datasets when using the proposed approach and when using different baseline feature selection approaches. Results show that the proposed approach delivers the highest accuracy in about 69% of the cases compared to existing approaches, while also reducing the number of features.
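To make the idea of rating maximal cliques concrete, the following is a minimal sketch of one possible graph-based filter, not the construction used in this work. Everything specific in it is an assumption made for illustration: features are nodes, two features are connected when their absolute Pearson correlation is below a hypothetical threshold `max_corr`, the importance of a feature is its absolute correlation with the binary label, and the rating function is the summed importance of a clique. The function names `pearson`, `maximal_cliques` and `select_features` are likewise hypothetical.

```python
from itertools import combinations
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def maximal_cliques(adj):
    """Bron-Kerbosch (without pivoting): yield every maximal clique as a frozenset."""
    def expand(r, p, x):
        if not p and not x:
            yield frozenset(r)
            return
        for v in list(p):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}
    yield from expand(set(), set(adj), set())


def select_features(columns, labels, max_corr=0.7):
    """Return the feature subset (sorted names) of the best-rated maximal clique."""
    # Assumption: importance = |correlation between feature and binary label|.
    importance = {name: abs(pearson(vals, labels)) for name, vals in columns.items()}
    # Assumption: an edge means "weakly correlated", so a clique is a set of
    # mutually non-redundant features.
    adj = {name: set() for name in columns}
    for a, b in combinations(columns, 2):
        if abs(pearson(columns[a], columns[b])) < max_corr:
            adj[a].add(b)
            adj[b].add(a)
    # Assumption: rating = summed importance; sorted names break ties deterministically.
    best = max(
        maximal_cliques(adj),
        key=lambda c: (sum(importance[f] for f in c), sorted(c)),
    )
    return sorted(best)


if __name__ == "__main__":
    labels = [0, 0, 0, 1, 1, 1]
    columns = {
        "f1": [1, 2, 1, 8, 9, 8],        # informative
        "f2": [2, 4, 2, 16, 18, 16],     # redundant copy of f1 (scaled)
        "f3": [5, 1, 5, 1, 5, 1],        # weakly informative, unrelated to f1/f2
    }
    print(select_features(columns, labels))
```

In this toy example `f1` and `f2` are perfectly correlated, so they never share a clique, and the selected subset contains only one of them together with `f3`. Other edge definitions and rating functions fit the same skeleton; only the graph construction and the `key` passed to `max` would change.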
2019