Hybridizing Filters and Wrapper Approaches for Improving the Classification Accuracy of Microarray Dataset
Ahmed Soufi Abou-Taleb1, Ahmed Ahmed Mohamed2, Osama Abdo Mohamed3, Amr Hassan Abedelhalim4
1Ahmed Soufi Abou-Taleb, Biomedical Engineering and Systems Department, Faculty of Engineering, Cairo University, Cairo, Egypt.
2Ahmed Ahmed Mohamed, Mathematics Department, Faculty of Science, Zagazig University, Zagazig, Egypt.
3Osama Abdo Mohamed, Mathematics & Computer Science Department, Faculty of Science, Zagazig University, Zagazig, Egypt.
4Amr Hassan Abedelhalim, Mathematics & Computer Science Department, Faculty of Science, Zagazig University, Zagazig, Egypt.
Manuscript received on June 03, 2013. | Revised Manuscript received on June 28, 2013. | Manuscript published on July 05, 2013. | PP: 155-159 | Volume-3 Issue-3, July 2013. | Retrieval Number: C1685073313/2013©BEIESP
© The Authors. Published By: Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Feature selection aims to find the most relevant features of a problem domain. However, identifying useful features among hundreds or even thousands of related features is a nontrivial task. This paper aims to identify a small set of genes in order to improve computational speed and prediction accuracy. To that end, we propose a three-stage gene selection algorithm for microarray data. The proposed approach combines Information Gain (IG), Significance Analysis of Microarrays (SAM), Minimum Redundancy Maximum Relevance (mRMR) and Support Vector Machine Recursive Feature Elimination (SVM-RFE). In the first stage, SAM and IG are each applied and the intersection of the resulting feature sets is retained. In the second stage, mRMR reduces redundancy by selecting an effective gene subset from the intersection produced by the first stage. In the third stage, SVM-RFE is applied to choose the most discriminating genes. We evaluated the technique on the AML and ALL (leukemia) dataset using a Support Vector Machine classifier with an RBF kernel (SVM-RBF), and the results indicate that the proposed method can improve classification performance.
Keywords: Feature selection, Filters, Wrappers, Support vector machine, Microarray.
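To make the first two stages of the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a median-based binarization of expression values, uses only the SAM-style relative-difference statistic (SAM's permutation step is not reproduced), and uses the difference form of mRMR. All function names, the fudge constant `s0`, the cutoff `k = 3` and the toy expression values are hypothetical. The third stage (SVM-RFE) and the final SVM classifier are omitted because they require an SVM library.

```python
import math
from collections import Counter
from statistics import mean, median, variance


def entropy(symbols):
    """Shannon entropy (bits) of a sequence of discrete symbols."""
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in Counter(symbols).values())


def mutual_info(a, b):
    """I(A;B) = H(A) + H(B) - H(A,B) for two discrete sequences."""
    return entropy(a) + entropy(b) - entropy(list(zip(a, b)))


def binarize(values):
    """Assumed discretization: 1 if above the gene's median, else 0."""
    m = median(values)
    return [int(v > m) for v in values]


def sam_like(values, y, s0=0.1):
    """SAM-style relative difference |mean1 - mean0| / (s + s0).
    Only the statistic; SAM's permutation analysis is not reproduced."""
    grp0 = [v for v, c in zip(values, y) if c == 0]
    grp1 = [v for v, c in zip(values, y) if c == 1]
    n0, n1 = len(grp0), len(grp1)
    ss = (n0 - 1) * variance(grp0) + (n1 - 1) * variance(grp1)
    s = math.sqrt((1 / n0 + 1 / n1) * ss / (n0 + n1 - 2))
    return abs(mean(grp1) - mean(grp0)) / (s + s0)


def top_k(scores, k):
    """Names of the k highest-scoring genes."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


def mrmr(candidates, disc, y, n_select):
    """Greedy mRMR (difference form): relevance minus mean redundancy."""
    selected, remaining = [], sorted(candidates)
    while remaining and len(selected) < n_select:
        def score(g):
            relevance = mutual_info(disc[g], y)
            if not selected:
                return relevance
            redundancy = mean([mutual_info(disc[g], disc[s]) for s in selected])
            return relevance - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: 8 samples (4 per class), 5 genes.
y = [0, 0, 0, 0, 1, 1, 1, 1]
expr = {
    "g1": [1.0, 1.2, 0.9, 3.1, 3.0, 3.2, 2.9, 1.1],
    "g2": [1.1, 1.3, 1.0, 3.2, 3.1, 3.3, 3.0, 1.2],  # near-copy of g1
    "g3": [1.0, 1.1, 3.0, 0.9, 3.1, 2.9, 1.2, 3.2],
    "g4": [2.0, 2.5, 1.9, 2.6, 2.1, 2.4, 1.8, 2.7],  # uninformative
    "g5": [2.6, 1.9, 2.0, 2.5, 1.8, 2.7, 2.4, 2.1],  # uninformative
}
disc = {g: binarize(v) for g, v in expr.items()}

# Stage 1: rank by IG and by the SAM-like score, keep the intersection.
ig = {g: mutual_info(disc[g], y) for g in expr}
sam = {g: sam_like(expr[g], y) for g in expr}
candidates = top_k(ig, 3) & top_k(sam, 3)

# Stage 2: mRMR on the intersection to drop redundant genes.
selected = mrmr(candidates, disc, y, n_select=2)
print(sorted(candidates), selected)
```

On this toy data, stage 1 retains g1, g2 and g3, and the mRMR step keeps g1 and g3 while discarding g2, which carries the same information as g1. In a fuller pipeline the surviving genes would then be passed to a recursive feature elimination step driven by a linear SVM.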