Research Article

Improving cancer diseases classification using a hybrid filter and wrapper feature subset selection

Noor Muhammed Noori* and Omar Saber Qasim

Published: 02/11/2020 | Volume 4 - Issue 1 | Pages: 006-012

Abstract

Cancer data sets often contain many extraneous features that affect classification accuracy. Many evolutionary algorithms are used to select features and reduce dimensionality, including the gray wolf optimization (GWO) algorithm once it has been converted from a continuous search space to a discrete one. This paper proposes a feature selection method with two consecutive stages. In the first stage, the fuzzy mutual information (FMI) technique identifies the most important features of the disease data sets through a fuzzy model built on the basis of the data size. In the second stage, the binary gray wolf optimization (BGWO) algorithm selects, from the features passed on by the first stage, a specific number that affect the classification process. The proposed algorithm, FMI_BGWO, proves efficient and effective: it obtains higher classification accuracy with fewer selected genes than competing algorithms.
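The two-stage idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: plain (crisp) mutual information over equal-width bins stands in for the fuzzy model of FMI, a leave-one-out 1-nearest-neighbour classifier stands in for the unspecified classifier, the fitness weight of 0.9 is arbitrary, and the data are synthetic. The names `filter_stage`, `loo_accuracy`, `bgwo`, and `fitness` are hypothetical. The binary update uses a sigmoid transfer function, one common way to move GWO from a continuous to a discrete space.

```python
import math
import random
from collections import Counter


def mutual_information(values, labels, bins=3):
    """Crisp mutual information (in nats) between one feature and the labels.

    Assumption: equal-width binning stands in for the paper's fuzzy model.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # guard against constant features
    binned = [min(int((v - lo) / width), bins - 1) for v in values]
    n = len(values)
    joint, p_bin, p_lab = Counter(zip(binned, labels)), Counter(binned), Counter(labels)
    return sum(c / n * math.log(c * n / (p_bin[b] * p_lab[l]))
               for (b, l), c in joint.items())


def filter_stage(X, y, k):
    """Stage 1 (filter): indices of the k features with the highest MI score."""
    scores = [mutual_information([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


def loo_accuracy(X, y, feats):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on `feats`."""
    if not feats:
        return 0.0
    hits = 0
    for i, xi in enumerate(X):
        nearest = min((j for j in range(len(X)) if j != i),
                      key=lambda j: sum((xi[f] - X[j][f]) ** 2 for f in feats))
        hits += y[nearest] == y[i]
    return hits / len(X)


def bgwo(fitness, dim, wolves=8, iters=15, seed=0):
    """Stage 2 (wrapper): binary grey wolf search over 0/1 feature masks."""
    rng = random.Random(seed)
    pack = [[rng.randint(0, 1) for _ in range(dim)] for _ in range(wolves)]
    for t in range(iters):
        pack.sort(key=fitness, reverse=True)
        leaders = pack[:3]                 # alpha, beta, delta wolves
        a = 2 - 2 * t / iters              # decays linearly from 2 towards 0
        updated = [leaders[0][:]]          # elitism: keep the alpha unchanged
        for wolf in pack[1:]:
            new_wolf = []
            for d in range(dim):
                # Average of the three leader-guided continuous GWO moves.
                x = sum(lead[d] - (2 * a * rng.random() - a)
                        * abs(2 * rng.random() * lead[d] - wolf[d])
                        for lead in leaders) / 3
                # Sigmoid transfer maps the continuous position to a bit.
                prob = 1 / (1 + math.exp(-10 * (x - 0.5)))
                new_wolf.append(1 if prob >= rng.random() else 0)
            updated.append(new_wolf)
        pack = updated
    return max(pack, key=fitness)


# Synthetic demo data (not from the paper): 40 samples, 30 features;
# only features 0 and 1 depend on the class label.
rng = random.Random(42)
y = [i % 2 for i in range(40)]
X = [[6 * label + rng.uniform(-1, 1) if j < 2 else rng.uniform(-1, 1)
      for j in range(30)] for label in y]

kept = filter_stage(X, y, k=8)             # stage 1: 30 -> 8 candidate features


def fitness(mask, weight=0.9):
    """Assumed fitness: weighted accuracy plus a reward for smaller subsets."""
    feats = [f for f, bit in zip(kept, mask) if bit]
    if not feats:
        return 0.0
    return weight * loo_accuracy(X, y, feats) + (1 - weight) * (1 - len(feats) / len(kept))


best_mask = bgwo(fitness, dim=len(kept))   # stage 2: search over 8-bit masks
selected = [f for f, bit in zip(kept, best_mask) if bit]
print("kept by filter:", kept)
print("selected by wrapper:", selected, "accuracy:", loo_accuracy(X, y, selected))
```

Running the filter first keeps the wrapper's search space small: the binary search explores masks over 8 candidate features rather than all 30, which is the motivation for combining a filter with a wrapper.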

DOI: 10.29328/journal.apb.1001010
