A Verified Technique for Colon Cancer Analysis with Minimum Number of Features
Journal: International Journal of Scientific Engineering and Research (IJSER) (Vol. 3, No. 8)
Publication Date: 2015-08-05
Authors : Mohammed A. El-Shrkawey; Ben Bella S. Tawfik;
Page : 77-79
Keywords : Feature selection; QDA classifier; gene expression; t-test; p-value
Abstract
Gene expression data is characterized by high dimensionality and a small number of samples. Much research addresses data reduction, in other words selecting the most influential features (feature selection). This work differs in that it verifies each step of the selection, and it reaches a smaller number of features with high discrimination. Reducing data dimensionality leads to more effective analysis of gene features. There is, however, a tradeoff between feature selection and acceptable accuracy. The target is to find a compact set of features that supports knowledge discovery while keeping accuracy acceptable. We therefore present a novel framework that integrates dimensionality reduction with classification for gene expression data analysis. To achieve this objective, we use oligonucleotide arrays, which provide a broad picture of the cell state by monitoring the expression levels of thousands of genes at the same time. The developed techniques make it possible to extract useful information from the resulting data sets. Gene expression is analyzed using 40 tumor and 22 normal colon tissue samples with 2000 human genes. In the first phase, preprocessing, the input data is arranged and normalized. The second phase performs feature reduction in two steps. The first step reduces the features from 2000 to 602 using a t-test (lowest p-value). The second step applies sequential forward correlation, which leaves only three gene features. With only these three genes, a quadratic classification is performed to test the significance of the features. The classification achieves a success rate of more than 96%.
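The two-step reduction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions. The data is synthetic (62 samples, 20 features) rather than the colon data set. Features are ranked by the absolute Welch t statistic as a stand-in for the lowest p-value, which can order features slightly differently when the degrees of freedom vary. "Sequential forward correlation" is read here as a greedy pass that skips any feature highly correlated with one already chosen. The function names and the `max_abs_corr` threshold are hypothetical, and the quadratic classification step is left out.

```python
import math
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch's two-sample t statistic for two lists of values."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = math.sqrt(sum((xi - mx) ** 2 for xi in x))
    sy = math.sqrt(sum((yi - my) ** 2 for yi in y))
    return cov / (sx * sy)


def rank_by_t(columns, labels):
    """Feature indices sorted by decreasing |t| between class 1 and class 0."""
    scores = []
    for j, col in enumerate(columns):
        pos = [v for v, c in zip(col, labels) if c == 1]
        neg = [v for v, c in zip(col, labels) if c == 0]
        scores.append((abs(welch_t(pos, neg)), j))
    return [j for _, j in sorted(scores, reverse=True)]


def forward_select(columns, candidates, n_keep, max_abs_corr=0.9):
    """Greedy pass (assumed reading of 'sequential forward correlation'):
    walk candidates in rank order and keep a feature only if it is not
    highly correlated with any feature already kept."""
    selected = []
    for j in candidates:
        if all(abs(pearson(columns[j], columns[s])) < max_abs_corr
               for s in selected):
            selected.append(j)
        if len(selected) == n_keep:
            break
    return selected


# Synthetic stand-in data: 40 "tumor" (1) and 22 "normal" (0) samples.
rng = random.Random(0)
labels = [1] * 40 + [0] * 22
columns = [[rng.gauss(0, 1) for _ in labels] for _ in range(20)]
columns[0] = [rng.gauss(2.0 * c, 1) for c in labels]        # informative
columns[1] = [rng.gauss(2.0 * c, 1) for c in labels]        # informative
columns[2] = [v + rng.gauss(0, 0.05) for v in columns[0]]   # near copy of 0

ranked = rank_by_t(columns, labels)            # step 1: t-test filter
selected = forward_select(columns, ranked[:10], n_keep=2)   # step 2
print("top of ranking:", ranked[:3])
print("selected features:", selected)
```

In this toy setting the filter places the two informative features and the near copy at the top of the ranking, and the greedy pass then drops whichever of feature 0 and its copy ranks lower. The same structure would apply to a 2000-gene matrix, with the first step keeping 602 candidates and the second step stopping at three.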