PERCENTAGE CONCENTRATION OF NUCLEOTIDES IN GENOME DATA OF SARS - CORONA VIRUSES

Journal: International Journal of Advanced Research in Engineering and Technology (IJARET) (Vol.12, No. 02)

Publication Date:

Authors : ;

Page : 411-421

Keywords : SARS-Corona Viruses; Genome; Machine Learning Algorithms

Abstract

With the arrival of the big data era, many genomic consortia are generating enormous amounts of data to characterize the functional roles of genetic variants, and these data are widely available to the public. Traditional data analysis methods may not be sufficient or scalable enough to reveal novel genomic insights from these data within a reasonable timeframe, which creates the need for big data analytics developed specifically for genomics. Genome data of fifty SARS-Corona Viruses are analyzed to find the features common to them. A novel feature called “Percentage Concentration of Nucleotides”, denoted pA, pT, pG and pC, is evaluated for each genome and cross-verified against the other genomes to check whether all of them possess the same genetic features. The adjoints of a genome are four independent binary sequences corresponding to the nucleotides adenine, thymine, guanine and cytosine. For example, the adjoint of adenine of a genome sequence is a binary sequence with 1's at the positions of adenine in the genome sequence and 0's at all other positions. The adjoints of thymine, guanine and cytosine are defined in the same way, with 1's at the positions of the respective nucleotide and 0's elsewhere. The adjoint arrays of all nine genome data are correlated pairwise, similar genomes are segregated into different classes, and the results are reported in this paper.
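
The quantities defined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`percentage_concentration`, `adjoint`, `pearson`) and the toy sequence are hypothetical. The abstract does not say which correlation measure is used or how genomes of different lengths are aligned, so the sketch assumes a Pearson correlation and truncation to the shorter sequence.

```python
from math import sqrt

NUCLEOTIDES = "ATGC"


def percentage_concentration(genome):
    """Return pA, pT, pG and pC: the percentage of each nucleotide in the sequence."""
    genome = genome.upper()
    n = len(genome)
    if n == 0:
        raise ValueError("genome sequence is empty")
    return {"p" + base: 100.0 * genome.count(base) / n for base in NUCLEOTIDES}


def adjoint(genome, base):
    """Binary sequence with 1 where `base` occurs in the genome and 0 elsewhere."""
    return [1 if ch == base else 0 for ch in genome.upper()]


def pearson(x, y):
    """Pearson correlation of two binary sequences.

    Assumption (not from the abstract): sequences of unequal length are
    truncated to the shorter one before correlating.
    """
    n = min(len(x), len(y))
    x, y = x[:n], y[:n]
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant sequence; report 0
    return cov / sqrt(var_x * var_y)


# Toy sequence for illustration only, not real viral genome data.
toy = "ATGCATGA"
concentrations = percentage_concentration(toy)
adenine_adjoint = adjoint(toy, "A")
```

For the toy sequence, adenine occupies 3 of 8 positions, so `concentrations["pA"]` is 37.5 and `adenine_adjoint` is `[1, 0, 0, 0, 1, 0, 0, 1]`. Correlating each of the four adjoints of one genome against the matching adjoint of another gives one plausible reading of the pairwise comparison the abstract describes.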

Last modified: 2021-03-27 14:28:29