
INDEPENDENT COMPONENT ANALYSIS FOR INITIAL APPROXIMATION DETERMINATION IN IDENTIFICATION OF ACTIVE MODULES IN BIOLOGICAL GRAPHS

Journal: Scientific and Technical Journal of Information Technologies, Mechanics and Optics (Vol.20, No. 6)

Publication Date:

Authors : ;

Page : 888-892

Keywords : clustering; correlation; independent component analysis; graphs; gene expression


Abstract

Subject of Research. Identifying active modules in biological graphs, such as gene graphs, is an important approach to interpreting experimental biological data. One way to solve this problem is joint clustering in network and correlation spaces. The algorithm finds groups of genes that are both close to one another in the gene graph and highly pairwise correlated according to the matrix of gene expression values. The algorithm is iterative, and one of its key parameters is the initial approximation, which affects both the run time and the quality of the results. We consider the problem of choosing an initial approximation for this algorithm and propose a procedure based on independent component analysis.

Method. In the first step of the proposed procedure, independent component analysis is applied to the centered matrix of expression values. Then, for each component, the genes specific to that component at a given level of statistical significance are identified. The gene groups obtained for all independent components are taken as the initial approximation.

Main Results. The procedure based on independent component analysis reduces the number of gene groups in the initial approximation without loss of accuracy. This, in turn, reduces the running time of the clustering algorithm by an order of magnitude while maintaining the quality of the results.

Practical Relevance. Accelerating the joint clustering algorithm in network and correlation spaces without loss of result quality makes it considerably more convenient and simplifies its application to the interpretation of transcriptome data in bioinformatics and computational biology.
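The gene-selection step of the procedure can be illustrated with a minimal Python sketch. It assumes the independent components have already been computed from the centered expression matrix by some ICA implementation and are given as one list of per-gene weights per component. The abstract does not name the significance test, so the two-sided normal test on z-scored weights used here is an assumption, as are the function names `significant_genes` and `initial_approximation`, the threshold `alpha`, and the toy data.

```python
from statistics import NormalDist, mean, stdev


def significant_genes(weights, gene_ids, alpha=0.01):
    """Return the genes whose weight in one component is an outlier.

    Assumption: significance is judged by a two-sided normal test on the
    z-scored weights. The source does not specify which test it uses.
    """
    mu = mean(weights)
    sigma = stdev(weights)
    if sigma == 0:
        # All weights are identical, so no gene stands out.
        return []
    standard_normal = NormalDist()
    selected = []
    for gene, weight in zip(gene_ids, weights):
        z = (weight - mu) / sigma
        p_value = 2.0 * (1.0 - standard_normal.cdf(abs(z)))
        if p_value < alpha:
            selected.append(gene)
    return selected


def initial_approximation(components, gene_ids, alpha=0.01):
    """Collect one gene group per component, skipping empty groups."""
    groups = [significant_genes(c, gene_ids, alpha) for c in components]
    return [group for group in groups if group]


# Toy data (hypothetical): 20 genes and 3 precomputed components.
gene_ids = [f"g{i}" for i in range(20)]

component_a = [0.0] * 20
component_a[3] = 5.0            # one strongly weighted gene

component_b = [0.0] * 20
component_b[7] = 5.0            # two strongly weighted genes
component_b[12] = -5.0          # with opposite signs

component_c = [1.0 if i % 2 == 0 else -1.0 for i in range(20)]  # no outliers

groups = initial_approximation(
    [component_a, component_b, component_c], gene_ids, alpha=0.01
)
print(groups)
```

In this toy run the third component has no outlying weights and therefore contributes no group, which shows how the selection step can keep the initial approximation small.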

Last modified: 2020-12-05 02:31:07