PROTEIN SEQUENCES CLASSIFICATION BY MACHINE LEARNING METHODS

Journal: International Scientific Journal "Internauka" (Vol.1, No. 89)

Publication Date:

Authors : ; ;

Page : 63-71

Keywords : k-nearest-neighbor; logistic regression; decision tree; gradient boosting; random forest

Abstract

A defining feature of modern computational molecular biology is the exponential accumulation of biological data, which requires detailed study and analysis. A variety of data mining techniques can be used to classify biological data, but not all of them give accurate predictions, and many require special preprocessing of the biological sequences. The quality and speed of protein classification depend on the number of sequences in each class, on how those sequences are processed and transformed, and on the specifics of the machine learning algorithm. New sequencing methods keep emerging and increase the number of known proteins, which creates an annotation problem. Processing data at this scale demands considerable computing power, which motivates new approaches to protein sequence classification. A classified protein is a step towards a narrower comparison of sequences and towards solving one of the most difficult tasks in bioinformatics.
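
The abstract names sequence transformation and the choice of algorithm as the two factors that drive classification quality, but it does not say how the sequences were encoded. As a rough illustration only, the sketch below assumes overlapping k-mer counts as the transformation and pairs them with a plain k-nearest-neighbor vote, the first method in the keyword list. The function names, the cosine similarity measure, and the toy sequences and labels are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import sqrt


def kmer_counts(sequence, k=2):
    """Count overlapping k-mers (substrings of length k) in a sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse k-mer count vectors."""
    dot = sum(count * b[kmer] for kmer, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_predict(train, query, k_neighbors=3, k=2):
    """Majority vote among the k_neighbors most similar training sequences."""
    query_vec = kmer_counts(query, k)
    scored = sorted(
        ((cosine_similarity(kmer_counts(seq, k), query_vec), label)
         for seq, label in train),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k_neighbors])
    return votes.most_common(1)[0][0]


# Toy, invented sequences and class labels, for illustration only.
TRAIN = [
    ("AAAAGAAAAGAAAA", "class_a"),
    ("AAAGAAAAAGAAAG", "class_a"),
    ("GAAAAAGAAAAAAG", "class_a"),
    ("LLVLLLVLLLVLLL", "class_b"),
    ("VLLLVLLLLVLLVL", "class_b"),
    ("LLLLVLLVLLLLVL", "class_b"),
]

prediction = knn_predict(TRAIN, "AAGAAAAGAAAA")
```

The number of sequences per class matters here in the way the abstract suggests: with `k_neighbors=3`, a class represented by fewer than two training sequences could never win the vote, whatever the encoding.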

Last modified: 2021-04-15 18:33:05