PROTEIN SEQUENCES CLASSIFICATION BY MACHINE LEARNING METHODS
Journal: International Scientific Journal "Internauka" (Vol. 1, No. 89)
Publication Date: 2020-06-15
Authors: Potip Yuliia; Kysliak Serhii
Pages: 63-71
Keywords: k-nearest-neighbor; logistic regression; decision tree; gradient boosting; random forest
Abstract
A defining feature of modern computational molecular biology is the exponential accumulation of biological data, which requires detailed study and analysis. A variety of data mining techniques can be used to classify biological data, but not all of them give accurate predictions, and they require special processing of the biological sequences. The quality and speed of protein classification depend on the number of sequences in each class, on how these sequences are processed and transformed, and on the specifics of the machine learning algorithm. New sequencing methods keep emerging and increase the number of known proteins, which creates an annotation problem. Processing data at this scale demands substantial computing power, which motivates new approaches to classifying protein sequences. Classifying a protein is, after all, a step towards a narrower comparison of sequences and towards solving one of the most difficult tasks in bioinformatics.
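To make the role of sequence transformation concrete, here is a minimal sketch of one possible pipeline: each sequence is turned into overlapping k-mer counts and then classified by k-nearest-neighbor voting, one of the methods named in the keywords. The abstract does not say which representation the authors used, so the k-mer encoding, the cosine distance, the function names, and the toy sequences and class labels below are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k=2):
    """Count overlapping k-mers in a sequence (assumed feature encoding)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_distance(a, b):
    """Cosine distance between two sparse k-mer count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # an empty vector is treated as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def knn_predict(train, query, k_neighbors=3, k=2):
    """Majority vote among the k_neighbors closest training sequences."""
    q = kmer_counts(query, k)
    ranked = sorted(
        train,
        key=lambda item: cosine_distance(kmer_counts(item[0], k), q),
    )
    votes = Counter(label for _, label in ranked[:k_neighbors])
    return votes.most_common(1)[0][0]


# Toy, made-up sequences and labels; not real proteins or real classes.
TRAIN = [
    ("AAAAGAAAAGAA", "alanine_rich"),
    ("AAGAAAAAGAAA", "alanine_rich"),
    ("GAAAAAAGAAAA", "alanine_rich"),
    ("KKKEKKKEKKKK", "lysine_rich"),
    ("KEKKKKEKKKKK", "lysine_rich"),
    ("KKKKEKKKEKKK", "lysine_rich"),
]

if __name__ == "__main__":
    print(knn_predict(TRAIN, "AAAGAAAA"))
    print(knn_predict(TRAIN, "KKEKKK"))
```

The toy classes have equal sizes. With unbalanced classes the majority vote would tend to favor the larger class, which is one way the number of sequences per class can affect classification quality.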