

Journal: International Journal of Advanced Research (Vol.10, No. 12)

Publication Date:

Authors : ; ;

Page : 114-125

Keywords : Information Extraction; Relation Extraction; Supervised Machine Learning; SVM; CRF



This research focuses on the automatic extraction of relations between entities in Amharic text using a supervised machine learning approach. The study built its own corpus from the Walta Information Centre online archive, consisting of 2,000 sentences and 30,466 words (tokens). The proposed solution has four stages: preprocessing, text labeling, feature extraction and selection, and recognition. Preprocessing consists of tokenization and part-of-speech (POS) tagging. Once the text is tokenized and each token has a POS tag, the text is labeled using the BIO scheme. Tag features are selected for building the model; they comprise the named entity type and the relation type. The named entity types are Location (LOC), Organization (ORG) and Person (PER). The relation type features mark every word that occurs between two entities, for instance in a location-location or location-organization relation, together with the corresponding entities. Vectorization is done using DictVectorizer and word2features. Support vector machine (SVM) and conditional random field (CRF) models are used to recognize entity relations in Amharic text. SVM with SGD achieved a weighted precision of 49%, recall of 10% and F1-score of 13%. SVM with the Multinomial Naive Bayes classifier achieved a precision of 61%, recall of 41% and F1-score of 48%. SVM with the Passive Aggressive classifier achieved a weighted average precision of 55%, recall of 19% and F1-score of 27%. The CRF algorithm achieved a precision of 87%, recall of 87% and F1-score of 86%. The CRF model outperformed the SVM-based models.
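
The labeling and feature steps described in the abstract can be illustrated with a minimal sketch. The abstract names the BIO scheme, the LOC/ORG/PER entity types and a `word2features` function, but it does not give the actual feature set or tag names, so everything below is an assumption: the example sentence uses English stand-in tokens rather than Amharic, the relation tags `PER-ORG` and `ORG-LOC` are hypothetical names, and the features chosen (word, POS, suffix, neighbours) are only a common pattern for sequence labeling, not the authors' design.

```python
from typing import Dict, List, Tuple, Union

Token = Tuple[str, str]        # (word, POS tag)
Span = Tuple[int, int, str]    # (start index, end index exclusive, tag)
Features = Dict[str, Union[str, float, bool]]


def bio_labels(n_tokens: int, spans: List[Span]) -> List[str]:
    """Turn tagged spans into one BIO label per token ("O" outside any span)."""
    labels = ["O"] * n_tokens
    for start, end, tag in spans:
        labels[start] = f"B-{tag}"          # first token of the span
        for i in range(start + 1, end):
            labels[i] = f"I-{tag}"          # remaining tokens of the span
    return labels


def word2features(sent: List[Token], i: int) -> Features:
    """Build a feature dict for token i (hypothetical feature set)."""
    word, pos = sent[i]
    feats: Features = {
        "bias": 1.0,
        "word": word,
        "pos": pos,
        "suffix2": word[-2:],
    }
    if i > 0:
        prev_word, prev_pos = sent[i - 1]
        feats["-1:word"] = prev_word
        feats["-1:pos"] = prev_pos
    else:
        feats["BOS"] = True                 # beginning of sentence
    if i < len(sent) - 1:
        next_word, next_pos = sent[i + 1]
        feats["+1:word"] = next_word
        feats["+1:pos"] = next_pos
    else:
        feats["EOS"] = True                 # end of sentence
    return feats


# English stand-in for a POS-tagged Amharic sentence (assumed example).
sentence: List[Token] = [
    ("Abebe", "NNP"), ("works", "VBZ"), ("at", "IN"), ("Walta", "NNP"),
    ("in", "IN"), ("Addis", "NNP"), ("Ababa", "NNP"),
]

# Entities plus the words between them, tagged with a relation type.
spans: List[Span] = [
    (0, 1, "PER"), (1, 3, "PER-ORG"), (3, 4, "ORG"),
    (4, 5, "ORG-LOC"), (5, 7, "LOC"),
]

labels = bio_labels(len(sentence), spans)
features = [word2features(sentence, i) for i in range(len(sentence))]

for (word, _), label in zip(sentence, labels):
    print(f"{word}\t{label}")
```

Each entry in `features` is a plain dict, which is the input shape that scikit-learn's `DictVectorizer` accepts (string values are one-hot encoded), and a list of such dicts per sentence is also the format commonly passed to CRF toolkits. Whether the study fed its features to the SVM and CRF models in exactly this way is not stated in the abstract.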

Last modified: 2022-12-28 19:18:36