DOCUMENT SUMMARIZATION USING SENTENCE BASED TOPIC MODELING AND CLUSTERING
Journal: International Journal of Advanced Research (Vol. 6, No. 5)
Publication Date: 2018-05-01
Authors : Augustine George; Hanumanthappa.;
Page : 285-291
Keywords : Term Frequency; Natural Language Processing; Text Summarization; Structural Topic Modeling (STM); Pre-processing; Tokenization; Topic Modeling
Abstract
In recent years, automatic document summarization has found wide practical application, and numerous papers have been published on the topic. There are many approaches to identifying the significant portions of a document. Topic representation and modeling provide an intermediate representation of the text that captures the topics discussed in the input and aids automatic summarization; the significance of each sentence is decided based on how the topics are represented in the input document. This article attempts to provide a comprehensive summarization approach that includes sentence extraction and tokenization of the extracted sentences. Sentence-based Structural Topic Modeling (STM) is used to determine the important content for each domain in the integrated document, and the sentences are grouped under each topic using k-means clustering. The sentences under each topic are then summarized using the term frequency of each sentence. Finally, the sentences are arranged in the summarized text according to their lexical ranking scores.
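The sentence extraction, tokenization, and term-frequency scoring steps mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the stop-word list, the function names, and the scoring rule (mean document-level frequency of a sentence's tokens) are assumptions made for illustration, and the STM and k-means stages are omitted. The sketch also returns the selected sentences in their original order rather than by a lexical ranking score.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration; the paper's list is not given.
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "in", "to", "on", "are"}


def split_sentences(text):
    """Split text into sentences at '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Lowercase the sentence and keep alphabetic tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z]+", sentence.lower()) if t not in STOP_WORDS]


def score_sentences(sentences):
    """Score each sentence by the mean document-level frequency of its tokens."""
    token_lists = [tokenize(s) for s in sentences]
    # Term frequency counted over all sentences (assumed scoring basis).
    tf = Counter(t for tokens in token_lists for t in tokens)
    scores = []
    for tokens in token_lists:
        scores.append(sum(tf[t] for t in tokens) / len(tokens) if tokens else 0.0)
    return scores


def summarize(text, n=2):
    """Return the n highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    scores = score_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:n]
    return [sentences[i] for i in sorted(top)]
```

For example, `summarize(document_text, n=3)` would return the three sentences whose words occur most frequently across the document. In a pipeline closer to the one described, the same scoring would be applied separately to the sentences grouped under each topic.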