ResearchBib Share Your Research, Maximize Your Social Impacts
Sign for Notice Everyday Sign up >> Login

Image Caption Generator Using Convolutional Neural Network Algorithm

Journal: International Journal of Science and Research (IJSR) (Vol.11, No. 8)

Publication Date:

Authors : ;

Page : 702-707

Keywords : Convolutional Neural Network; Long Short-Term Memory (LSTM); Recurrent Neural Network; TensorFlow; Keras; NumPy;

Source : Downloadexternal Find it from : Google Scholarexternal

Abstract

It is a very difficult challenge to automatically describe an image using a sentence from any natural language, such as English. It necessitates knowledge of both natural language processing and picture processing. The fusion of computer vision and natural language processing has received a lot of interest recently thanks to the advent of deep learning. This field is exemplified by image captioning, which teaches a computer to understand an image's visual information using one or more phrases. In addition to the ability to recognize the item and the scene, high-level image semantics also needs the ability to analyze the state, the properties, and the relationship between these things. Despite the fact that image captioning is a challenging and intricate endeavor, numerous academics have made substantial advancements. In artificial intelligence (AI), computer vision and natural language processing are used to automatically create an image's contents (Natural Language Processing). The regenerative neuronal model is developed. It is dependent on machine translation and computer vision. Using this technique, natural phrases are produced that finally explain the image. Convolutional neural networks (CNN)and recurrent neural networks (RNN) are also components of this architecture. RNN is utilized for phrase creation, while CNN is used to extract features from images. The model has been taught to produce captions that, when given an input image, almost exactly describe the image. On various datasets, the model's precision and the fluency or command of the language it learns from visual descriptions are examined. These tests demonstrate that the model frequently provides precise descriptions for an input image.

Last modified: 2022-09-07 15:21:04