Name: Neural Network-Powered Image Captioning: Generating Descriptive Text for Visual Content
Start: 2024-08-08T12:15:00+0530
End: 2024-08-08T14:15:00+0530

Thursday August 8, 2024 12:15pm - 2:15pm IST

Virtual Room D

Authors - Sneha Jakati, Sukhadevi Kannal, Charvi Kolloori, Satish Chikkamath, Nirmala S R, Suneeta V Budihal
Abstract - Image captioning is the process of identifying an image’s content and generating a pertinent caption for it. Each image holds a wealth of information that humans can quickly pick up, but it is challenging for a machine to mimic the human ability to comprehend visual information and generate descriptive language. This work aims to build a bridge between computer vision and natural language processing that can provide an accurate and relevant caption for a given image. The present work uses the ViT (Vision Transformer)-GPT-2 (Generative Pre-trained Transformer 2) model and an encoder-decoder design employing a CNN (Convolutional Neural Network) and an LSTM (Long Short-Term Memory) network. The dataset used is Flickr8k, consisting of 8000 images. Each image has five distinct captions, offering a variety of descriptions for a single picture, and the system selects the single most relevant caption for a given image.

Paper Presenter

Sneha Jakati

India

Virtual Room D Goa, India

Virtual Room 5D, Virtual Room D

Host Organization Global Knowledge Research Foundation

9th International Conference on ICT for Sustainable Development

