Thursday August 8, 2024 12:15pm - 2:15pm IST
Authors - Sneha Jakati, Sukhadevi Kannal, Charvi Kolloori, Satish Chikkamath, Nirmala S R, Suneeta V Budihal
Abstract - Image captioning is the process of identifying an image’s content and generating a pertinent caption for it. Each image holds a wealth of information that humans pick up quickly, but it is challenging for a machine to mimic the human ability to comprehend visual information and generate descriptive language. This work aims to build a bridge between computer vision and natural language processing that provides an accurate and relevant caption for a given image. It uses two approaches: a ViT (Vision Transformer)-GPT-2 (Generative Pre-trained Transformer 2) model and an encoder-decoder design employing a CNN (Convolutional Neural Network) and an LSTM (Long Short-Term Memory) network. The dataset used is Flickr8k, consisting of 8000 images. Each image has five distinct captions, offering a variety of descriptions for a single picture, from which the single most relevant caption is selected for a given image.
Virtual Room D Goa, India
