Thursday August 8, 2024 4:03pm - 4:15pm IST
Authors - Harsh Chinchakar, Gaurav Ratnaparkhi, Atharva Tanawade, Saloni Raj Singh, Divyansh Modi, Amaan Shaikh, Vidya Patil
Abstract - Automatically generating informative descriptions for photographs is an interesting but challenging task. This paper introduces AI Image Captioning using ResNet-50 and LSTM (AICRL), a unified model that combines ResNet-50 with an LSTM for automatic image captioning. AICRL pairs an encoder that uses ResNet-50 to build a full image representation with a decoder that uses an LSTM and a soft attention mechanism to predict the next phrase while focusing on particular features of the image. The model was trained on the Flickr8k and COCO 2017 datasets separately, optimizing the likelihood of the target description sentence given the training photos, and is evaluated using metrics such as BLEU. The results highlight the effectiveness of AICRL in generating image descriptions. Additionally, the PERSONALITY-CAPTIONS challenge is presented, which aims to generate engaging captions by incorporating configurable personality and style traits. A large dataset is used with models built by combining state-of-the-art methods for sentence and image representation. The proposed models demonstrate both robust performance on the PERSONALITY-CAPTIONS task and state-of-the-art performance on well-known datasets such as Flickr8k and COCO 2017. Online evaluations confirm that the best proposed model performs close to existing models.
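The soft attention step mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the additive scoring form, the function name `soft_attention`, the weight matrices `W_f`, `W_h`, `v`, and all dimensions (49 image regions of 2048 features, a 512-unit decoder state, a 256-unit attention layer) are assumptions chosen for illustration, with random values standing in for learned parameters and real ResNet-50 features.

```python
import numpy as np

def soft_attention(features, hidden, W_f, W_h, v):
    """One soft attention step (additive scoring, assumed here for illustration).

    features: (L, D) array, one D-dimensional vector per image region
    hidden:   (H,) array, current decoder (LSTM) hidden state
    Returns the context vector (D,) and the attention weights (L,).
    """
    # Score each region against the decoder state.
    scores = np.tanh(features @ W_f + hidden @ W_h) @ v   # shape (L,)
    # Numerically stable softmax over regions.
    scores = scores - scores.max()
    weights = np.exp(scores) / np.exp(scores).sum()
    # Context is the attention-weighted average of region features.
    context = weights @ features                          # shape (D,)
    return context, weights

# Hypothetical sizes and random stand-in values, not taken from the paper.
rng = np.random.default_rng(0)
L, D, H, A = 49, 2048, 512, 256
features = rng.standard_normal((L, D))
hidden = rng.standard_normal(H)
W_f = rng.standard_normal((D, A)) * 0.01
W_h = rng.standard_normal((H, A)) * 0.01
v = rng.standard_normal(A)

context, weights = soft_attention(features, hidden, W_f, W_h, v)
print(context.shape, weights.shape)
```

In a full captioning decoder, the context vector would typically be fed to the LSTM together with the embedding of the previous word, so that each predicted word can draw on different image regions.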
Debate Hotel Vivanta by Taj, Goa, India
