Multimodal Attention in Recurrent Neural Networks for Visual Question Answering

Authors

  • Lorena Kodra

  • Elinda Kajo Mece

Keywords

visual question answering (VQA), multimodal attention mechanism, convolutional neural networks (CNN), recurrent neural networks (RNN), long short-term

Abstract

Visual Question Answering (VQA) is a task for evaluating the abilities and shortcomings of image scene understanding, and for measuring machine intelligence in the visual domain. Given an image and a natural-language question about the image, the system must ground the question into t

How to Cite

Lorena Kodra, & Elinda Kajo Mece. (2017). Multimodal Attention in Recurrent Neural Networks for Visual Question Answering. Global Journal of Computer Science and Technology, 17(D1), 1–8. Retrieved from https://computerresearch.org/index.php/computer/article/view/1653

Published

2017-01-15