Multimodal Attention in Recurrent Neural Networks for Visual Question Answering
Keywords:
visual question answering (VQA), multimodal attention mechanism, convolutional neural networks (CNN), recurrent neural networks (RNN), long short-term
Abstract
Visual Question Answering (VQA) is a task for evaluating the abilities and shortcomings of image scene understanding and for measuring machine intelligence in the visual domain. Given an image and a natural-language question about the image, the system must ground the question into t
Published
2017-01-15
License
Copyright (c) 2017 Authors and Global Journals Private Limited
This work is licensed under a Creative Commons Attribution 4.0 International License.