Revision as of 00:04, 21 November 2017

Introduction

Visual Question Answering (VQA) is a recent problem at the intersection of computer vision and natural language processing that has attracted considerable interest from the deep learning, computer vision, and natural language processing communities. In VQA, an algorithm must answer a text-based question about an image in natural language, as illustrated in Figure 1.

Figure 1: A VQA system. The AI system takes an image and a text-based question about that image as input, and outputs the answer to the question in natural language.

@@ Line 1: / Line 1: @@
+__TOC__
 = Introduction =
 Visual Question Answering (VQA) is a recent problem in computer vision and
@@ Line 4: / Line 6: @@
 the deep learning, computer vision, and natural language processing communities.
 In VQA, an algorithm needs to answer text-based questions about images in
-natural language as illustrated in Figure <xr="fig:vqa-overview"/>.
+natural language as illustrated in Figure 1.
-<figure id="fig:vqa-overview">
+[[File:vqa-overview.png|thumb|800px|center|Figure 1: Figure illustrates a VQA system; whereby AI System takes an image and a text-based visual question about the image as input and outputs the answer for the visual question in natural language]].
-  [[File:vqa-overview.png|thumb|800px|center|Figure 1: Figure illustrates a VQA system; whereby AI System takes an image and a text-based visual question about the image as input and outputs the answer for the visual question in natural language]].
-</figure>

Hierarchical Question-Image Co-Attention for Visual Question Answering: Difference between revisions
