Visual Question Answering Literature Survey

Results

(as self-reported by the authors; not verified)

Results below are accuracies (%) on test-dev 2015, except the final column, which is for test-standard. Y/N, Other, and Num are the yes/no, other, and number answer types.

Method	All	Y/N	Other	Num	Test-Std[All]
~~~~~~~~~~~~~~	~~~~~~~~~~~~~	~~~~~~~~~~~~	~~~~~~~~~	~~~~~~~~~~	~~~~~~~~
Image	28.1	64.0	3.8	0.4	-
Question	48.1	75.7	27.1	36.7	-
Q+I	52.6	75.6	37.4	33.7	-
LSTM Q+I	53.7	78.9	36.4	35.2	54.1
[16CMV]	52.6	78.3	35.9	34.4	-
[09AMA]	55.7	79.2	40.1	36.1	56.0
[13BOW]	55.7	76.5	42.6	35.0	55.9
[07DPP]	57.2	80.7	41.7	37.2	57.4
[17LCN]	57.9	80.5	43.1	37.4	58.0
[11AAA]	57.9	80.8	43.2	37.3	58.2
[12SAN]	58.7	79.3	46.1	36.6	58.9
[15DMN]	60.3	80.5	48.3	36.8	60.4
OUR	60.4	81.5	47.6	37.2	60.7

[01VQA] VQA: Visual Question Answering
[02EMD] Exploring Models and Data for Image Question Answering
[03LAQ] Learning to Answer Questions From Image Using Convolutional Neural Network
[04DCQ] Deep Compositional Question Answering with Neural Module Networks
[05ABC] An Attention Based Convolutional Neural Network for Visual Question Answering
[06ATM] Are You Talking to a Machine? Dataset and Methods for Multilingual Image Question Answering
[07DPP] Image Question Answering Using Convolutional Neural Network with Dynamic Parameter Prediction
[08WTL] Where to Look: Focus Regions for Visual Question Answering
[09AMA] Ask Me Anything: Free-Form Visual Question Answering Based on Knowledge from External Sources
[10V7W] Visual7W: Grounded Question Answering in Images
[11AAA] Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering
[12SAN] Stacked Attention Networks for Image Question Answering
[13BOW] Simple Baseline for Visual Question Answering
[14ICV] Image Captioning & Visual Question Answering Based on Attributes & External Knowledge
[15DMN] Dynamic Memory Networks for Visual and Textual Question Answering
[16CMV] Compositional Memory for Visual Question Answering
[17LCN] Learning to Compose Neural Networks for Question Answering
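
The results table above is tab-separated, so it can be loaded and re-sorted with a few lines of Python. The sketch below is only an illustration: it copies a handful of rows from the table, and the helper names parse_results and rank_by are made up for this example. It assumes a "-" entry means the score was not reported.

```python
import csv
import io

# A few rows copied from the results table above (tab-separated).
# "-" is assumed to mark a score that was not reported.
TABLE = """\
Method\tAll\tY/N\tOther\tNum\tTest-Std[All]
LSTM Q+I\t53.7\t78.9\t36.4\t35.2\t54.1
[12SAN]\t58.7\t79.3\t46.1\t36.6\t58.9
[15DMN]\t60.3\t80.5\t48.3\t36.8\t60.4
[16CMV]\t52.6\t78.3\t35.9\t34.4\t-
"""


def parse_results(text):
    """Return a list of dicts, converting scores to float and "-" to None."""
    rows = []
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        parsed = {"Method": row["Method"]}
        for key, value in row.items():
            if key != "Method":
                parsed[key] = None if value == "-" else float(value)
        rows.append(parsed)
    return rows


def rank_by(rows, column):
    """Sort methods by one score column, highest first, skipping missing values."""
    scored = [r for r in rows if r[column] is not None]
    return sorted(scored, key=lambda r: r[column], reverse=True)


results = parse_results(TABLE)
for r in rank_by(results, "Other"):
    print(f'{r["Method"]:10s} {r["Other"]:5.1f}')
```

Ranking by "Test-Std[All]" instead drops any method without a test-standard score, which keeps the comparison limited to reported numbers.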