aprakash[at]brandeis.edu   iamaaditya    Public PGP       Address    @aaditya_prakash    Quora

Table of Contents     Biography     Highlights     Research     Projects     Publications


I am a PhD student at Brandeis University, near Boston. My research focuses on applications of deep learning to problems in vision and language. Before joining grad school, I was a Senior Systems Engineer at Infosys Limited. I am from Kathmandu, Nepal. My name means “Sun” and is pronounced “aaa–Dee–ti–ya”. Outside of research, I love rock climbing, running, playing chess, and listening to Indian classical music.


  • Gave a series of four lectures on Deep Learning, Convolutional Neural Networks, Recurrent Neural Networks, and object localization/detection. [Slides]

  • Paper on image compression using CNNs accepted to the Data Compression Conference. [Code] [PDF]

  • Paper on Condensed Memory Networks accepted to AAAI 2017. [PDF] [Slides] [Poster]

  • Paper on Neural Paraphrase Generation accepted to COLING 2016. [PDF] [Poster]

  • Won the Honorable Mention Prize in the Visual Question Answering Challenge. [Video] [Slides] [Poster]

  • Accepted to the Deep Learning Summer School at the University of Montréal. [25% acceptance rate]

  • Started a summer internship at the AI Labs at Philips Research.

  • Paper on ‘Highway Networks for Visual Question Answering’ accepted to the CVPR VQA Workshop. [PDF]


- Condensed Memory Networks

  • Problem: Improve memory networks for large-scale NLP tasks.

  • Our method: Add an alternate memory state that condenses the previous hop values into exponentially fewer slots.

  • [PDF] [Slides]

  • A blog post describing the project is coming soon.
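The condensation idea can be sketched in a few lines. This is an illustrative simplification, not the exact C-MemNN operation (the actual model allocates exponentially fewer memory slots to older hops): here each earlier hop's representation is simply down-weighted by a constant factor at every new hop, so hop k's contribution decays geometrically.

```python
import numpy as np

def condense(hops, decay=0.5):
    """Fold per-hop memory representations into one condensed state.

    Each new hop halves the weight of everything seen before, so after
    K hops, hop k contributes with weight decay**(K - k).
    """
    c = np.zeros_like(hops[0])
    for u in hops:
        c = decay * c + u          # older hops shrink geometrically
    return c

# Toy hop outputs: constant vectors 1, 2, 3.
hops = [k * np.ones(4) for k in (1.0, 2.0, 3.0)]
c = condense(hops)                 # 0.25*1 + 0.5*2 + 1.0*3 = 4.25 each
```

The point of the decay is to preserve a hierarchy of features: recent hops dominate, but earlier hops are never fully discarded.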

- Semantic image compression using CNN

  • Problem: Add semantic knowledge to image compression.

  • Our method: Use a CNN to generate a map that covers all the ‘semantic objects’ and weights them by importance, then encode the image with variable-scaling JPEG.

  • [Code] [PDF]

  • Supervisor - Prof. James Storer
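The map-then-encode pipeline can be sketched as below. This is a toy numpy version: the activations would come from the trained CNN, the saliency map follows the paper's "threshold over the sum of all feature activations" construction, and the per-block quality assignment is a hypothetical stand-in for the real variable-scaling JPEG encoder.

```python
import numpy as np

def saliency_map(activations, thresh=0.5):
    """Sum per-class feature activations (C, H, W) and threshold the result."""
    s = activations.sum(axis=0)
    s = (s - s.min()) / (s.max() - s.min() + 1e-8)   # normalize to [0, 1]
    return (s > thresh).astype(float)

def block_quality(sal, q_hi=90, q_lo=30, block=8):
    """Assign a JPEG quality to each 8x8 block: salient blocks get q_hi."""
    H, W = sal.shape
    q = np.full((H // block, W // block), q_lo)
    for i in range(H // block):
        for j in range(W // block):
            if sal[i*block:(i+1)*block, j*block:(j+1)*block].mean() > 0.5:
                q[i, j] = q_hi
    return q

acts = np.zeros((3, 16, 16))       # fake activations for 3 classes
acts[:, :8, :8] = 1.0              # one 'semantic object' in the top-left
q = block_quality(saliency_map(acts))
```

Because only the encoder uses the map, a standard off-the-shelf JPEG decoder can still read the output.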

- Neural paraphrase generation using stacked residual LSTMs

  • Problem: Generate paraphrases using a ‘neural’ model.

  • Our method: We take inspiration from ResNet and apply the same technique to LSTMs. We believe this helps maintain the semantics of the paraphrases.

  • Work done during internship at Philips Research.

  • Shown below are samples using our method on three different datasets.
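The residual wiring between stacked layers can be sketched as follows. A toy nonlinear layer stands in for a full LSTM cell here; only the ResNet-style shortcut pattern between layers is the point of the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_layer(x, W):
    """Stand-in for one LSTM layer's output at a time step."""
    return np.tanh(W @ x)

def stacked_residual(x, weights, skip=2):
    """Stack layers, adding a ResNet-style shortcut every `skip` layers."""
    res = x
    for n, W in enumerate(weights, start=1):
        x = toy_layer(x, W)
        if n % skip == 0:          # add the shortcut, then restart it here
            x = x + res
            res = x
    return x

weights = [0.1 * rng.standard_normal((4, 4)) for _ in range(4)]
y = stacked_residual(rng.standard_normal(4), weights)
```

The shortcut gives gradients a direct path through the stack, which is what makes training deeper LSTM stacks practical.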

- Visual Question Answering

  • Problem: Given a color image of arbitrary size and a question of arbitrary length, produce the most reasonable answer (ground truth obtained from ten Amazon Mechanical Turk responses).

  • Our approach: Use highway networks to attain implicit attention and learn deeper feature representations. See the Publications section for more details.

  • Supervisor - Prof. James Storer
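The question-steered gating can be sketched as a single highway layer. A minimal numpy sketch, assuming toy weight matrices (the names `Wh`, `Wt`, `Wq` are illustrative, not the paper's notation): the transform gate depends on both the image features and the question embedding, so the question steers which features pass through transformed and which are carried unchanged.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def highway(x, q, Wh, Wt, Wq):
    """One highway layer whose transform gate is steered by the question.

    x : image feature vector;  q : question embedding.
    """
    h = np.tanh(Wh @ x)              # candidate transform H(x)
    t = sigmoid(Wt @ x + Wq @ q)     # gate T(x, q), influenced by the question
    return t * h + (1.0 - t) * x     # carry the rest of x through unchanged

x, q = np.ones(3), np.ones(5)
y = highway(x, q, Wh=np.zeros((3, 3)), Wt=np.zeros((3, 3)), Wq=np.zeros((3, 5)))
# with zero weights: gate = 0.5 and H(x) = 0, so y = 0.5 * x
```

Because this is purely feed-forward, it trains faster than soft attention realized via recurrence while achieving a similar localizing effect.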

- Transfer learning

  • We are investigating ideas for improving content based image retrieval through transfer learning.

  • We are currently exploring ways to retrieve images from MIT Places (a scene-recognition database) using deep residual models such as ResNet, the 2015 ImageNet challenge winner. Our goal is an ablation study over the 152 layers of ResNet for the image retrieval task.

  • Conducted a literature survey on compressing deep convolutional neural networks using vector quantization.

Pre-Grad Research

- Computational Fact Checking [Summer 2014]

  • We investigated applications of computational fact checking on a database that supports retrospection.

  • Implemented a fact-checking application for a database of weekly music Billboard charts.

  • One-page Summary · Detailed Report

  • Supervisor - Prof. Liuba Shrira

- Self Organizing maps for large unstructured data [2013] - Infosys Labs

  • Formulated and designed a novel way to visualize self-organizing maps for unstructured big data.

  • Compared various forms of visual representation, including radar graphs, against the default self-organizing-map visualizations from most of the common R and MATLAB packages.

  • See Publication section for Abstract and PDF of the published work.

- Distributed Simulated Annealing [2012] - Infosys Labs

  • Studied industry-scale distributed simulated annealing and the issues that arise when dealing with large-scale optimization.

  • Presented fault tolerance techniques for such a system designed for MapReduce infrastructure running on Apache Hadoop.

  • See Publication section for Abstract and PDF of the published work.
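For reference, plain single-machine simulated annealing looks like the sketch below. The published work concerns distributing this loop over a MapReduce infrastructure and keeping it fault-tolerant, which the sketch makes no attempt at; it only illustrates the core accept/reject loop with a geometric cooling schedule.

```python
import math
import random

def simulated_annealing(f, x0, steps=20000, t0=1.0, cooling=0.9995, seed=0):
    """Minimize f by simulated annealing with a geometric cooling schedule."""
    rng = random.Random(seed)
    x = best = x0
    t = t0
    for _ in range(steps):
        cand = x + rng.uniform(-0.5, 0.5)          # propose a local move
        delta = f(cand) - f(x)
        # always accept downhill; accept uphill with probability exp(-delta/t)
        if delta < 0 or rng.random() < math.exp(-delta / t):
            x = cand
            if f(x) < f(best):
                best = x
        t *= cooling                               # cool down
    return best

best = simulated_annealing(lambda x: (x - 3.0) ** 2, x0=-10.0)
```

Distributing this means running many such chains in parallel and exchanging states, which is where the fault-tolerance questions in the paper arise.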



In Review

  • Garber, Solomon et al. “Static Visual Lecture Summary using Local Intensity Correlation”.


Prakash, Aaditya, et al. “Semantic Perceptual Image Compression using Deep Convolution Networks.” DCC (2017).

Abstract It has long been considered a significant problem to improve the visual quality of lossy image and video compression. Recent advances in computing power, together with the availability of large training data sets, have increased interest in applying deep learning CNNs to image recognition and image processing tasks. Here, we present a powerful CNN tailored to the specific task of semantic image understanding to achieve higher visual quality in lossy compression. A modest increase in complexity is incorporated into the encoder, which allows a standard, off-the-shelf JPEG decoder to be used. While JPEG encoding may be optimized for generic images, the process is ultimately unaware of the specific content of the image to be compressed. Our technique makes JPEG content-aware by designing and training a model to identify multiple semantic regions in a given image. Unlike object detection techniques, our model does not require labeling of object positions and is able to identify objects in a single pass. We present a new CNN architecture directed specifically to image compression: by adding a complete set of features for every class and then taking a threshold over the sum of all feature activations, we generate a map that highlights semantically-salient regions so that they can be encoded at higher quality compared to background regions. Experiments are presented on the Kodak PhotoCD dataset and the MIT Saliency Benchmark dataset, in which our algorithm achieves higher visual quality for the same compressed size.

[PDF] [Code]

Prakash, Aaditya, et al. “Condensed Memory Networks for Clinical Diagnostic Inferencing.” AAAI (2017).

Abstract Diagnosis of a clinical condition is a challenging task, which often requires significant medical investigation. Previous work related to diagnostic inferencing problems mostly considers multivariate observational data (e.g., physiological signals, lab tests, etc.). In contrast, we explore the problem using free-text medical notes recorded in an electronic health record (EHR). Complex tasks like these can benefit from structured knowledge bases, but those are not scalable. We instead exploit raw text from Wikipedia as a knowledge source. Memory networks have been demonstrated to be effective in tasks which require comprehension of free-form text. They use the final iteration of the learned representation to predict probable classes. We introduce condensed memory neural networks (C-MemNNs), a novel model with iterative condensation of memory representations that preserves the hierarchy of features in the memory. Experiments on the MIMIC-III dataset show that the proposed model outperforms other variants of memory networks in predicting the most probable diagnoses given a complex clinical scenario.

[PDF] [Slides] [Poster]

Prakash, Aaditya, et al. “Neural Paraphrase Generation with Stacked Residual LSTM Networks.” COLING (2016).

Abstract In this paper, we propose a novel neural approach for paraphrase generation. Conventional paraphrase generation methods either leverage handwritten rules and thesauri-based alignments, or use statistical machine learning principles. To the best of our knowledge, this work is the first to explore deep learning models for paraphrase generation. Our primary contribution is a stacked residual LSTM network, where we add residual connections between LSTM layers. This allows for efficient training of deep LSTMs. We experiment with our model and other state-of-the-art deep learning models on three different datasets: PPDB, WikiAnswers and MSCOCO. Evaluation results demonstrate that our model outperforms sequence-to-sequence, attention-based and bi-directional LSTM models on BLEU, METEOR, TER and an embedding-based sentence similarity metric.

[PDF] [Poster]

Prakash, Aaditya, and James Storer. “Highway Networks for Visual Question Answering.” CVPR Workshop (VQA) (2016).

Abstract We propose a version of the highway network designed for the task of Visual Question Answering. We take inspiration from the recent success of Residual Networks and Highway Networks in learning deep representations of images and fine-grained localization of objects. We propose a variation in the gating mechanism to allow incorporation of word embeddings in the information highway. The gate parameters are influenced by the words in the question, which steers the network towards localized feature learning. This achieves the same effect as soft attention via recurrence but allows for faster training using optimized feed-forward techniques. We obtain state-of-the-art results on the VQA dataset for the Open Ended and Multiple Choice tasks with the current model.

[PDF] [Slides] [Poster]

Pre-Grad School

  • Prakash, A. (2013). Reconstructing Self Organizing Maps as Spider Graphs for better visual interpretation of large unstructured datasets. Infosys Lab Briefings, Infosys. Vol 11(1). Jan 2013

    [Abstract] [Full-pdf] [INFY] [slides]
  • Prakash, A. (2012). Measures of Fault Tolerance in Distributed Simulated Annealing. Proceedings of International Conference on Perspective of Computer Confluence with Sciences. Vol 1 pp 111-114.

    [Abstract] [Full-pdf] [arXiv] [slides]
  • Prakash, A., & Jha, R. K. (2012). New Interface Protocol to Connect Multiple Bank Networks from a Single Outlet. International Journal of Computer Applications, NY, USA Vol. 55(1) pp 1-9.

    [Abstract] [Full-pdf] [IJCA] [slides]

Quora - Wikipedia - Google+ - Twitter - Linkedin - Academia.edu - SlideShare - GitHub - ResearchGate

Thank you! (づ。◕‿‿◕。)づ