Research
My research interests are in computer vision, machine learning and optimization, and image processing. Most of my research focuses on image and video understanding.
|
|
Learning Augmentation Policy Schedules for Unsupervised Depth Estimation.
Aman Raj
UC San Diego Electronic Theses and Dissertations, 2020
thesis / code
My MS thesis proposes a novel approach to data augmentation for unsupervised depth estimation. The method learns augmentation strategies from the data itself.
|
|
SUW-Learn: Joint Supervised, Unsupervised, Weakly Supervised Deep Learning for Monocular Depth Estimation.
Haoyu Ren, Aman Raj, Mostafa El-Khamy and Jungwon Lee
CVPR, 2020
video / supplement
A framework for deep learning that combines supervised learning (S), unsupervised learning (U), and weakly supervised learning (W). We deploy SUW-Learn to learn monocular depth from images and video sequences.
|
|
SIGNet: Semantic Instance Aided Unsupervised 3D Geometry Perception.
Yue Meng, Yongxi Lu, Aman Raj, Samuel Sunarjo, Rui Guo, Tara Javidi, Gaurav Bansal, Dinesh Bharadia
CVPR, 2019
project / supplement
SIGNet integrates semantic information to make depth and flow predictions consistent with objects and robust to low-light conditions. SIGNet is shown to improve upon state-of-the-art unsupervised learning for depth prediction.
|
|
A Holistic Framework for Addressing the World using Machine Learning.
Ilke Demir, Forest Hughes, Aman Raj, Kaunil Dhruv, Suryanarayana Murthy Muddla, Sanyam Garg, Barrett Doo, Ramesh Raskar
CVPR, 2018
project
We propose an automatic generative algorithm to create street addresses from satellite imagery. Our addressing scheme is coherent with the street topology, linear and hierarchical to follow human perception, and universal to be used as a unified geocoding system.
|
|
Generative Street Addresses from Satellite Imagery.
Ilke Demir, Forest Hughes, Aman Raj, Kaunil Dhruv, Suryanarayana Murthy Muddla, Sanyam Garg, Barrett Doo, Ramesh Raskar
ISPRS, 2018   (The Jack Dangermond Award – Best Paper)
project / code / talk
Our algorithm first extracts roads from satellite imagery using deep learning. It then uniquely labels the regions, roads, and structures using graph- and proximity-based algorithms.
|
|
Robocodes: Towards Generative Street Addresses from Satellite Imagery.
Ilke Demir, Forest Hughes, Aman Raj, Kleovoulos Tsourides, Divyaa Ravichandran, Suryanarayana Murthy, Kaunil Dhruv, Sanyam Garg, Jatin Malhotra, Barrett Doo, Grace Kermani, Ramesh Raskar
CVPR, 2017   (Best Paper Award)
project / code / blog / news
We describe our automatic generative algorithm to create street addresses (Robocodes) from satellite images by learning and labeling regions, roads, and blocks.
|
|
FPGA Accelerated Abandoned Object Detection.
Rajesh Rohilla, Aman Raj, Saransh Kejriwal, Rajiv Kapoor
ICCTICT, IEEE 2016
slides
We propose an FPGA hardware implementation of an abandoned object detection algorithm, aimed at building a custom chip that can run real-time inference on a live video feed.
|
|
Multi-Scale Convolutional Architecture for Semantic Segmentation.
Aman Raj, Daniel Maturana, Sebastian Scherer
RI Technical Reports, CMU 2015
project / slides
This work exploits the geocentric encoding of a depth image and uses a multi-scale deep convolutional neural network architecture that captures high- and low-level features of a scene to generate rich semantic labels.
|
|
Digitization of Historic Inscription Images using Cumulants based Simultaneous Blind Source Extraction.
N. Jayanthi, Ayush Tomar, Aman Raj, S. Indu, Santanu Chaudhury
ICVGIP 2014
The proposed technique separates the text layer from historic inscription images by treating the task as a blind source separation problem: it estimates the independent components of a linear mixture of source signals by maximizing a contrast function based on higher-order cumulants.
|
|
Enhancement and Retrieval of Historic Inscription Images.
S. Indu, Ayush Tomar, Aman Raj, Santanu Chaudhury
ACCV 2014
We binarize inscription images using the proposed cumulants-based Blind Source Extraction (BSE) method and store them in a digital library with their corresponding historic information, which can be retrieved later using image retrieval algorithms such as BoW.
|
|
Teaching Assistant, CSE 12 - Spring 2019
|
|
Comic Polyglot
CMU Winter School, 2014   (Best Project Award)
poster
Comic Polyglot: a system that identifies the text regions in comic strips such as manga and translates their Japanese text into English using an OCR engine while maintaining the syntax. It aims to help English-speaking manga readers.
|
|
Lunabot
NASA's Lunabotics Mining Competition, 2013
paper / outreach / gallery
Aaravya Lunabot: DTU’s official entry into NASA's Lunabotics Mining Competition 2013. The challenge required student teams to design and build a mining robot that could traverse chaotic simulated lunar terrain, excavate lunar regolith, and deposit the regolith into a collector bin within ten minutes.
|
|